indexing

When the user wants to fix indexing issues from Search Console, use noindex, or implement Google Indexing API. Also use when the user mentions "fix indexing,"…

INSTALLATION
npx skills add https://github.com/kostja94/marketing-skills --skill indexing
Run in your project or agent environment. Adjust flags if your CLI version differs.

SKILL.md

| Cause | Action |
| --- | --- |
| Low quality, duplicate, off-topic | Improve content, fix duplicates, set correct canonical |
| Static assets (CSS/JS) | See below |
| Feed, share URLs with params | Usually OK to ignore; or noindex, canonical to main URL |
| Important content pages | Use URL Inspection, verify canonical/internal links/sitemap, Request indexing |

Static Assets (Next.js / Vercel)

Vercel adds unique dpl= params to static assets per deploy, creating many "Crawled - currently not indexed" URLs.

| Do | Don't |
| --- | --- |
| Keep robots.txt allowing /_next/ | Do not block /_next/ (breaks CSS/JS loading). See robots-txt |
| Accept static assets in GSC as expected | Do not block /_next/static/css/ or ?dpl= |
| Use X-Robots-Tag for static assets | CSS/JS should not be indexed; no SEO impact |

Static assets in "Crawled - currently not indexed" is normal and expected.
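
One way to apply the X-Robots-Tag advice is to decide per request path whether the response should carry the header. The sketch below is framework-agnostic and only illustrative: the name `robotsHeaderFor` and the asset pattern are assumptions, and how the returned header is attached depends on your host or framework configuration.

```typescript
// Assumed pattern for static assets; adjust it to your own build output.
const STATIC_ASSET_PATTERN = /^\/_next\/static\/|\.(css|js|map|woff2?)$/;

// Hypothetical helper: returns an X-Robots-Tag header for static assets, or null.
function robotsHeaderFor(requestUrl: string): { name: string; value: string } | null {
  // Strip the query string (e.g. ?dpl=...) and fragment before matching.
  const path = requestUrl.split(/[?#]/)[0];
  return STATIC_ASSET_PATTERN.test(path)
    ? { name: "X-Robots-Tag", value: "noindex" }
    : null;
}
```

The header only affects indexing; the assets stay crawlable, so CSS/JS loading is not affected.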

Other Issue Types (from GSC Coverage)

| Issue | Fix |
| --- | --- |
| Excluded by 'noindex' tag | Remove noindex if accidental; keep if intentional |
| Blocked by robots.txt | See robots-txt; remove Disallow for important paths |
| Redirect / 404 | Fix URL or add redirect |
| Duplicate / Canonical | Set correct canonical; usually OK |
| Soft-404 | Page returns 200 but content says "not found" or is empty; Google may treat it as 404. Fix: return a 404 status for truly missing pages, or add real content to pages that should return 200 |

Soft-404

A soft-404 occurs when a page returns HTTP 200 but the content indicates the page doesn't exist (e.g. "Page not found" message, empty state). Google may treat it as 404 and exclude from index.

| Fix | When |
| --- | --- |
| Return 404 | Page truly doesn't exist; use proper 404 status |
| Add content | Page is intentional (e.g. empty search results); ensure substantive content or use noindex |
| Redirect | If URL moved, use 301 to correct destination |
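
To find candidates before Google reports them, a crawl of your own pages can be screened with a simple heuristic. This is a rough sketch, not Google's detection logic: the phrase list, the length threshold, and the names `PageResponse` and `looksLikeSoft404` are all assumptions to tune for your templates.

```typescript
interface PageResponse {
  status: number;
  bodyText: string; // visible page text, markup already stripped
}

// Assumed phrases and threshold; adjust for your own templates.
const NOT_FOUND_PHRASES = ["page not found", "no longer available", "no results found"];
const MIN_TEXT_LENGTH = 200;

function looksLikeSoft404(res: PageResponse): boolean {
  // Responses with a real error status are not soft-404s.
  if (res.status !== 200) return false;
  const text = res.bodyText.trim().toLowerCase();
  const saysNotFound = NOT_FOUND_PHRASES.some((phrase) => text.indexOf(phrase) !== -1);
  // Flag pages that announce "not found" or are nearly empty.
  return saysNotFound || text.length < MIN_TEXT_LENGTH;
}
```

Flagged URLs still need a manual decision among the three fixes in the table above.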

Noindex Usage

  • How: metadata.robots = { index: false } (Next.js), <meta name="robots" content="noindex">, or an X-Robots-Tag: noindex HTTP header
  • Rationale: Not all site content should be indexed; noindex is a valid choice for many pages
  • Caution: Avoid noindex on important content pages
  • With robots.txt: robots.txt = path-level crawl control; noindex = page-level index control. Do not block noindex pages in robots.txt—crawlers must access the page to read the directive. Use both: robots for /admin/, /api/; noindex for /login/, /thank-you/, etc. See robots-txt for when to use which.
  • nofollow ≠ noindex: nofollow controls link equity only; it does not prevent indexing. To exclude from search, use noindex. See page-metadata for meta robots implementation.
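
The robots.txt rule above (do not block pages that rely on noindex) can be checked mechanically. The following is a simplified sketch: it treats each Disallow value as a plain path prefix and ignores wildcards, Allow rules, and per-user-agent groups. The function name and sample paths are hypothetical.

```typescript
// Returns the noindex paths that a crawler could never read because
// a Disallow prefix blocks them. Simplified prefix matching only.
function findBlockedNoindexPaths(disallowPrefixes: string[], noindexPaths: string[]): string[] {
  return noindexPaths.filter((path) =>
    disallowPrefixes.some((prefix) => prefix !== "" && path.indexOf(prefix) === 0)
  );
}

const conflicts = findBlockedNoindexPaths(
  ["/admin/", "/api/", "/thank-you/"], // Disallow rules
  ["/login/", "/thank-you/"]           // pages carrying noindex
);
// conflicts → ["/thank-you/"]
```

Any path in the result should either lose its Disallow rule or stop relying on noindex.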

Page Types That Typically Need Noindex

| Category | Page Types | Typical Meta | Reason |
| --- | --- | --- | --- |
| Auth & Account | Login, Signup, Password reset, Account dashboard | Login: noindex,nofollow; Signup: noindex,follow | No search value; login indexed = security risk; signup follow allows crawl of Privacy/Terms links |
| Admin & Private | Admin, Staging, Test pages, Internal tools | noindex,nofollow | Not for public; avoid discovery |
| Conversion Endpoints | Thank-you, Confirmation, Checkout success, Download gate | noindex,follow | Post-conversion; no SERP value; allow link equity |
| System & Utility | 404, Internal search results, Faceted/filter URLs | noindex,follow or noindex,nofollow | Thin/duplicate; 404 = error state |
| Legal | Privacy, Terms, Cookie Policy (optional) | Often noindex,follow | Low-value indexed; reduces clutter |
| Duplicate & Thin | Printer-friendly, Parameter URLs, Near-duplicate | noindex,follow or canonical | Duplicate content; canonical preferred when possible |
| Low-Value | Media kit, Feedback board (external), Thin press | noindex or index for brand queries | Case-by-case |

noindex,follow vs noindex,nofollow: Use noindex,follow for most cases—excludes from SERP but allows link equity. Use noindex,nofollow only for login (security), staging, or temporary test pages.
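
Keeping these choices in one lookup avoids directives drifting between templates. A minimal sketch: the category keys and the function name are hypothetical, and the directive values are copied from the Typical Meta entries above.

```typescript
type PageCategory = "login" | "signup" | "admin" | "conversion" | "duplicate";

// Directive values follow the Typical Meta entries; the keys are illustrative.
const ROBOTS_BY_CATEGORY: Record<PageCategory, string> = {
  login: "noindex,nofollow",
  signup: "noindex,follow",
  admin: "noindex,nofollow",
  conversion: "noindex,follow",
  duplicate: "noindex,follow",
};

// Renders the meta robots tag for a page category.
function robotsMetaTag(category: PageCategory): string {
  return `<meta name="robots" content="${ROBOTS_BY_CATEGORY[category]}">`;
}

const loginTag = robotsMetaTag("login");
// loginTag → <meta name="robots" content="noindex,nofollow">
```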

Page Removal Decision Framework

When intentionally removing a page from the web, choose the method based on whether a relevant alternative exists and whether the page should remain accessible:

| Scenario | Method | Rationale |
| --- | --- | --- |
| Has a closely related replacement page | 301 redirect | Preserves accumulated link signals and user flow |
| Content merged into a new page | 301 redirect | Direct old URL to the new canonical location |
| Permanently deleted, no alternative | 410 Gone | Explicitly signals permanent removal to search engines |
| Deleted, uncertain if permanent | 404 Not Found | Safe default; can reinstate later if needed |
| Still accessible but should not be indexed | noindex | Page remains available to users; excluded from SERP |
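
The framework reduces to three questions asked in order. The sketch below encodes that order; the field and function names are assumptions introduced for illustration.

```typescript
interface RemovalContext {
  stillAccessibleToUsers: boolean; // page stays live but should leave the index
  hasRelatedReplacement: boolean;  // includes content merged into another page
  permanentlyDeleted: boolean;     // false when the removal might be reversed
}

type RemovalMethod = "noindex" | "301 redirect" | "410 Gone" | "404 Not Found";

function chooseRemovalMethod(ctx: RemovalContext): RemovalMethod {
  if (ctx.stillAccessibleToUsers) return "noindex";
  if (ctx.hasRelatedReplacement) return "301 redirect";
  return ctx.permanentlyDeleted ? "410 Gone" : "404 Not Found";
}

const method = chooseRemovalMethod({
  stillAccessibleToUsers: false,
  hasRelatedReplacement: false,
  permanentlyDeleted: true,
});
// method → "410 Gone"
```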

Before removing: Check the URL's search traffic, backlinks, internal links, and conversion value. If the page has value, consider updating or merging rather than removing.

Common mistakes:

  • 404-ing pages that have relevant alternatives (wastes accumulated signals)
  • Redirecting all deleted pages to the homepage (breaks user intent)
  • Creating redirect chains (A → B → C) instead of direct redirects
  • Removing pages without cleaning up internal links pointing to them
  • Using robots.txt to block noindex pages (crawler must access the page to read the noindex directive)
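
Redirect chains in particular are easy to remove automatically if the redirect rules are available as a source-to-target map. A minimal sketch under that assumption; `flattenRedirects` is a hypothetical name, and the loop guard only stops the function from hanging, it does not repair redirect loops.

```typescript
// Rewrites every rule to point at its final destination,
// so A → B → C becomes A → C and B → C.
function flattenRedirects(redirects: Record<string, string>): Record<string, string> {
  const flat: Record<string, string> = {};
  for (const source of Object.keys(redirects)) {
    let target = redirects[source];
    const seen = [source];
    // Follow the chain until it ends or revisits a URL already seen.
    while (redirects[target] !== undefined && seen.indexOf(target) === -1) {
      seen.push(target);
      target = redirects[target];
    }
    flat[source] = target;
  }
  return flat;
}

const flattened = flattenRedirects({ "/a": "/b", "/b": "/c" });
// flattened → { "/a": "/c", "/b": "/c" }
```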

Post-removal cleanup:

  • Remove deleted URLs from XML sitemap; update and resubmit
  • Update internal links to point directly to the final URL (avoid relying on redirects)
  • For 301 redirects, ensure the target URL is in the sitemap
  • In GSC, use URL Inspection to verify important pages; use Removals tool for temporary quick-hide (not permanent — use proper HTTP status or noindex)

Google Indexing API

| Type | Typical use |
| --- | --- |
| JobPosting | Job boards |
| BroadcastEvent | Live platforms |

Requirements: Enable the Indexing API in a Google Cloud project, create a service account, add that service account as an owner of the property in Search Console, and request more quota if needed (default 200 URLs/day).
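
Once those requirements are met, a notification is a single authenticated POST. The sketch below only builds the request; `buildPublishRequest` and `PublishRequest` are hypothetical names, and the access token is assumed to be obtained elsewhere for the service account with the `https://www.googleapis.com/auth/indexing` scope.

```typescript
type NotificationType = "URL_UPDATED" | "URL_DELETED";

interface PublishRequest {
  endpoint: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// accessToken: OAuth 2.0 token for the service account (assumed to be fetched elsewhere).
function buildPublishRequest(
  pageUrl: string,
  type: NotificationType,
  accessToken: string
): PublishRequest {
  return {
    endpoint: "https://indexing.googleapis.com/v3/urlNotifications:publish",
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${accessToken}`,
    },
    body: JSON.stringify({ url: pageUrl, type }),
  };
}
```

The returned object can be handed to any HTTP client, for example `fetch(req.endpoint, { method: req.method, headers: req.headers, body: req.body })`. Each call counts against the daily quota.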

Output Format

  • Action items: Prioritized fixes

Related Skills

  • google-search-console: Find and diagnose indexing issues in GSC
  • robots-txt: Path-level crawl control; when to use robots.txt vs noindex; do not block /_next/ or noindex pages
  • page-metadata: Meta robots implementation; noindex vs nofollow
  • xml-sitemap: Submit and maintain sitemap
  • indexnow: Faster indexing for Bing
  • canonical-tag: Resolve duplicate content