firecrawl

Web scraping, search, crawling, and browser automation with LLM-optimized markdown output. Supports six command modes: search for discovery, scrape for single URLs, map to locate subpages, crawl for bulk site sections, browser for interactive content, and download for offline archives Returns clean markdown formatted for LLM context windows; write results to .firecrawl/ directory to avoid redundant fetches and manage large outputs Includes escalation workflow: start with search or scrape, use map to find specific URLs within sites, switch to browser when content requires clicks, form fills, or login interactions Parallel job execution up to concurrency limit; check firecrawl --status for remaining credits and active job slots before running bulk operations

INSTALLATION
npx skills add https://github.com/firecrawl/cli --skill firecrawl
Run in your project or agent environment. Adjust flags if your CLI version differs.

SKILL.md

$2c

  • Concurrency: Max parallel jobs. Run parallel operations up to this limit.
  • Credits: Remaining API credits. Each operation consumes credits.

If not ready, see rules/install.md. For output handling guidelines, see rules/security.md.

Before doing real work, verify the setup with one small request:

mkdir -p .firecrawl

firecrawl scrape "https://firecrawl.dev" -o .firecrawl/install-check.md
firecrawl search "query" --scrape --limit 3

Workflow

Follow this escalation pattern:

  • Search - No specific URL yet. Find pages, answer questions, discover sources.
  • Scrape - Have a URL. Extract its content directly.
  • Map + Scrape - Large site or need a specific subpage. Use map --search to find the right URL, then scrape it.
  • Crawl - Need bulk content from an entire site section (e.g., all /docs/).
  • Interact - Scrape first, then interact with the page (pagination, modals, form submissions, multi-step navigation).

Need

Command

When

Find pages on a topic

search

No specific URL yet

Get a page's content

scrape

Have a URL, page is static or JS-rendered

Find URLs within a site

map

Need to locate a specific subpage

Bulk extract a site section

crawl

Need many pages (e.g., all /docs/)

AI-powered data extraction

agent

Need structured data from complex sites

Interact with a page

scrape + interact

Content requires clicks, form fills, pagination, or login

Download a site to files

download

Save an entire site as local files

Parse a local file

parse

File on disk (PDF, DOCX, XLSX, etc.) — not a URL

Watch pages for changes

monitor

Schedule recurring scrapes/crawls, diff against snapshots

For detailed command reference, run firecrawl <command> --help.

Scrape vs interact:

  • Use scrape first. It handles static pages and JS-rendered SPAs.
  • Use scrape + interact when you need to interact with a page, such as clicking buttons, filling out forms, navigating through a complex site, infinite scroll, or when scrape fails to grab all the content you need.
  • Never use interact for web searches - use search instead.

Monitor: Schedule recurring scrapes or crawls and diff each result against the last retained snapshot. Use for product pages, docs, blogs, changelogs, competitor sites — any page where changes matter. Each check labels pages as same, new, changed, removed, or error, with webhook and email notification options.

Subcommands: create | list | get | update | delete | run | checks | check.

# create from flags

firecrawl monitor create --name "Blog" --schedule "every 30 minutes" \

  --scrape-urls https://example.com/blog --email alerts@example.com

# or from JSON (positional file, or piped stdin)

firecrawl monitor create monitor.json

cat monitor.json | firecrawl monitor create

firecrawl monitor list --limit 20

firecrawl monitor run <monitorId>             # trigger a check now

firecrawl monitor checks <monitorId>          # list checks

firecrawl monitor check <monitorId> <checkId> --page-status changed

firecrawl monitor update <monitorId> --state paused

firecrawl monitor delete <monitorId>

Schedules accept cron (--cron "*/30 * * * *") or natural language (--schedule "every 30 minutes"). Minimum interval is 15 minutes. Targets are either --scrape-urls a,b,c (scrape) or --crawl-url <url> (crawl whole site each check). Note: --state (not --status) sets active/paused; --page-status (not --status) filters page results on check — avoids collision with the global --status flag. Monitoring is not available for zero-data-retention teams.

JSON-mode change tracking: By default monitors diff each page's markdown and you get a unified text diff back. When you care about specific structured fields (price, headline, in-stock flag, items in a list) instead of the whole page, add a changeTracking format with modes: ["json"] and a JSON schema to the target's scrapeOptions.formats. The flag-based form doesn't cover this — pass a JSON body via file or stdin:

cat > pricing-monitor.json <<'EOF'

{

  "name": "Pricing watch",

  "schedule": { "text": "hourly", "timezone": "UTC" },

  "targets": [{

    "type": "scrape",

    "urls": ["https://example.com/pricing"],

    "scrapeOptions": {

      "formats": [{

        "type": "changeTracking",

        "modes": ["json"],

        "prompt": "Extract pricing tiers and headline features for each plan.",

        "schema": {

          "type": "object",

          "properties": {

            "plans": {

              "type": "array",

              "items": {

                "type": "object",

                "properties": {

                  "name":     { "type": "string" },

                  "price":    { "type": "string" },

                  "features": { "type": "array", "items": { "type": "string" } }

                }

              }

            }

          }

        }

      }]

    }

  }]

}

EOF

firecrawl monitor create pricing-monitor.json

The check response then carries a per-field diff (paths like plans[0].price) and the full extraction at this run, instead of (or in addition to) a markdown diff. Each changed page in pages[] looks like:

{

  "url": "https://example.com/pricing",

  "status": "changed",

  "diff": {

    "json": {

      "plans[0].price": { "previous": "$19/mo", "current": "$24/mo" },

      "plans[1].features[2]": {

        "previous": "10 GB storage",

        "current": "25 GB storage"

      }

    }

  },

  "snapshot": {

    "json": {

      "plans": [

        /* current full extraction */

      ]

    }

  }

}

Use modes: ["json", "git-diff"] for mixed mode: you get both diff.json (per-field) and diff.text (markdown sidecar), and the page is marked changed whenever either surface changed. For markdown-only monitors, diff.text holds the unified diff and diff.json is a parse-diff AST ({ files: [...] }); there is no snapshot.

Avoid redundant fetches:

  • search --scrape already fetches full page content. Don't re-scrape those URLs.
  • Check .firecrawl/ for existing data before fetching again.

When to Load References

  • **Integrating Firecrawl into an app, adding FIRECRAWL_API_KEY to .env, or choosing endpoint usage in product code** -> use the firecrawl-build skills (already installed alongside this CLI skill)
  • Producing Firecrawl-powered deliverables such as research briefs, SEO audits, QA reports, lead lists, knowledge bases, or design-system extraction -> use the firecrawl-workflows skills (already installed alongside this CLI skill). These skills infer from context first and ask only short blocking questions when needed.

Output &#x26; Organization

Unless the user specifies to return in context, write results to .firecrawl/ with -o. Add .firecrawl/ to .gitignore. Always quote URLs - shell interprets ? and &#x26; as special characters.

firecrawl search "react hooks" -o .firecrawl/search-react-hooks.json --json

firecrawl scrape "<url>" -o .firecrawl/page.md

Naming conventions:

.firecrawl/search-{query}.json

.firecrawl/search-{query}-scraped.json

.firecrawl/{site}-{path}.md

Never read entire output files at once. Use grep, head, or incremental reads:

wc -l .firecrawl/file.md &#x26;&#x26; head -50 .firecrawl/file.md

grep -n "keyword" .firecrawl/file.md

Single format outputs raw content. Multiple formats (e.g., --format markdown,links) output JSON.

Working with Results

These patterns are useful when working with file-based output (-o flag) for complex tasks:

# Extract URLs from search

jq -r '.data.web[].url' .firecrawl/search.json

# Get titles and URLs

jq -r '.data.web[] | "\(.title): \(.url)"' .firecrawl/search.json

After search: send feedback (refunds 1 credit)

Search costs 2 credits per call. After you finish using a search result, send structured feedback in the background. The first feedback per search id refunds 1 credit and feeds search-quality improvements.

SEARCH_ID=$(jq -r '.id' .firecrawl/search-react-hooks.json)

firecrawl search-feedback "$SEARCH_ID" \

  --rating good \

  --valuable-sources '[{"url":"https://react.dev/reference/react/hooks","reason":"Authoritative"}]' \

  --missing-content '[{"topic":"useDeferredValue example"},{"topic":"Server Components hooks"}]' \

  --query-suggestions "Boost react.dev for react-hooks queries" \

  --silent &#x26;

The most useful field is --missing-content: an array of specific pieces of content you expected to find but didn't. Use one entry per missing topic. Bad/partial feedback with detailed --missing-content is just as valuable as good feedback.

Opt out: export FIRECRAWL_NO_SEARCH_FEEDBACK=1 makes the CLI skip every feedback call silently. Respect that flag — do not try to work around it. See firecrawl-search for the full pattern.

Parallelization

Run independent operations in parallel. Check firecrawl --status for concurrency limit:

firecrawl scrape "<url-1>" -o .firecrawl/1.md &#x26;

firecrawl scrape "<url-2>" -o .firecrawl/2.md &#x26;

firecrawl scrape "<url-3>" -o .firecrawl/3.md &#x26;

wait

For interact, scrape multiple pages and interact with each independently using their scrape IDs.

Credit Usage

firecrawl credit-usage

firecrawl credit-usage --json --pretty -o .firecrawl/credits.json
BrowserAct

Let your agent run on any real-world website

Bypass CAPTCHA & anti-bot for free. Start local, scale to cloud.

Explore BrowserAct Skills →

Stop writing automation&scrapers

Install the CLI. Run your first Skill in 30 seconds. Scale when you're ready.

Start free
free · no credit card