SKILL.md
Firecrawl Knowledge Base
Use this to turn URLs or topics into organized LLM-ready content.
Onboarding Interview
Infer the source, goal, depth, and output location from context. If the source and goal are clear, proceed immediately.
Ask at most three concise questions, and only if blocked. Typical blockers are the source URL or topic, whether the output is for reference, RAG, training, or docs, and the training format if training is requested.
Firecrawl Collection Plan
Use Firecrawl map for documentation sites, search for topic-based corpora, scrape pages into markdown, and preserve code examples and tables.
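The choice between map and search can be sketched as a small routing step. This is only an illustration: `plan_collection` is a hypothetical helper, not part of any Firecrawl SDK, and it assumes that any http(s) URL should be treated as a documentation site while everything else is treated as a topic.

```python
from urllib.parse import urlparse


def plan_collection(source: str) -> list[str]:
    """Return the ordered Firecrawl operations for a source (hypothetical helper)."""
    parsed = urlparse(source.strip())
    is_url = parsed.scheme in ("http", "https") and bool(parsed.netloc)
    # Assumption: a URL is a documentation site, so pages are discovered with map.
    # Anything else is treated as a topic, so pages are discovered with search.
    discovery = "map" if is_url else "search"
    # Either way, the discovered pages are then scraped into markdown.
    return [discovery, "scrape"]
```

The returned names are labels for the plan, not function calls; the actual map, search, and scrape invocations depend on the Firecrawl client in use.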
For files, follow the Firecrawl download-style convention:
.firecrawl/
  <hostname>/
    <path>/
      index.md
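One way to derive that location from a page URL is sketched below. The helper name `output_path` and its handling of edge cases are assumptions: query strings and fragments are ignored, and a trailing slash is treated the same as no trailing slash.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse


def output_path(url: str, root: str = ".firecrawl") -> str:
    """Map a page URL to <root>/<hostname>/<path>/index.md (hypothetical helper)."""
    parsed = urlparse(url)
    # Drop empty segments so "/docs/" and "/docs" map to the same directory.
    segments = [s for s in parsed.path.split("/") if s]
    return str(PurePosixPath(root, parsed.hostname or "", *segments, "index.md"))
```

For example, `output_path("https://docs.example.com/guides/intro/")` gives `.firecrawl/docs.example.com/guides/intro/index.md`.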
Parallel Work
If appropriate, use sub-agents or equivalent parallel task runners:
- one docs section per researcher
- one researcher per source type: official docs, tutorials, community discussions, and references
- one worker per pipeline stage: source scraping, chunk generation, and manifest generation
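Where sub-agents are not available, the "one docs section per researcher" split can be approximated with a thread pool. This is a rough sketch under stated assumptions: sections are taken to be the first path segment of each URL, and `collect_section` is a placeholder that only counts pages where real scraping work would go.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor
from urllib.parse import urlparse


def group_by_section(urls: list[str]) -> dict[str, list[str]]:
    """Group URLs by their first path segment (assumed to be the docs section)."""
    sections = defaultdict(list)
    for url in urls:
        segments = [s for s in urlparse(url).path.split("/") if s]
        sections[segments[0] if segments else "(root)"].append(url)
    return dict(sections)


def collect_section(name: str, urls: list[str]) -> tuple[str, int]:
    # Placeholder worker: a real one would scrape each URL into markdown.
    return name, len(urls)


def run_parallel(urls: list[str], max_workers: int = 4) -> dict[str, int]:
    """Run one worker per section and return the page count per section."""
    sections = group_by_section(urls)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(collect_section, n, u) for n, u in sections.items()]
        return dict(f.result() for f in futures)
```

The same grouping function could key on source type or pipeline stage instead of path segment.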
Output Modes
- Reference: markdown files, index.md, and sources.json.
- RAG: markdown files plus chunk files and manifest.json.
- Training: scraped source files plus training-data.jsonl and training-metadata.json.
- Docs mirror: complete markdown mirror with a table of contents.
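For the RAG mode, chunk generation might look like the sketch below. It splits markdown at headings but never inside a fenced code block, so code examples stay intact. The manifest layout (`chunks`, `id`, `source`, `chars`) is an assumption, since the schema of manifest.json is not specified here.

```python
import json
import re

FENCE = "`" * 3  # the markdown code-fence marker
HEADING = re.compile(r"#{1,6}\s")  # ATX heading at the start of a line


def chunk_markdown(text: str) -> list[str]:
    """Split markdown at headings, never inside a fenced code block."""
    chunks, current, in_fence = [], [], False
    for line in text.splitlines():
        if line.lstrip().startswith(FENCE):
            in_fence = not in_fence
        # A heading outside a fence closes the current chunk and starts a new one.
        if HEADING.match(line) and not in_fence and current:
            chunks.append("\n".join(current).strip())
            current = []
        current.append(line)
    if current:
        chunks.append("\n".join(current).strip())
    return [c for c in chunks if c]


def build_manifest(source_url: str, chunks: list[str]) -> str:
    """Serialize a minimal manifest; the field names are illustrative only."""
    entries = [
        {"id": i, "source": source_url, "chars": len(c)}
        for i, c in enumerate(chunks)
    ]
    return json.dumps({"chunks": entries}, indent=2)
```

Splitting on headings keeps each chunk aligned with a section of the source page; a size cap or overlap could be layered on top if the retriever needs it.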
Final Deliverable
# Knowledge Base: [Source]
## Summary
[What was collected and why]
## Output Structure
[Files/directories created]
## Coverage
[Sections, source types, counts]
## Usage Notes
[How to use in RAG, docs, training, or agent context]
## Sources
[URLs collected]
## Rerun Inputs
workflow: firecrawl-knowledge-base
source: [url/topic]
goal: [reference/rag/train/docs]
depth: [quick/thorough/exhaustive]
output_dir: [.firecrawl/]
Quality Bar
- Preserve code examples and formatting.
- Remove boilerplate navigation where possible.
- Include source URLs in frontmatter or metadata.
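As a minimal sketch of the last point, a source URL can be recorded in YAML frontmatter when each page is written. The field name `source_url` and the helper `with_frontmatter` are assumptions, not a required schema.

```python
import json


def with_frontmatter(markdown: str, source_url: str) -> str:
    """Prepend YAML frontmatter carrying the source URL (hypothetical helper)."""
    # json.dumps yields a double-quoted, escaped string, which YAML also accepts.
    front = f"---\nsource_url: {json.dumps(source_url)}\n---\n\n"
    return front + markdown.lstrip("\n")
```

Quoting the value avoids YAML parsing surprises when a URL contains characters such as `#` or `: `.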