# GPT Researcher Development Skill
GPT Researcher is an LLM-based autonomous agent using a planner-executor-publisher pattern with parallelized agent work for speed and reliability.
## Quick Start

### Basic Python Usage

```python
import asyncio

from gpt_researcher import GPTResearcher


async def main():
    researcher = GPTResearcher(
        query="What are the latest AI developments?",
        report_type="research_report",  # or detailed_report, deep, outline_report
        report_source="web",  # or local, hybrid
    )
    await researcher.conduct_research()
    report = await researcher.write_report()
    print(report)


asyncio.run(main())
```

### Run Servers
```bash
# Backend
python -m uvicorn backend.server.server:app --reload --port 8000

# Frontend
cd frontend/nextjs && npm install && npm run dev
```

## Key File Locations
| Need | Primary File | Key Classes |
|------|--------------|-------------|
| Main orchestrator | `gpt_researcher/agent.py` | `GPTResearcher` |
| Research logic | `gpt_researcher/skills/researcher.py` | `ResearchConductor` |
| Report writing | `gpt_researcher/skills/writer.py` | `ReportGenerator` |
| All prompts | `gpt_researcher/prompts.py` | `PromptFamily` |
| Configuration | `gpt_researcher/config/config.py` | `Config` |
| Config defaults | `gpt_researcher/config/variables/default.py` | `DEFAULT_CONFIG` |
| API server | `backend/server/app.py` | FastAPI `app` |
| Search engines | `gpt_researcher/retrievers/` | Various retrievers |

## Architecture Overview
```
User Query → GPTResearcher.__init__()
                  │
                  ▼
          choose_agent() → (agent_type, role_prompt)
                  │
                  ▼
          ResearchConductor.conduct_research()
            ├── plan_research() → sub_queries
            ├── For each sub_query:
            │     └── _process_sub_query() → context
            └── Aggregate contexts
                  │
                  ▼
          [Optional] ImageGenerator.plan_and_generate_images()
                  │
                  ▼
          ReportGenerator.write_report() → Markdown report
```

**For detailed architecture diagrams**: See [references/architecture.md](https://github.com/assafelovic/gpt-researcher/blob/HEAD/.claude/references/architecture.md)
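
The flow above can be sketched with plain `asyncio`. This is a minimal illustration, not the library's implementation: `plan_research`, `process_sub_query`, and `write_report` here are stand-in functions that return canned strings, and the fan-out with `asyncio.gather` is an assumption about how the parallel sub-query step could be expressed.

```python
import asyncio


async def plan_research(query: str) -> list[str]:
    # Stand-in planner: a real planner would ask an LLM for sub-queries.
    return [f"{query}: background", f"{query}: recent work", f"{query}: open problems"]


async def process_sub_query(sub_query: str) -> str:
    # Stand-in executor: a real one would search, scrape, and summarize.
    await asyncio.sleep(0)  # yield control, as network I/O would
    return f"context for '{sub_query}'"


async def conduct_research(query: str) -> list[str]:
    sub_queries = await plan_research(query)
    # Sub-queries are independent of each other, so they can run concurrently.
    return await asyncio.gather(*(process_sub_query(q) for q in sub_queries))


async def write_report(query: str, contexts: list[str]) -> str:
    # Stand-in publisher: joins the aggregated contexts into Markdown.
    body = "\n".join(f"- {c}" for c in contexts)
    return f"# {query}\n\n{body}"


async def run(query: str) -> str:
    contexts = await conduct_research(query)
    return await write_report(query, contexts)


report = asyncio.run(run("quantum computing"))
print(report)
```

`asyncio.gather` returns results in the order the sub-queries were submitted, so the aggregated contexts stay aligned with the plan even though the work overlaps.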
## Core Patterns
### Adding a New Feature (8-Step Pattern)
1. **Config** → Add to `gpt_researcher/config/variables/default.py`
2. **Provider** → Create in `gpt_researcher/llm_provider/my_feature/`
3. **Skill** → Create in `gpt_researcher/skills/my_feature.py`
4. **Agent** → Integrate in `gpt_researcher/agent.py`
5. **Prompts** → Update `gpt_researcher/prompts.py`
6. **WebSocket** → Events via `stream_output()`
7. **Frontend** → Handle events in `useWebSocket.ts`
8. **Docs** → Create `docs/docs/gpt-researcher/gptr/my_feature.md`
**For complete feature addition guide with Image Generation case study**: See [references/adding-features.md](https://github.com/assafelovic/gpt-researcher/blob/HEAD/.claude/references/adding-features.md)
### Adding a New Retriever
1. Create `gpt_researcher/retrievers/my_retriever/my_retriever.py`:

```python
class MyRetriever:
    def __init__(self, query: str, headers: dict | None = None):
        self.query = query
        self.headers = headers or {}

    async def search(self, max_results: int = 10) -> list[dict]:
        # Return: [{"title": str, "href": str, "body": str}]
        pass
```

2. Register it in `gpt_researcher/actions/retriever.py`:

```python
# Add this branch to the existing match statement.
case "my_retriever":
    from gpt_researcher.retrievers.my_retriever import MyRetriever
    return MyRetriever
```

3. Export it in `gpt_researcher/retrievers/__init__.py`.

**For complete retriever documentation**: See [references/retrievers.md](https://github.com/assafelovic/gpt-researcher/blob/HEAD/.claude/references/retrievers.md)
## Configuration
Config keys are **lowercased** when accessed:

```python
# In default.py: "SMART_LLM": "gpt-4o"
# Access as:     self.cfg.smart_llm  # lowercase!
```

Priority (highest first): Environment Variables → JSON Config File → Default Values
**For complete configuration reference**: See [references/config-reference.md](https://github.com/assafelovic/gpt-researcher/blob/HEAD/.claude/references/config-reference.md)
## Common Integration Points
### WebSocket Streaming
```python
class WebSocketHandler:
    async def send_json(self, data):
        print(f"[{data['type']}] {data.get('output', '')}")


researcher = GPTResearcher(query="...", websocket=WebSocketHandler())
```

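
Any object with an async `send_json` method can act as the handler, which makes it easy to capture events in tests. The sketch below collects messages in a list instead of printing them. `CollectingHandler` and `emit` are hypothetical names, and the `type`/`content`/`output` message keys are an assumption based on the fields used elsewhere in this document.

```python
import asyncio


class CollectingHandler:
    """Collects streamed messages instead of sending them over a socket."""

    def __init__(self):
        self.messages: list[dict] = []

    async def send_json(self, data: dict) -> None:
        self.messages.append(data)


async def emit(handler, type_: str, content: str, output: str) -> None:
    # Hypothetical stand-in for the library's streaming helper.
    await handler.send_json({"type": type_, "content": content, "output": output})


async def demo() -> list[dict]:
    handler = CollectingHandler()
    await emit(handler, "logs", "starting_research", "Starting research...")
    await emit(handler, "report", "chunk", "# Findings")
    return handler.messages


messages = asyncio.run(demo())
logs = [m["output"] for m in messages if m["type"] == "logs"]
print(logs)
```

Filtering on `type` afterwards separates progress logs from report chunks without changing the handler.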
### MCP Data Sources
```python
import os

researcher = GPTResearcher(
    query="Open source AI projects",
    mcp_configs=[{
        "name": "github",
        "command": "npx",
        "args": ["-y", "@modelcontextprotocol/server-github"],
        "env": {"GITHUB_TOKEN": os.getenv("GITHUB_TOKEN")},
    }],
    mcp_strategy="deep",  # or "fast", "disabled"
)
```

**For MCP integration details**: See [references/mcp.md](https://github.com/assafelovic/gpt-researcher/blob/HEAD/.claude/references/mcp.md)
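
Because `os.getenv` returns `None` when a variable is unset, it can help to build the `mcp_configs` list conditionally. The helper below is a hypothetical sketch: `build_mcp_configs` and the `required_env` field are made up for illustration and are not part of GPT Researcher. The server entry itself mirrors the example above.

```python
def build_mcp_configs(env: dict[str, str]) -> list[dict]:
    """Return MCP server configs, skipping servers whose credentials are missing."""
    candidates = [
        {
            "name": "github",
            "command": "npx",
            "args": ["-y", "@modelcontextprotocol/server-github"],
            "required_env": ["GITHUB_TOKEN"],  # hypothetical bookkeeping field
        },
    ]
    configs = []
    for server in candidates:
        required = server["required_env"]
        # Skip the server if any required variable is unset or empty.
        if any(not env.get(key) for key in required):
            continue
        configs.append({
            "name": server["name"],
            "command": server["command"],
            "args": server["args"],
            "env": {key: env[key] for key in required},
        })
    return configs


with_token = build_mcp_configs({"GITHUB_TOKEN": "ghp_example"})
without_token = build_mcp_configs({})
print(len(with_token), len(without_token))  # 1 0
```

In real use the argument would be `dict(os.environ)`. An empty result is a signal to fall back to web-only research rather than launching a server without credentials.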
### Deep Research Mode
```python
researcher = GPTResearcher(
    query="Comprehensive analysis of quantum computing",
    report_type="deep",  # Triggers recursive tree-like exploration
)
```

**For deep research configuration**: See [references/deep-research.md](https://github.com/assafelovic/gpt-researcher/blob/HEAD/.claude/references/deep-research.md)
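
To make "recursive tree-like exploration" concrete, here is a toy model of the idea. It assumes a fixed `breadth` (follow-up questions per node) and `depth` (levels of recursion). Both parameter names and the `expand` stub are assumptions for illustration, and the real mode may shape the tree differently.

```python
import asyncio


async def expand(query: str, breadth: int) -> list[str]:
    # Stand-in for LLM-generated follow-up questions.
    return [f"{query} / follow-up {i + 1}" for i in range(breadth)]


async def deep_research(query: str, breadth: int, depth: int) -> dict:
    """Explore a query tree: `breadth` children per node, `depth` levels deep."""
    node = {"query": query, "children": []}
    if depth == 0:
        return node
    sub_queries = await expand(query, breadth)
    # Sibling branches are independent, so explore them concurrently.
    node["children"] = await asyncio.gather(
        *(deep_research(q, breadth, depth - 1) for q in sub_queries)
    )
    return node


def count_nodes(node: dict) -> int:
    return 1 + sum(count_nodes(child) for child in node["children"])


tree = asyncio.run(deep_research("quantum computing", breadth=2, depth=2))
print(count_nodes(tree))  # 1 + 2 + 4 = 7
```

The node count grows geometrically with depth, which is why deep mode costs noticeably more time and tokens than a standard report.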
## Error Handling
Always use graceful degradation in skills:
```python
async def execute(self, *args, **kwargs):
    if not self.is_enabled():
        return []  # Don't crash

    try:
        result = await self.provider.execute(*args, **kwargs)
        return result
    except Exception as e:
        # Report the failure to the client, then fall back to an empty result.
        await stream_output("logs", "error", f"⚠️ {e}", self.websocket)
        return []  # Graceful degradation
```
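
The pattern can be exercised end to end with a provider that always fails. In this sketch, `FlakyProvider` and `SketchSkill` are hypothetical classes, and errors are appended to a list as a stand-in for streaming them to a WebSocket.

```python
import asyncio


class FlakyProvider:
    async def execute(self, prompt: str) -> list[str]:
        raise RuntimeError("provider unavailable")


class SketchSkill:
    def __init__(self, provider, enabled: bool = True):
        self.provider = provider
        self.enabled = enabled
        self.errors: list[str] = []  # stands in for streaming to a websocket

    def is_enabled(self) -> bool:
        return self.enabled

    async def execute(self, prompt: str) -> list[str]:
        if not self.is_enabled():
            return []  # feature switched off: nothing to do
        try:
            return await self.provider.execute(prompt)
        except Exception as e:
            self.errors.append(str(e))  # surface the failure without raising
            return []  # caller continues with an empty result


skill = SketchSkill(FlakyProvider())
result = asyncio.run(skill.execute("draw a chart"))
print(result, skill.errors)  # [] ['provider unavailable']
```

The caller always receives a list, so the research run can finish its report even when an optional feature fails.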