gpt-researcher

GPT Researcher is an autonomous deep research agent that conducts web and local research, producing detailed reports with citations. Use this skill when…

INSTALLATION
npx skills add https://github.com/assafelovic/gpt-researcher --skill gpt-researcher
Run in your project or agent environment. Adjust flags if your CLI version differs.

SKILL.md

# GPT Researcher Development Skill

GPT Researcher is an LLM-based autonomous agent using a planner-executor-publisher pattern with parallelized agent work for speed and reliability.

## Quick Start

### Basic Python Usage

    from gpt_researcher import GPTResearcher
    import asyncio

    async def main():
        researcher = GPTResearcher(
            query="What are the latest AI developments?",
            report_type="research_report",  # or detailed_report, deep, outline_report
            report_source="web",  # or local, hybrid
        )
        await researcher.conduct_research()
        report = await researcher.write_report()
        print(report)

    asyncio.run(main())

### Run Servers

    # Backend
    python -m uvicorn backend.server.server:app --reload --port 8000

    # Frontend
    cd frontend/nextjs && npm install && npm run dev


## Key File Locations

| Need | Primary File | Key Classes |
|------|--------------|-------------|
| Main orchestrator | `gpt_researcher/agent.py` | `GPTResearcher` |
| Research logic | `gpt_researcher/skills/researcher.py` | `ResearchConductor` |
| Report writing | `gpt_researcher/skills/writer.py` | `ReportGenerator` |
| All prompts | `gpt_researcher/prompts.py` | `PromptFamily` |
| Configuration | `gpt_researcher/config/config.py` | `Config` |
| Config defaults | `gpt_researcher/config/variables/default.py` | `DEFAULT_CONFIG` |
| API server | `backend/server/app.py` | FastAPI `app` |
| Search engines | `gpt_researcher/retrievers/` | Various retrievers |

## Architecture Overview

    User Query → GPTResearcher.__init__()
        ↓
    choose_agent() → (agent_type, role_prompt)
        ↓
    ResearchConductor.conduct_research()
        ├── plan_research() → sub_queries
        ├── For each sub_query:
        │     └── _process_sub_query() → context
        └── Aggregate contexts
        ↓
    [Optional] ImageGenerator.plan_and_generate_images()
        ↓
    ReportGenerator.write_report() → Markdown report


**For detailed architecture diagrams**: See [references/architecture.md](https://github.com/assafelovic/gpt-researcher/blob/HEAD/.claude/references/architecture.md)
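
The flow above can be sketched with plain `asyncio`. This is a minimal illustration of the plan, parallel execute, aggregate, and write stages, not the library's actual code: `plan_research`, `process_sub_query`, `write_report`, and `run_pipeline` here are hypothetical stand-ins that return canned strings instead of calling an LLM or a search engine.

```python
import asyncio


async def plan_research(query: str) -> list[str]:
    # Hypothetical planner: a real one would ask an LLM for sub-queries.
    return [f"{query}: background", f"{query}: recent work", f"{query}: open problems"]


async def process_sub_query(sub_query: str) -> str:
    # Hypothetical executor: a real one would search and scrape sources.
    await asyncio.sleep(0)  # yield control, as real network I/O would
    return f"context for '{sub_query}'"


async def conduct_research(query: str) -> list[str]:
    sub_queries = await plan_research(query)
    # Run all sub-queries concurrently; gather preserves input order.
    return list(await asyncio.gather(*(process_sub_query(q) for q in sub_queries)))


async def write_report(query: str, contexts: list[str]) -> str:
    # Hypothetical publisher: a real one would prompt an LLM with the contexts.
    body = "\n".join(f"- {c}" for c in contexts)
    return f"# {query}\n\n{body}"


async def run_pipeline(query: str) -> str:
    contexts = await conduct_research(query)
    return await write_report(query, contexts)


if __name__ == "__main__":
    print(asyncio.run(run_pipeline("quantum computing")))
```

Because the sub-queries are independent, `asyncio.gather` lets them overlap on I/O, which is where the speed-up from parallelized agent work would come from in a setup like this.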

## Core Patterns

### Adding a New Feature (8-Step Pattern)

1. **Config** → Add to `gpt_researcher/config/variables/default.py`
2. **Provider** → Create in `gpt_researcher/llm_provider/my_feature/`
3. **Skill** → Create in `gpt_researcher/skills/my_feature.py`
4. **Agent** → Integrate in `gpt_researcher/agent.py`
5. **Prompts** → Update `gpt_researcher/prompts.py`
6. **WebSocket** → Events via `stream_output()`
7. **Frontend** → Handle events in `useWebSocket.ts`
8. **Docs** → Create `docs/docs/gpt-researcher/gptr/my_feature.md`

**For complete feature addition guide with Image Generation case study**: See [references/adding-features.md](https://github.com/assafelovic/gpt-researcher/blob/HEAD/.claude/references/adding-features.md)

### Adding a New Retriever

    # 1. Create: gpt_researcher/retrievers/my_retriever/my_retriever.py
    class MyRetriever:
        def __init__(self, query: str, headers: dict | None = None):
            self.query = query
            self.headers = headers or {}

        async def search(self, max_results: int = 10) -> list[dict]:
            # Return: [{"title": str, "href": str, "body": str}]
            raise NotImplementedError

    # 2. Register in gpt_researcher/actions/retriever.py
    #    (a branch inside the existing match statement)
    case "my_retriever":
        from gpt_researcher.retrievers.my_retriever import MyRetriever
        return MyRetriever

    # 3. Export in gpt_researcher/retrievers/__init__.py


**For complete retriever documentation**: See [references/retrievers.md](https://github.com/assafelovic/gpt-researcher/blob/HEAD/.claude/references/retrievers.md)
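
To make the retriever contract concrete, here is a small runnable sketch that follows the same interface. `InMemoryRetriever`, its `DOCUMENTS` list, and the example URLs are hypothetical: it filters a fixed list instead of calling a search API, so it only illustrates the constructor signature and the result shape.

```python
import asyncio
from typing import Optional


class InMemoryRetriever:
    """Hypothetical retriever that searches a fixed list instead of the web."""

    # Illustrative documents; a real retriever would call a search API here.
    DOCUMENTS = [
        {"title": "Intro to agents", "href": "https://example.com/agents", "body": "Agents plan and act."},
        {"title": "Search basics", "href": "https://example.com/search", "body": "Ranking and retrieval."},
        {"title": "Agent memory", "href": "https://example.com/memory", "body": "How agents store context."},
    ]

    def __init__(self, query: str, headers: Optional[dict] = None):
        self.query = query
        self.headers = headers or {}

    async def search(self, max_results: int = 10) -> list[dict]:
        needle = self.query.lower()
        # Case-insensitive substring match on title or body.
        hits = [
            doc for doc in self.DOCUMENTS
            if needle in doc["title"].lower() or needle in doc["body"].lower()
        ]
        # Each hit has the keys title, href, and body.
        return hits[:max_results]


if __name__ == "__main__":
    results = asyncio.run(InMemoryRetriever("agent").search(max_results=2))
    for r in results:
        print(r["title"], r["href"])
```

A stub like this can be handy for testing the rest of a pipeline offline before wiring in a real search backend.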

## Configuration

Config keys are defined in uppercase but **lowercased** when accessed as attributes:

    # In default.py: "SMART_LLM": "gpt-4o"
    # Access as: self.cfg.smart_llm  # lowercase!

Priority (highest first): Environment Variables → JSON Config File → Default Values

**For complete configuration reference**: See [references/config-reference.md](https://github.com/assafelovic/gpt-researcher/blob/HEAD/.claude/references/config-reference.md)
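
The lowercasing and the precedence order can be illustrated with a small stand-alone class. This is a sketch of the idea, not the library's `Config` implementation: `SketchConfig`, its constructor arguments, and the `MAX_ITERATIONS` sample value are hypothetical, and it assumes environment variables win over the JSON file, which wins over the defaults.

```python
import json
import os
from typing import Mapping, Optional

# Illustrative defaults only; the real ones live in default.py.
DEFAULT_CONFIG = {"SMART_LLM": "gpt-4o", "MAX_ITERATIONS": 3}


class SketchConfig:
    """Hypothetical loader: env vars > JSON file values > defaults."""

    def __init__(self, json_text: Optional[str] = None,
                 environ: Optional[Mapping[str, str]] = None):
        environ = os.environ if environ is None else environ
        file_values = json.loads(json_text) if json_text else {}
        for key, default in DEFAULT_CONFIG.items():
            value = file_values.get(key, default)
            if key in environ:
                # Environment values arrive as strings; naive cast to the default's type.
                value = type(default)(environ[key])
            # Keys are stored lowercased, so SMART_LLM becomes cfg.smart_llm.
            setattr(self, key.lower(), value)


if __name__ == "__main__":
    cfg = SketchConfig(
        json_text='{"SMART_LLM": "from-file", "MAX_ITERATIONS": 5}',
        environ={"SMART_LLM": "from-env"},
    )
    print(cfg.smart_llm)       # the environment value overrides the file
    print(cfg.max_iterations)  # the file value overrides the default
```

Passing `environ` explicitly keeps the sketch testable without touching the real process environment.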

## Common Integration Points

### WebSocket Streaming

    class WebSocketHandler:
        async def send_json(self, data):
            print(f"[{data['type']}] {data.get('output', '')}")

    researcher = GPTResearcher(query="...", websocket=WebSocketHandler())
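
A handler only needs an async `send_json` method, so it can be exercised without running any research. The sketch below collects messages in a list instead of printing them. `CollectingHandler`, `outputs`, and the simulated events are hypothetical, and the `type` and `output` fields are assumed from the handler above, not taken from a documented message schema.

```python
import asyncio


class CollectingHandler:
    """Hypothetical stand-in for a WebSocket: stores messages for later inspection."""

    def __init__(self):
        self.messages = []

    async def send_json(self, data: dict) -> None:
        self.messages.append(data)

    def outputs(self, message_type: str) -> list:
        # Return the 'output' field of every message with the given 'type'.
        return [m.get("output", "") for m in self.messages if m.get("type") == message_type]


async def demo() -> CollectingHandler:
    handler = CollectingHandler()
    # Simulated events; in real use the researcher would call send_json itself.
    await handler.send_json({"type": "logs", "output": "Planning research"})
    await handler.send_json({"type": "report", "output": "# Draft"})
    await handler.send_json({"type": "logs"})  # no 'output' field
    return handler


if __name__ == "__main__":
    print(asyncio.run(demo()).outputs("logs"))
```

Collecting events this way makes it straightforward to assert on progress logs in tests or to forward them to another transport later.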


### MCP Data Sources

    import os

    researcher = GPTResearcher(
        query="Open source AI projects",
        mcp_configs=[{
            "name": "github",
            "command": "npx",
            "args": ["-y", "@modelcontextprotocol/server-github"],
            "env": {"GITHUB_TOKEN": os.getenv("GITHUB_TOKEN")}
        }],
        mcp_strategy="deep",  # or "fast", "disabled"
    )


**For MCP integration details**: See [references/mcp.md](https://github.com/assafelovic/gpt-researcher/blob/HEAD/.claude/references/mcp.md)

### Deep Research Mode

    researcher = GPTResearcher(
        query="Comprehensive analysis of quantum computing",
        report_type="deep",  # Triggers recursive tree-like exploration
    )


**For deep research configuration**: See [references/deep-research.md](https://github.com/assafelovic/gpt-researcher/blob/HEAD/.claude/references/deep-research.md)

## Error Handling

Always use graceful degradation in skills:

    async def execute(self, *args, **kwargs):
        if not self.is_enabled():
            return []  # Don't crash

        try:
            result = await self.provider.execute(*args, **kwargs)
            return result
        except Exception as e:
            await stream_output("logs", "error", f"⚠️ {e}", self.websocket)
            return []  # Graceful degradation
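
The pattern above can be tried end to end with stand-ins. In this sketch, `MyFeatureSkill` and `FailingProvider` are hypothetical, and the local `stream_output` function only imitates the four-argument call used above. It is not the library helper, and its message fields are assumptions.

```python
import asyncio


async def stream_output(type_, content, output, websocket=None):
    # Local stand-in for the library helper: forward to the socket or print.
    if websocket is not None:
        await websocket.send_json({"type": type_, "content": content, "output": output})
    else:
        print(f"[{type_}:{content}] {output}")


class FailingProvider:
    """Hypothetical provider that always raises, to exercise the error path."""

    async def execute(self, prompt: str) -> list:
        raise RuntimeError("provider unavailable")


class MyFeatureSkill:
    def __init__(self, provider, enabled: bool = True, websocket=None):
        self.provider = provider
        self.enabled = enabled
        self.websocket = websocket

    def is_enabled(self) -> bool:
        return self.enabled

    async def execute(self, prompt: str) -> list:
        if not self.is_enabled():
            return []  # feature switched off: nothing to do
        try:
            return await self.provider.execute(prompt)
        except Exception as e:
            await stream_output("logs", "error", f"⚠️ {e}", self.websocket)
            return []  # degrade to an empty result instead of raising


if __name__ == "__main__":
    print(asyncio.run(MyFeatureSkill(FailingProvider()).execute("hello")))
```

Returning an empty list keeps the caller's aggregation code simple: a failed optional feature contributes nothing, and the error is still reported through the log stream.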
