SKILL.md

digraph {

"1. PRIME" [style=filled, fillcolor="#e8e8ff"];

"2. WAVE 1: Broad Sweep" [style=filled, fillcolor="#ffe8e8"];

"3. GAP ANALYSIS" [style=filled, fillcolor="#fff8e0"];

"4. WAVE 2+: Targeted" [style=filled, fillcolor="#ffe8e8"];

"5. SYNTHESIZE" [style=filled, fillcolor="#e8ffe8"];

"6. DECIDE & RECORD" [style=filled, fillcolor="#e8e8ff"];

"1. PRIME" -> "2. WAVE 1: Broad Sweep";

"2. WAVE 1: Broad Sweep" -> "3. GAP ANALYSIS";

"3. GAP ANALYSIS" -> "4. WAVE 2+: Targeted";

"4. WAVE 2+: Targeted" -> "3. GAP ANALYSIS" [label="still gaps", style=dashed];

"3. GAP ANALYSIS" -> "5. SYNTHESIZE" [label="coverage sufficient"];

"5. SYNTHESIZE" -> "6. DECIDE & RECORD";

}

---

## Phase 1: PRIME

Lean on existing knowledge before spawning agents. Re-running research that already lives in Sibyl burns tokens and produces duplicate entries.

### Common moves

- **Search Sibyl first:** `sibyl search "<research topic>"`, `sibyl search "<related technology>"`, `sibyl search "<prior decision in this area>"`. Surface what's already known before generating new findings.

- **Check for staleness.** Fast-moving topics (frameworks, models, cloud services) usually warrant re-research even when Sibyl has recent entries; treat the existing knowledge as a baseline. Stable topics with recent entries often don't need a fresh pass at all.

- **Sharpen the research question.** "Research databases" is too vague to dispatch on. "Compare PostgreSQL vs CockroachDB for multi-region write-heavy workloads with <10ms p99 latency" gives agents enough scope to do useful work.

- **Calibrate the research budget** to the decision the research is feeding:

   | Depth          | Agents | Time      | When                                         |
   | -------------- | ------ | --------- | -------------------------------------------- |
   | **Quick scan** | 2-3    | 2-5 min   | Known domain, just need latest info          |
   | **Standard**   | 5-10   | 10-15 min | Technology evaluation, architecture options  |
   | **Deep dive**  | 10-30  | 20-40 min | Greenfield decisions, SOTA analysis          |
   | **Exhaustive** | 30-60+ | 40-90 min | New project inception, competitive landscape |
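One way to sanity-check a planned dispatch against the budget table is to look up the smallest tier that fits the agent count. This is a minimal sketch, not part of the skill itself: the `Budget` class and `tier_for` helper are hypothetical names, and treating the open-ended "30-60+" range as a nominal cap of 60 is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Budget:
    depth: str
    max_agents: int   # upper end of the agent range in the table
    max_minutes: int  # upper end of the time range in the table


# Upper bounds copied from the table; "30-60+" is open-ended,
# so 60 is only a nominal cap here (assumption).
BUDGETS = (
    Budget("Quick scan", 3, 5),
    Budget("Standard", 10, 15),
    Budget("Deep dive", 30, 40),
    Budget("Exhaustive", 60, 90),
)


def tier_for(agent_count: int) -> Budget:
    """Return the smallest tier whose agent ceiling fits agent_count."""
    if agent_count < 1:
        raise ValueError("agent_count must be at least 1")
    for budget in BUDGETS:
        if agent_count <= budget.max_agents:
            return budget
    return BUDGETS[-1]  # anything past the nominal cap is still "Exhaustive"
```

If a "just need the latest info" question maps to `tier_for(25)`, which is the "Deep dive" tier, the budget is probably out of proportion to the decision.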

### Source quality contract

This bit is non-negotiable: the value of research collapses when claims rest on stale blog posts. Specific claim types deserve specific source standards:

| Claim type              | Preferred source                                 |
| ----------------------- | ------------------------------------------------ |
| Current version         | Package registry, release page, or official CLI  |
| CLI flags / config keys | Official docs or local `--help` output           |
| Security frameworks     | OWASP, NIST, SLSA/OpenSSF, CIS, ISO, PCI sources |
| Cloud/provider behavior | Provider docs and current changelog              |
| Research papers / SOTA  | Paper, benchmark repo, or authors' artifact      |
| Community health        | Repository activity plus issue/release cadence   |

When primary sources disagree with secondary ones, trust the primary source and note the discrepancy. Date volatile facts explicitly, and prefer commands/sources the next agent can rerun over screenshots that go stale.

---

## Phase 2: WAVE 1: Broad Sweep

Deploy the first wave of agents across the full research surface. The goal is breadth. Accept that some agents will produce mediocre output; that's what gap analysis is for.

### What good agent prompts have

Vague prompts produce vague research. Each agent benefits from:

- **One specific topic** (not "research everything about X")

- **An output file path** (no ambiguity about where to write)

- **Search hints** (include year: "search [topic] 2026")

- **8-12 numbered coverage items** that scope the research precisely

- **Source quality guidance** ("prefer official docs and GitHub repos over blog posts")

### Wave 1 Template

Research [SPECIFIC_TOPIC] for [PROJECT/DECISION].

Create a research doc at docs/research/[filename].md covering:

1. Current state (latest version, recent changes)
2. [Specific capability A relevant to our use case]
3. [Specific capability B]
4. [Integration with our stack: list specific technologies]
5. Performance characteristics / benchmarks
6. Known limitations and gotchas
7. Community health (stars, activity, maintenance)
8. Comparison with alternatives (name 2-3 specific alternatives)

Use WebSearch for current information. Include dates on all facts.
Cite sources with URLs.
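The template can also be filled in programmatically, which makes the "8-12 numbered coverage items" rule easy to enforce. The sketch below is illustrative only: `build_wave1_prompt` and its parameters are hypothetical names, and the wording simply mirrors the template above.

```python
def build_wave1_prompt(
    topic: str,
    decision: str,
    filename: str,
    coverage_items: list[str],
    year: int,
) -> str:
    """Fill in the Wave 1 template for one agent (hypothetical helper)."""
    if not 8 <= len(coverage_items) <= 12:
        raise ValueError(
            f"expected 8-12 coverage items, got {len(coverage_items)}"
        )
    # Number the coverage items so the agent can be held to each one.
    numbered = "\n".join(
        f"{i}. {item}" for i, item in enumerate(coverage_items, start=1)
    )
    return (
        f"Research {topic} for {decision}.\n\n"
        f"Create a research doc at docs/research/{filename}.md covering:\n\n"
        f"{numbered}\n\n"
        f'Use WebSearch for current information (search "{topic} {year}"). '
        "Include dates on all facts.\n"
        "Cite sources with URLs."
    )
```

Rejecting prompts with too few coverage items up front is cheaper than discovering a thin research doc during gap analysis.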


### Deployment notes

- **Background by default.** Research agents have no inter-dependencies, so foreground execution serializes work that should run in parallel.

- **3-4 seconds between dispatches** avoids rate limiting in practice. Tighter cadences sometimes work, sometimes hit limits, so pace yourself.

- **One file per agent.** Shared outputs create write contention and lose attribution.

- **Group by theme** when researching many topics. 12 separate dispatches become 3-4 thematic clusters with clearer synthesis later.
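The pacing and one-file-per-agent notes can be sketched as a small dispatcher. Everything here is an assumption for illustration: `dispatch` stands in for whatever call starts a background agent in your environment, and the 3.5 second default is just the midpoint of the 3-4 second guidance.

```python
import time
from typing import Callable, Sequence


def dispatch_staggered(
    jobs: Sequence[tuple[str, str]],         # (output_path, prompt) pairs
    dispatch: Callable[[str, str], object],  # hypothetical: starts one background agent
    delay_seconds: float = 3.5,
    sleep: Callable[[float], None] = time.sleep,
) -> list[object]:
    """Start one agent per job, pausing between dispatches."""
    paths = [path for path, _ in jobs]
    if len(set(paths)) != len(paths):
        raise ValueError("each agent needs its own output file")
    handles = []
    for index, (path, prompt) in enumerate(jobs):
        if index > 0:
            sleep(delay_seconds)  # pause between dispatches, not before the first
        handles.append(dispatch(path, prompt))
    return handles
```

The `sleep` parameter is injectable so the pacing can be tested without waiting; in normal use the default `time.sleep` applies.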

### Coverage Strategy

For technology evaluations, cover these dimensions:

| Dimension       | Question                          |
| --------------- | --------------------------------- |
| **Capability**  | Does it do what we need?          |
| **Performance** | Is it fast enough?                |
| **Ecosystem**   | Does it integrate with our stack? |
| **Maturity**    | Is it production-ready?           |
| **Community**   | Will it be maintained in 2 years? |
| **Cost**        | What does it cost at our scale?   |
| **Migration**   | How hard is it to adopt/abandon?  |

## Phase 3: GAP ANALYSIS

After Wave 1, look for what's missing before synthesizing. Premature synthesis is the most common research failure: the answer feels obvious after three docs and turns out to be wrong after eight.

### What to look for

- **Coverage gaps**: dimensions the wave didn't touch, missing comparisons, questions raised but not answered

- **Contradictions**: agents reaching different conclusions on the same question (often a signal to deploy verification agents)

- **Bias signals**: all-positive findings (suspicious, look for failure cases), only-official-docs (need community experience), same sources cited repeatedly (need source diversity)
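Two of these checks are mechanical enough to sketch in code: which evaluation dimensions no doc touched, and which sources are cited by most of the docs. The function names and data shapes are hypothetical, and the 50% threshold for "cited repeatedly" is an assumption rather than a rule from this skill.

```python
from collections import Counter

# Evaluation dimensions from the Coverage Strategy section.
DIMENSIONS = (
    "Capability", "Performance", "Ecosystem", "Maturity",
    "Community", "Cost", "Migration",
)


def coverage_gaps(covered_by_doc: dict[str, set[str]]) -> list[str]:
    """Dimensions that no Wave 1 doc touched, in table order."""
    covered = set().union(*covered_by_doc.values())
    return [d for d in DIMENSIONS if d not in covered]


def overused_sources(
    sources_by_doc: dict[str, list[str]], max_share: float = 0.5
) -> list[str]:
    """Sources cited by more than max_share of the docs (diversity warning)."""
    if not sources_by_doc:
        return []
    # Count each source once per doc, however often that doc cites it.
    counts = Counter(
        url for urls in sources_by_doc.values() for url in set(urls)
    )
    limit = max_share * len(sources_by_doc)
    return sorted(url for url, n in counts.items() if n > limit)
```

Contradictions and all-positive findings still need a human or agent read; this only surfaces the gaps that can be counted.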

### Decision Point

| Finding                        | Action                                |
| ------------------------------ | ------------------------------------- |
| Good coverage, minor gaps      | Synthesize now, note gaps             |
| Significant gaps               | Deploy Wave 2 targeted agents         |
| Contradictory findings         | Deploy verification agents to resolve |
| Entirely new direction emerged | Deploy Wave 2 in new direction        |

## Phase 4: WAVE 2+: Targeted Research

Fill specific gaps identified in the analysis. Wave 2 agents differ from Wave 1 in shape:

- **Smaller scope**: one specific question per agent

- **Higher quality bar**: "find production experience reports, not just docs"

- **Cross-reference prompts**: "Agent X found [claim], verify against [alternative source]"

- **Deep reads**: "Read the full README and API docs for [library], not just the landing page"

### When to stop

Stop deploying waves when the research question can be answered with confidence, when additional agents would produce diminishing returns, when key claims have 2+ independent sources, or when the user signals "enough, let's decide."

Three waves is usually the practical ceiling. Past that, more research rarely sharpens the answer; it usually means the question itself needs reframing.
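The countable parts of this stopping rule can be written down as a check. This is a sketch under stated assumptions: `should_stop` is a hypothetical helper, "confidence" and "diminishing returns" remain judgment calls that it does not model, and the wave ceiling is treated as a hard limit here even though the text calls it a usual one.

```python
def should_stop(
    waves_run: int,
    sources_by_claim: dict[str, set[str]],  # key claim -> independent sources
    user_said_enough: bool = False,
    min_sources: int = 2,
    max_waves: int = 3,
) -> tuple[bool, str]:
    """Return (stop, reason) for the measurable stopping conditions."""
    if user_said_enough:
        return True, "user asked to decide"
    if sources_by_claim and all(
        len(sources) >= min_sources for sources in sources_by_claim.values()
    ):
        return True, "key claims are independently corroborated"
    if waves_run >= max_waves:
        return True, "wave ceiling reached; consider reframing the question"
    return False, "key claims still under-sourced"
```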

## Phase 5: SYNTHESIZE

**Combine all findings into a recommendation the reader can act on. This is the step that turns raw research into a decision.**

### Synthesis Structure

**Research: [Topic]**

**TL;DR**

[2-3 sentences. The answer, not the journey.]

**Recommendation**

[Clear choice with justification. Don't hedge, pick one.]

**Options Evaluated**

| Option | Fit | Maturity | Perf | Ecosystem | Verdict         |
| ------ | --- | -------- | ---- | --------- | --------------- |
| A      | ... | ...      | ...  | ...       | Best for [X]    |
| B      | ... | ...      | ...  | ...       | Best for [Y]    |
| C      | ... | ...      | ...  | ...       | Avoid: [reason] |

**Key Findings**

1. [Most important finding with source]
2. [Second most important]
3. [Third most important]

**Risks & Gotchas**

- [Known issue or limitation]
- [Migration complexity]
- [Hidden cost]

**Sources**

- [Source 1](url): [what it contributed]
- [Source 2](url): [what it contributed]


### Synthesis principles

- **Lead with the recommendation.** Forcing the reader to wade through findings to find the answer is the most common synthesis failure.

- **Separate facts from opinions.** "PostgreSQL supports JSONB" (fact) vs "PostgreSQL is better for this use case" (opinion backed by evidence). Both are useful; conflating them isn't.

- **Include dissenting evidence.** If one source contradicts the recommendation, name it. Cherry-picked synthesis is worse than no synthesis.

- **Date everything.** "As of [month] [year], [library] is at v4.2." Research spoils fast.

- **Note confidence level.** "High confidence: well-documented" / "Low confidence: based on one blog post" gives the reader the calibration they need.
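Several of these principles (fact vs opinion, dating, confidence) amount to fields that every finding should carry. A minimal sketch of that idea follows; the `Finding` class, its field names, and the `render` helper are all hypothetical, not a format this skill prescribes.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Finding:
    claim: str
    source_url: str
    as_of: date           # when the claim was verified
    confidence: str       # e.g. "High" or "Low"
    basis: str            # why that confidence level applies
    is_opinion: bool = False


def render(finding: Finding) -> str:
    """Format one finding with its kind, date, source, and confidence."""
    kind = "Opinion" if finding.is_opinion else "Fact"
    stamp = f"{finding.as_of:%B %Y}"  # month name and year
    return (
        f"{kind}: As of {stamp}, {finding.claim} ({finding.source_url}). "
        f"{finding.confidence} confidence: {finding.basis}."
    )
```

Making the date and confidence required fields means a finding without them cannot be constructed, which is a stronger guarantee than a checklist.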

## Phase 6: DECIDE & RECORD

**Lock in the decision and capture it for future sessions.**

### Actions

- **Present the synthesis** to the user with a clear recommendation

- **Record in Sibyl:** `sibyl add "Research: [topic]" "Evaluated [options]. Chose [X] because [reasons]. Key risk: [Y]. Sources: [primary URLs]. Date: [today]."`

- **Archive research docs**: keep the wave outputs for reference:

  - If in a project: `docs/research/[topic]/`

  - If general knowledge: Sibyl learning entry is sufficient

- **Exit to next action:**

   | Next Step                  | When                                             |
   | -------------------------- | ------------------------------------------------ |
   | `/hyperskills:brainstorm`  | Research surfaced multiple viable approaches     |
   | `/hyperskills:plan`        | Decision made, ready to decompose implementation |
   | `/hyperskills:orchestrate` | Decision made, work is parallelizable            |
   | Direct implementation      | Research confirmed a simple path                 |
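For the "Record in Sibyl" step, the entry can be assembled from structured fields so that no part of the record is forgotten. This sketch only builds the argument list for the `sibyl add "<title>" "<body>"` form shown above and does not run it; `sibyl_add_command` and its parameters are hypothetical names.

```python
import shlex
from datetime import date


def sibyl_add_command(
    topic: str,
    options: list[str],
    choice: str,
    reasons: str,
    key_risk: str,
    sources: list[str],
    today: date,
) -> list[str]:
    """Build argv for the `sibyl add` form shown in this section."""
    title = f"Research: {topic}"
    body = (
        f"Evaluated {', '.join(options)}. Chose {choice} because {reasons}. "
        f"Key risk: {key_risk}. Sources: {', '.join(sources)}. "
        f"Date: {today.isoformat()}."
    )
    return ["sibyl", "add", title, body]


def as_shell(argv: list[str]) -> str:
    """Quote argv for copy-pasting into a shell."""
    return shlex.join(argv)
```

Passing the list to a process runner instead of a shell string avoids quoting problems when the reasons or URLs contain quotes or spaces.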

## Quick Research Mode

For focused questions that don't need the full wave protocol:

- **Search Sibyl** (always)

- **2-3 targeted searches** (WebSearch + WebFetch on key URLs)

- **Synthesize inline** (no separate docs)

- **Record if non-obvious** (Sibyl learning)

**Use when:** "What's the latest version of X?", "Does Y support Z?", "What's the recommended way to do W?"

## Research Patterns by Type

### Technology Evaluation

Wave 1: Official docs + GitHub README for each option (parallel)

Wave 2: Production experience + benchmarks (parallel)

Synthesize: Comparison matrix + recommendation


### Codebase Archaeology

Wave 1: Explore agents mapping each subsystem (parallel)

Wave 2: Grep for specific patterns / usage (parallel)

Synthesize: Architecture diagram + dependency map


### SOTA Analysis

Wave 1: WebSearch for latest papers, blog posts, releases (parallel)

Wave 2: Deep read the most relevant 3-5 sources (parallel)

Synthesize: What's genuinely novel vs rehashed + recommendation


### Competitive Landscape

Wave 1: Feature matrix for each competitor (parallel)

Wave 2: Pricing, community size, trajectory (parallel)

Synthesize: Positioning matrix + gap analysis
