SKILL.md
$27
"1. PRIME" [style=filled, fillcolor="#e8e8ff"];
"2. WAVE 1: Broad Sweep" [style=filled, fillcolor="#ffe8e8"];
"3. GAP ANALYSIS" [style=filled, fillcolor="#fff8e0"];
"4. WAVE 2+: Targeted" [style=filled, fillcolor="#ffe8e8"];
"5. SYNTHESIZE" [style=filled, fillcolor="#e8ffe8"];
"6. DECIDE & RECORD" [style=filled, fillcolor="#e8e8ff"];
"1. PRIME" -> "2. WAVE 1: Broad Sweep";
"2. WAVE 1: Broad Sweep" -> "3. GAP ANALYSIS";
"3. GAP ANALYSIS" -> "4. WAVE 2+: Targeted";
"4. WAVE 2+: Targeted" -> "3. GAP ANALYSIS" [label="still gaps", style=dashed];
"3. GAP ANALYSIS" -> "5. SYNTHESIZE" [label="coverage sufficient"];
"5. SYNTHESIZE" -> "6. DECIDE & RECORD";
}
---
## Phase 1: PRIME
Lean on existing knowledge before spawning agents. Re-running research that already lives in Sibyl burns tokens and produces duplicate entries.
### Common moves
- **Search Sibyl first:** `sibyl search "<research topic>"`, `sibyl search "<related technology>"`, `sibyl search "<prior decision in this area>"`. Surface what's already known before generating new findings.
- **Check for staleness.** Fast-moving topics (frameworks, models, cloud services) usually warrant re-research even when Sibyl has recent entries; treat the existing knowledge as a baseline. Stable topics with recent entries often don't need a fresh pass at all.
- **Sharpen the research question.** "Research databases" is too vague to dispatch on. "Compare PostgreSQL vs CockroachDB for multi-region write-heavy workloads with <10ms p99 latency" gives agents enough scope to do useful work.
- **Calibrate the research budget** to the decision the research is feeding:
| Depth | Agents | Time | When |
| -------------- | ------ | --------- | -------------------------------------------- |
| **Quick scan** | 2-3 | 2-5 min | Known domain, just need latest info |
| **Standard** | 5-10 | 10-15 min | Technology evaluation, architecture options |
| **Deep dive** | 10-30 | 20-40 min | Greenfield decisions, SOTA analysis |
| **Exhaustive** | 30-60+ | 40-90 min | New project inception, competitive landscape |
### Source quality contract
This bit is non-negotiable: the value of research collapses when claims rest on stale blog posts. Specific claim types deserve specific source standards:
| Claim type | Preferred source |
| ----------------------- | ------------------------------------------------ |
| Current version | Package registry, release page, or official CLI |
| CLI flags / config keys | Official docs or local `--help` output |
| Security frameworks | OWASP, NIST, SLSA/OpenSSF, CIS, ISO, PCI sources |
| Cloud/provider behavior | Provider docs and current changelog |
| Research papers / SOTA | Paper, benchmark repo, or authors' artifact |
| Community health | Repository activity plus issue/release cadence |
When primary sources disagree with secondary ones, trust the primary source and note the discrepancy. Date volatile facts explicitly, and prefer commands/sources the next agent can rerun over screenshots that go stale.
---
## Phase 2: WAVE 1: Broad Sweep
Deploy the first wave of agents across the full research surface. The goal is breadth; accept that some agents will produce mediocre output, that's what gap analysis is for.
### What good agent prompts have
Vague prompts produce vague research. Each agent benefits from:
- **One specific topic** (not "research everything about X")
- **An output file path** (no ambiguity about where to write)
- **Search hints** (include year: "search [topic] 2026")
- **8-12 numbered coverage items** that scope the research precisely
- **Source quality guidance** ("prefer official docs and GitHub repos over blog posts")
### Wave 1 Template
Research [SPECIFIC_TOPIC] for [PROJECT/DECISION].
Create a research doc at docs/research/[filename].md covering:
- Current state (latest version, recent changes)
- [Specific capability A relevant to our use case]
- [Specific capability B]
- [Integration with our stack: list specific technologies]
- Performance characteristics / benchmarks
- Known limitations and gotchas
- Community health (stars, activity, maintenance)
- Comparison with alternatives (name 2-3 specific alternatives)
Use WebSearch for current information. Include dates on all facts.
Cite sources with URLs.
### Deployment notes
- **Background by default.** Research agents have no inter-dependencies, so foreground execution serializes work that should run in parallel.
- **3-4 seconds between dispatches** avoids rate limiting in practice. Tighter cadences sometimes work, sometimes hit limits, so pace yourself.
- **One file per agent.** Shared outputs create write contention and lose attribution.
- **Group by theme** when researching many topics. 12 separate dispatches become 3-4 thematic clusters with clearer synthesis later.
### Coverage Strategy
For technology evaluations, cover these dimensions:
Dimension
Question
**Capability**
Does it do what we need?
**Performance**
Is it fast enough?
**Ecosystem**
Does it integrate with our stack?
**Maturity**
Is it production-ready?
**Community**
Will it be maintained in 2 years?
**Cost**
What does it cost at our scale?
**Migration**
How hard is it to adopt/abandon?
## Phase 3: GAP ANALYSIS
After Wave 1, look for what's missing before synthesizing. Premature synthesis is the most common research failure: the answer feels obvious after three docs and turns out to be wrong after eight.
### What to look for
- **Coverage gaps**: dimensions the wave didn't touch, missing comparisons, questions raised but not answered
- **Contradictions**: agents reaching different conclusions on the same question (often signal for verification agents)
- **Bias signals**: all-positive findings (suspicious, look for failure cases), only-official-docs (need community experience), same sources cited repeatedly (need source diversity)
### Decision Point
Finding
Action
Good coverage, minor gaps
Synthesize now, note gaps
Significant gaps
Deploy Wave 2 targeted agents
Contradictory findings
Deploy verification agents to resolve
Entirely new direction emerged
Deploy Wave 2 in new direction
## Phase 4: WAVE 2+: Targeted Research
Fill specific gaps identified in the analysis. Wave 2 agents differ from Wave 1 in shape:
- **Smaller scope**: one specific question per agent
- **Higher quality bar**: "find production experience reports, not just docs"
- **Cross-reference prompts**: "Agent X found [claim], verify against [alternative source]"
- **Deep reads**: "Read the full README and API docs for [library], not just the landing page"
### When to stop
Stop deploying waves when the research question can be answered with confidence, when additional agents would produce diminishing returns, when key claims have 2+ independent sources, or when the user signals "enough, let's decide."
Three waves is usually the practical ceiling. Past that, more research rarely sharpens the answer; it usually means the question itself needs reframing.
## Phase 5: SYNTHESIZE
**Combine all findings into actionable intelligence. This is where the magic happens.**
### Synthesis Structure
Research: [Topic]
TL;DR
[2-3 sentences. The answer, not the journey.]
Recommendation
[Clear choice with justification. Don't hedge, pick one.]
Options Evaluated
| Option | Fit | Maturity | Perf | Ecosystem | Verdict |
|---|---|---|---|---|---|
| A | ... | ... | ... | ... | Best for [X] |
| B | ... | ... | ... | ... | Best for [Y] |
| C | ... | ... | ... | ... | Avoid: [reason] |
Key Findings
- [Most important finding with source]
- [Second most important]
- [Third most important]
Risks & Gotchas
- [Known issue or limitation]
- [Migration complexity]
- [Hidden cost]
Sources
- [Source 1](url): [what it contributed]
- [Source 2](url): [what it contributed]
### Synthesis principles
- **Lead with the recommendation.** Forcing the reader to wade through findings to find the answer is the most common synthesis failure.
- **Separate facts from opinions.** "PostgreSQL supports JSONB" (fact) vs "PostgreSQL is better for this use case" (opinion backed by evidence). Both are useful; conflating them isn't.
- **Include dissenting evidence.** If one source contradicts the recommendation, name it. Cherry-picked synthesis is worse than no synthesis.
- **Date everything.** "As of [month] [year], [library] is at v4.2." Research spoils fast.
- **Note confidence level.** "High confidence: well-documented" / "Low confidence: based on one blog post" gives the reader the calibration they need.
## Phase 6: DECIDE & RECORD
**Lock in the decision and capture it for future sessions.**
### Actions
-
**Present the synthesis** to the user with a clear recommendation
-
**Record in Sibyl:**
sibyl add "Research: [topic]" "Evaluated [options]. Chose [X] because [reasons]. Key risk: [Y]. Sources: [primary URLs]. Date: [today]."
-
**Archive research docs**: keep the wave outputs for reference:
- If in a project: `docs/research/[topic]/`
- If general knowledge: Sibyl learning entry is sufficient
-
**Exit to next action:**
Next Step
When
`/hyperskills:brainstorm`
Research surfaced multiple viable approaches
`/hyperskills:plan`
Decision made, ready to decompose implementation
`/hyperskills:orchestrate`
Decision made, work is parallelizable
Direct implementation
Research confirmed a simple path
## Quick Research Mode
For focused questions that don't need the full wave protocol:
- **Search Sibyl** (always)
- **2-3 targeted searches** (WebSearch + WebFetch on key URLs)
- **Synthesize inline** (no separate docs)
- **Record if non-obvious** (Sibyl learning)
**Use when:** "What's the latest version of X?", "Does Y support Z?", "What's the recommended way to do W?"
## Research Patterns by Type
### Technology Evaluation
Wave 1: Official docs + GitHub README for each option (parallel)
Wave 2: Production experience + benchmarks (parallel)
Synthesize: Comparison matrix + recommendation
### Codebase Archaeology
Wave 1: Explore agents mapping each subsystem (parallel)
Wave 2: Grep for specific patterns / usage (parallel)
Synthesize: Architecture diagram + dependency map
### SOTA Analysis
Wave 1: WebSearch for latest papers, blog posts, releases (parallel)
Wave 2: Deep read the most relevant 3-5 sources (parallel)
Synthesize: What's genuinely novel vs rehashed + recommendation
### Competitive Landscape
Wave 1: Feature matrix for each competitor (parallel)
Wave 2: Pricing, community size, trajectory (parallel)
Synthesize: Positioning matrix + gap analysis