SKILL.md

Core RAG

Fundamental patterns for retrieval, generation, and pipeline composition.

| Rule | File | Key Pattern |
|------|------|-------------|
| Basic RAG | rules/core-basic-rag.md | Retrieve + context + generate with citations |
| Hybrid Search | rules/core-hybrid-search.md | RRF fusion (k=60) for semantic + keyword |
| Context Management | rules/core-context-management.md | Token budgeting + sufficiency check |
| Pipeline Composition | rules/core-pipeline-composition.md | Composable Decompose → HyDE → Retrieve → Rerank |
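
The RRF fusion named in the Hybrid Search row can be sketched as follows. This is a minimal illustration, not the contents of rules/core-hybrid-search.md: the function name `rrf_fuse` and the sample document IDs are assumptions, and each input list is assumed to be ordered best-first.

```python
from collections import defaultdict


def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[tuple[str, float]]:
    """Reciprocal rank fusion over several best-first lists of document IDs."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            # Each list contributes 1 / (k + rank); k damps the effect of top ranks
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID so the output is deterministic
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical result lists from a semantic search and a keyword search
semantic = ["doc_a", "doc_b", "doc_c"]
keyword = ["doc_b", "doc_d", "doc_a"]
fused = rrf_fuse([semantic, keyword])
```

Because RRF works on ranks rather than raw scores, the two retrievers do not need comparable score scales.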

Embeddings

Embedding models, chunking strategies, and production optimization.

| Rule | File | Key Pattern |
|------|------|-------------|
| Models & API | rules/embeddings-models.md | Model selection, batch API, similarity |
| Chunking | rules/embeddings-chunking.md | Semantic boundary splitting, 512 token sweet spot |
| Advanced | rules/embeddings-advanced.md | Redis cache, Matryoshka dims, batch processing |
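
As a rough sketch of boundary-aware chunking with a token budget: the version below treats blank-line paragraph breaks as the "semantic boundary" and approximates token counts by whitespace splitting. Both are simplifying assumptions; a real pipeline would use the embedding model's tokenizer. The function name `chunk_by_paragraph` is hypothetical.

```python
def chunk_by_paragraph(text: str, max_tokens: int = 512) -> list[str]:
    """Pack whole paragraphs into chunks of at most max_tokens (approximate) tokens."""
    chunks: list[str] = []
    current: list[str] = []
    current_tokens = 0
    for para in (p.strip() for p in text.split("\n\n")):
        if not para:
            continue
        n_tokens = len(para.split())  # crude token estimate (assumption)
        # Close the current chunk if this paragraph would push it over budget
        if current and current_tokens + n_tokens > max_tokens:
            chunks.append("\n\n".join(current))
            current, current_tokens = [], 0
        current.append(para)
        current_tokens += n_tokens
    if current:
        chunks.append("\n\n".join(current))
    return chunks
```

Note that a single paragraph longer than `max_tokens` is kept whole as its own oversized chunk; splitting it further (for example by sentence) is left out of this sketch.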

Contextual Retrieval

Anthropic's context-prepending technique, reported to cut retrieval failures by up to 67% when combined with reranking.

| Rule | File | Key Pattern |
|------|------|-------------|
| Context Prepending | rules/contextual-prepend.md | LLM-generated context + prompt caching |
| Hybrid Search | rules/contextual-hybrid.md | 40% BM25 / 60% vector weight split |
| Complete Pipeline | rules/contextual-pipeline.md | End-to-end indexing + hybrid retrieval |
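
One way the 40% BM25 / 60% vector split could be applied is shown below. The weights come from the table above; the min-max normalization step and the function names are assumptions of this sketch, since raw BM25 scores and vector similarities are not on the same scale.

```python
def min_max(scores: dict[str, float]) -> dict[str, float]:
    """Rescale scores to [0, 1]; a constant score set maps to 1.0 for every doc."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc_id: 1.0 for doc_id in scores}
    return {doc_id: (s - lo) / (hi - lo) for doc_id, s in scores.items()}


def weighted_hybrid(
    bm25: dict[str, float],
    vector: dict[str, float],
    bm25_weight: float = 0.4,
    vector_weight: float = 0.6,
) -> list[tuple[str, float]]:
    """Combine normalized BM25 and vector scores; a doc missing from one side scores 0 there."""
    bm25_n, vector_n = min_max(bm25), min_max(vector)
    combined = {
        doc_id: bm25_weight * bm25_n.get(doc_id, 0.0)
        + vector_weight * vector_n.get(doc_id, 0.0)
        for doc_id in set(bm25_n) | set(vector_n)
    }
    return sorted(combined.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical raw scores from each retriever
ranked = weighted_hybrid(
    {"a": 10.0, "b": 5.0, "c": 0.0},
    {"a": 0.2, "b": 0.9, "d": 0.5},
)
```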

HyDE

Hypothetical Document Embeddings for bridging vocabulary gaps.

| Rule | File | Key Pattern |
|------|------|-------------|
| Generation | rules/hyde-generation.md | Embed hypothetical doc, not query |
| Per-Concept | rules/hyde-per-concept.md | Parallel HyDE for multi-topic queries |
| Fallback | rules/hyde-fallback.md | 2-3s timeout → direct embedding fallback |

Agentic RAG

Self-correcting retrieval with LLM-driven decision making.

| Rule | File | Key Pattern |
|------|------|-------------|
| Self-RAG | rules/agentic-self-rag.md | Binary document grading for relevance |
| Corrective RAG | rules/agentic-corrective-rag.md | CRAG workflow with web fallback |
| Knowledge Graph | rules/agentic-knowledge-graph.md | KG + vector hybrid for entity-rich domains |
| Adaptive Retrieval | rules/agentic-adaptive-retrieval.md | Query routing to optimal strategy |
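
A corrective loop with binary grading, a bounded number of query rewrites, and a fallback could be sketched as below. All five callables (`retrieve`, `grade`, `rewrite`, `fallback`) are hypothetical placeholders for your retriever, LLM grader, LLM rewriter, and web search; this is not the workflow from the rule files.

```python
from typing import Callable


def corrective_retrieve(
    query: str,
    retrieve: Callable[[str], list[str]],
    grade: Callable[[str, str], bool],
    rewrite: Callable[[str], str],
    fallback: Callable[[str], list[str]],
    max_rewrites: int = 2,
) -> list[str]:
    """Retrieve, keep only documents graded relevant, rewrite on failure, then fall back."""
    current = query
    for attempt in range(max_rewrites + 1):
        # Binary grading: each document is either kept or dropped
        relevant = [doc for doc in retrieve(current) if grade(query, doc)]
        if relevant:
            return relevant
        if attempt < max_rewrites:
            current = rewrite(current)
    # Retry budget exhausted: hand off to the fallback (e.g. web search)
    return fallback(query)
```

The explicit `max_rewrites` bound is what prevents an endless rewrite loop, and the unconditional `fallback` call guarantees the function always returns.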

Multimodal RAG

Image + text retrieval with cross-modal search.

| Rule | File | Key Pattern |
|------|------|-------------|
| Embeddings | rules/multimodal-embeddings.md | CLIP, SigLIP 2, Voyage multimodal-3 |
| Chunking | rules/multimodal-chunking.md | PDF extraction preserving images |
| Pipeline | rules/multimodal-pipeline.md | Dedup + hybrid retrieval + generation |

Query Decomposition

Breaking complex queries into concepts for parallel retrieval.

| Rule | File | Key Pattern |
|------|------|-------------|
| Detection | rules/query-detection.md | Heuristic indicators (<1ms fast path) |
| Decompose + RRF | rules/query-decompose.md | LLM concept extraction + parallel retrieval |
| HyDE Combo | rules/query-hyde-combo.md | Decompose + HyDE for maximum coverage |
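
A heuristic fast path for detecting multi-concept queries might be as simple as a precompiled regex. The specific indicators below (conjunctions, comparison words, more than one question mark) are assumptions chosen for illustration; the source only says that heuristic indicators are used.

```python
import re

# Hypothetical indicators: conjunctions and comparison phrases, matched as whole words
_CONJUNCTIONS = re.compile(
    r"\b(and|or|vs|versus|as well as|compared (?:to|with))\b",
    re.IGNORECASE,
)


def looks_multi_concept(query: str) -> bool:
    """Cheap check that decides whether LLM decomposition is worth invoking."""
    return bool(_CONJUNCTIONS.search(query)) or query.count("?") > 1
```

Only queries for which this returns `True` would be sent to the slower LLM-based concept extraction.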

Reranking

Post-retrieval re-scoring for higher precision.

| Rule | File | Key Pattern |
|------|------|-------------|
| Cross-Encoder | rules/reranking-cross-encoder.md | ms-marco-MiniLM (~50ms, free) |
| LLM Reranking | rules/reranking-llm.md | Batch scoring + Cohere API |
| Combined | rules/reranking-combined.md | Multi-signal weighted scoring |
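
The retrieve-wide, rerank-narrow pattern can be sketched with a pluggable scorer. Here `term_overlap` is a toy stand-in so the example runs without a model download; in practice `score_fn` would wrap a cross-encoder or an LLM scoring call. Both function names are hypothetical.

```python
from typing import Callable


def rerank(
    query: str,
    candidates: list[str],
    score_fn: Callable[[str, str], float],
    top_n: int = 10,
) -> list[tuple[str, float]]:
    """Score every candidate against the query and keep the top_n highest."""
    scored = [(doc, score_fn(query, doc)) for doc in candidates]
    # Stable sort: candidates with equal scores keep their retrieval order
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


def term_overlap(query: str, doc: str) -> float:
    """Toy scorer (not a real cross-encoder): fraction of query terms present in doc."""
    q_terms = set(query.lower().split())
    if not q_terms:
        return 0.0
    return len(q_terms & set(doc.lower().split())) / len(q_terms)


# 50 hypothetical first-stage candidates, reranked down to 10
candidates = [f"note {i}" for i in range(50)]
top = rerank("note 7", candidates, term_overlap)
```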

PGVector

Production hybrid search with PostgreSQL.

| Rule | File | Key Pattern |
|------|------|-------------|
| Schema | rules/pgvector-schema.md | HNSW index + pre-computed tsvector |
| Hybrid Search | rules/pgvector-hybrid-search.md | SQLAlchemy RRF with FULL OUTER JOIN |
| Indexing | rules/pgvector-indexing.md | HNSW (17x faster) vs IVFFlat |
| Metadata | rules/pgvector-metadata.md | Filtering, boosting, Redis 8 comparison |

Quick Start Example

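    from openai import AsyncOpenAI

    client = AsyncOpenAI()  # async client, so the calls below can be awaited

    async def rag_query(question: str, top_k: int = 5) -> dict:
        """Basic RAG with citations."""
        # vector_db is a placeholder for your vector store client (not defined here)
        docs = await vector_db.search(question, limit=top_k)
        # Number each chunk so the model can cite it as [1], [2], ...
        context = "\n\n".join(f"[{i + 1}] {doc.text}" for i, doc in enumerate(docs))
        response = await client.chat.completions.create(
            model="gpt-4o-mini",  # assumed model name; substitute the chat model you use
            temperature=0.2,  # low temperature for factual answers
            messages=[
                {"role": "system", "content": "Answer with inline citations [1], [2]. Use ONLY provided context."},
                {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question}"},
            ],
        )
        return {
            "answer": response.choices[0].message.content,
            "sources": [d.metadata["source"] for d in docs],
        }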

Key Decisions

| Decision | Recommendation |
|----------|----------------|
| Embedding model | text-embedding-3-small (general), voyage-3 (production) |
| Chunk size | 256-1024 tokens (512 typical) |
| Hybrid weight | 40% BM25 / 60% vector |
| Top-k | 3-10 documents |
| Temperature | 0.1-0.3 (factual) |
| Context budget | 4K-8K tokens |
| Reranking | Retrieve 50, rerank to 10 |
| Vector index | HNSW (production), IVFFlat (high-volume) |
| HyDE timeout | 2-3 seconds with fallback |
| Query decomposition | Heuristic first, LLM only if multi-concept |
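
To make the context budget recommendation concrete, the sketch below packs documents in relevance order until the budget is reached. The function name `fit_to_budget` is hypothetical, and the default whitespace-based token counter is a rough assumption; pass your model's tokenizer as `count_tokens` for real limits.

```python
from typing import Callable


def fit_to_budget(
    docs: list[str],
    max_tokens: int = 4000,
    count_tokens: Callable[[str], int] = lambda text: len(text.split()),
) -> list[str]:
    """Keep the highest-ranked docs (input is best-first) that fit within max_tokens."""
    selected: list[str] = []
    used = 0
    for doc in docs:
        cost = count_tokens(doc)
        if used + cost > max_tokens:
            # Stop at the first doc that does not fit, preserving rank order
            break
        selected.append(doc)
        used += cost
    return selected
```

Stopping at the first overflow, instead of skipping ahead to smaller documents, keeps the selected context strictly in relevance order.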

Common Mistakes

- No citation tracking (unverifiable answers)
- Context too large (dilutes relevance)
- Single retrieval method (misses keyword matches)
- Not chunking long documents (context gets lost)
- Embedding queries differently than documents
- No fallback path in agentic RAG (workflow hangs)
- Infinite rewrite loops (no retry limit)
- Using wrong similarity metric (cosine vs euclidean)
- Not caching embeddings (recomputing unchanged content)
- Missing image captions in multimodal RAG (limits text search)
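
The embedding-cache mistake above can be avoided by keying the cache on a hash of the content, so unchanged text is never re-embedded. This sketch uses an in-memory dict as a stand-in for Redis; the class name `EmbeddingCache` and the `embed_fn` parameter are hypothetical.

```python
import hashlib
from typing import Callable


class EmbeddingCache:
    """Content-addressed cache: identical text is embedded only once."""

    def __init__(self, embed_fn: Callable[[str], list[float]]) -> None:
        self._embed_fn = embed_fn
        self._store: dict[str, list[float]] = {}  # stand-in for Redis (assumption)

    def get(self, text: str) -> list[float]:
        # Hash the content so the key changes only when the text changes
        key = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if key not in self._store:
            self._store[key] = self._embed_fn(text)
        return self._store[key]
```

In a real deployment the key would typically also include the embedding model name, so that switching models does not serve stale vectors.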

Evaluations

See test-cases.json for 30 test cases across all categories.

Related Skills

- ork:langgraph - LangGraph workflow patterns (for agentic RAG workflows)
- caching - Cache RAG responses for repeated queries
- ork:golden-dataset - Evaluate retrieval quality
- ork:llm-integration - Local embeddings with nomic-embed-text
- vision-language-models - Image analysis for multimodal RAG
- ork:database-patterns - Schema design for vector search

Capability Details

retrieval-patterns

Keywords: retrieval, context, chunks, relevance, rag

Solves:

- Retrieve relevant context for LLM
- Implement RAG pipeline with citations
- Optimize retrieval quality

hybrid-search

Keywords: hybrid, bm25, vector, fusion, rrf

Solves:

- Combine keyword and semantic search
- Implement reciprocal rank fusion
- Balance precision and recall

embeddings

Keywords: embedding, text to vector, vectorize, chunk, similarity

Solves:

- Convert text to vector embeddings
- Choose embedding models and dimensions
- Implement chunking strategies

contextual-retrieval

Keywords: contextual, anthropic, context-prepend, bm25

Solves:

- Prepend context to chunks for better retrieval
- Reduce retrieval failures by 67%
- Implement hybrid BM25+vector search

hyde

Keywords: hyde, hypothetical, vocabulary mismatch

Solves:

- Bridge vocabulary gaps in semantic search
- Generate hypothetical documents for embedding
- Handle abstract or conceptual queries

agentic-rag

Keywords: self-rag, crag, corrective, adaptive, grading

Solves:

- Build self-correcting RAG workflows
- Grade document relevance
- Implement web search fallback

multimodal-rag

Keywords: multimodal, image, clip, vision, pdf

Solves:

- Build RAG with images and text
- Cross-modal search (text → image)
- Process PDFs with mixed content

query-decomposition

Keywords: decompose, multi-concept, complex query

Solves:

- Break complex queries into concepts
- Parallel retrieval per concept
- Improve coverage for compound questions

reranking

Keywords: rerank, cross-encoder, precision, scoring

Solves:

- Improve search precision post-retrieval
- Score relevance with cross-encoder or LLM
- Combine multiple scoring signals

pgvector-search

Keywords: pgvector, postgresql, hnsw, tsvector, hybrid

Solves:

- Production hybrid search with PostgreSQL
- HNSW vs IVFFlat index selection
- SQL-based RRF fusion
