LangChain - Build LLM Applications with Agents & RAG

A widely used open-source framework for building LLM-powered applications.

When to use LangChain

Use LangChain when:

Building agents with tool calling and reasoning (ReAct pattern)

Implementing RAG (retrieval-augmented generation) pipelines

Swapping LLM providers easily (OpenAI, Anthropic, Google)

Creating chatbots with conversation memory

Rapid prototyping of LLM applications

Production deployments with LangSmith observability

Metrics:

119,000+ GitHub stars

272,000+ repositories use LangChain

500+ integrations (models, vector stores, tools)

3,800+ contributors

Use alternatives instead:

LlamaIndex: RAG-focused, better for document Q&A

LangGraph: Complex stateful workflows, more control

Haystack: Production search pipelines

Semantic Kernel: Microsoft ecosystem

Quick start

Installation

# Core library (Python 3.10+)

pip install -U langchain

# With OpenAI

pip install langchain-openai

# With Anthropic

pip install langchain-anthropic

# Common extras

pip install langchain-community  # 500+ integrations

pip install langchain-chroma     # Vector store

Basic LLM usage

from langchain_anthropic import ChatAnthropic

# Initialize model

llm = ChatAnthropic(model="claude-sonnet-4-5-20250929")

# Simple completion

response = llm.invoke("Explain quantum computing in 2 sentences")

print(response.content)

Create an agent (ReAct pattern)

from langchain.agents import create_agent

from langchain_anthropic import ChatAnthropic

# Define tools

def get_weather(city: str) -> str:

    """Get current weather for a city."""

    return f"It's sunny in {city}, 72°F"

def search_web(query: str) -> str:

    """Search the web for information."""

    return f"Search results for: {query}"

# Create agent (<10 lines!)

agent = create_agent(

    model=ChatAnthropic(model="claude-sonnet-4-5-20250929"),

    tools=[get_weather, search_web],

    system_prompt="You are a helpful assistant. Use tools when needed."

)

# Run agent

result = agent.invoke({"messages": [{"role": "user", "content": "What's the weather in Paris?"}]})

print(result["messages"][-1].content)
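
Conceptually, an agent like the one above runs a loop: the model either requests a tool or gives a final answer, and each tool result is fed back as a new message. The sketch below illustrates that loop with the standard library only. It is not LangChain's implementation; `scripted_model`, `TOOLS`, and `run_agent` are hypothetical names, and the "model" is a hard-coded stand-in rather than a real LLM.

```python
from typing import Callable


def get_weather(city: str) -> str:
    """Toy tool: returns a canned weather string."""
    return f"It's sunny in {city}, 72°F"


# Registry mapping tool names to callables (hypothetical structure).
TOOLS: dict[str, Callable[[str], str]] = {"get_weather": get_weather}


def scripted_model(messages: list[dict]) -> dict:
    """Stand-in for an LLM: requests a tool once, then answers from the result."""
    last = messages[-1]
    if last["role"] == "tool":
        return {"role": "assistant", "content": f"Final answer: {last['content']}"}
    return {"role": "assistant", "tool": "get_weather", "args": "Paris"}


def run_agent(question: str, max_steps: int = 5) -> str:
    """Loop: model decides, tool runs, observation is appended, repeat."""
    messages = [{"role": "user", "content": question}]
    for _ in range(max_steps):
        reply = scripted_model(messages)
        messages.append(reply)
        if "tool" not in reply:  # no tool requested, so this is the answer
            return reply["content"]
        observation = TOOLS[reply["tool"]](reply["args"])
        messages.append({"role": "tool", "content": observation})
    return "Stopped: step limit reached"


answer = run_agent("What's the weather in Paris?")
print(answer)
```

The `max_steps` cap is a design choice worth keeping in any real agent loop, since a model that keeps requesting tools would otherwise never terminate.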

Core concepts

1. Models - LLM abstraction

from langchain_openai import ChatOpenAI

from langchain_anthropic import ChatAnthropic

from langchain_google_genai import ChatGoogleGenerativeAI

# Swap providers by changing one line (each assignment below is an alternative)

llm = ChatOpenAI(model="gpt-4o")

llm = ChatAnthropic(model="claude-sonnet-4-5-20250929")

llm = ChatGoogleGenerativeAI(model="gemini-2.0-flash-exp")

# Streaming

for chunk in llm.stream("Write a poem"):

    print(chunk.content, end="", flush=True)

2. Chains - Sequential operations

from langchain_core.prompts import PromptTemplate

# Define prompt template

prompt = PromptTemplate.from_template("Write a 3-sentence summary about {topic}")

# Create chain: the formatted prompt is piped into the model (LLMChain is deprecated)

chain = prompt | llm

# Run chain

result = chain.invoke({"topic": "machine learning"})

print(result.content)
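
A chain is a sequence of steps where each step's output becomes the next step's input. The following standard-library sketch shows only that idea; `make_chain`, `format_prompt`, and `fake_llm` are hypothetical names, and `fake_llm` merely echoes its prompt instead of calling a model.

```python
from functools import reduce
from typing import Any, Callable

Step = Callable[[Any], Any]


def make_chain(*steps: Step) -> Step:
    """Compose steps left to right: each output is the next step's input."""
    return lambda value: reduce(lambda acc, step: step(acc), steps, value)


def format_prompt(inputs: dict) -> str:
    """Fill the template with the caller's variables."""
    return "Write a 3-sentence summary about {topic}".format(**inputs)


def fake_llm(prompt: str) -> str:
    """Stand-in for a model call; it only echoes the prompt."""
    return f"[model output for: {prompt}]"


chain = make_chain(format_prompt, fake_llm, str.strip)
result = chain({"topic": "machine learning"})
print(result)
```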

3. Agents - Tool-using reasoning

ReAct (Reasoning + Acting) pattern:

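# Legacy AgentExecutor API (langchain 0.3.x; on 1.x import these from langchain_classic.agents)

from langchain.agents import create_tool_calling_agent, AgentExecutor

from langchain_core.prompts import ChatPromptTemplate

from langchain_core.tools import Tool, tool

# Define custom tool (eval is unsafe on untrusted input; demo only)

calculator = Tool(

    name="Calculator",

    func=lambda x: eval(x),

    description="Useful for math calculations. Input: valid Python expression."

)

# Wrap the plain search_web function from the quick start as a tool

tools = [calculator, tool(search_web)]

# The prompt must be a chat prompt with an agent_scratchpad placeholder, not a string

prompt = ChatPromptTemplate.from_messages([

    ("system", "Answer questions using available tools"),

    ("human", "{input}"),

    ("placeholder", "{agent_scratchpad}"),

])

# Create agent with tools

agent = create_tool_calling_agent(llm=llm, tools=tools, prompt=prompt)

# Create executor with the same tools the agent was built with

agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True)

# Run with reasoning

result = agent_executor.invoke({"input": "What is 25 * 17 + 142?"})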
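
A calculator tool receives text written by the model, so passing that text to `eval` can execute arbitrary code. One possible alternative, sketched here with the standard library, is to parse the expression with `ast` and allow only arithmetic nodes. `safe_calculate` is a hypothetical helper, not part of LangChain, and the set of allowed operators is an assumption you may want to adjust.

```python
import ast
import operator

# Only these arithmetic operators are accepted; everything else is rejected.
_BINARY = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Mod: operator.mod,
}
_UNARY = {ast.UAdd: operator.pos, ast.USub: operator.neg}


def safe_calculate(expression: str) -> float:
    """Evaluate a plain arithmetic expression without calling eval()."""

    def _eval(node: ast.AST):
        if isinstance(node, ast.Constant) and type(node.value) in (int, float):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY:
            return _BINARY[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY:
            return _UNARY[type(node.op)](_eval(node.operand))
        # Names, calls, attribute access, etc. all end up here.
        raise ValueError(f"Unsupported expression: {expression!r}")

    return _eval(ast.parse(expression, mode="eval").body)


print(safe_calculate("25 * 17 + 142"))
```

The function can be passed as `func=safe_calculate` when defining the tool. Exponentiation is left out on purpose, because very large powers can consume unbounded time and memory.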

4. Memory - Conversation history

from langchain.memory import ConversationBufferMemory

from langchain.chains import ConversationChain

# Add memory to track conversation

memory = ConversationBufferMemory()

conversation = ConversationChain(

    llm=llm,

    memory=memory,

    verbose=True

)

# Multi-turn conversation

conversation.predict(input="Hi, I'm Alice")

conversation.predict(input="What's my name?")  # Remembers "Alice"
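
Buffer memory works by storing every turn and replaying the stored turns as context on the next call; the model itself keeps no state. The sketch below shows that mechanism with the standard library. `BufferMemory`, `predict`, and `fake_llm` are hypothetical names, and `fake_llm` is a hard-coded stand-in that can only "remember" what appears in its prompt.

```python
class BufferMemory:
    """Keeps every turn and replays it as context for the next call."""

    def __init__(self) -> None:
        self.turns = []  # list of (role, text) pairs

    def as_prompt(self, new_input: str) -> str:
        """Build the full prompt: stored history followed by the new input."""
        history = "\n".join(f"{role}: {text}" for role, text in self.turns)
        return f"{history}\nHuman: {new_input}".strip()

    def save(self, user_text: str, ai_text: str) -> None:
        """Record one completed exchange."""
        self.turns.append(("Human", user_text))
        self.turns.append(("AI", ai_text))


def fake_llm(prompt: str) -> str:
    """Stand-in for a model: it only knows what the prompt contains."""
    if "What's my name?" in prompt and "I'm Alice" in prompt:
        return "Your name is Alice."
    return "Nice to meet you!"


def predict(memory: BufferMemory, user_text: str) -> str:
    """Call the model with history included, then store the new exchange."""
    reply = fake_llm(memory.as_prompt(user_text))
    memory.save(user_text, reply)
    return reply


memory = BufferMemory()
first = predict(memory, "Hi, I'm Alice")
second = predict(memory, "What's my name?")
print(first, second, sep="\n")
```

Because the whole history is resent each time, prompt size grows with every turn, which is why long conversations usually need trimming or summarizing.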

RAG (Retrieval-Augmented Generation)

Basic RAG pipeline

from langchain_community.document_loaders import WebBaseLoader

from langchain_text_splitters import RecursiveCharacterTextSplitter

from langchain_openai import OpenAIEmbeddings

from langchain_chroma import Chroma

from langchain.chains import RetrievalQA

# 1. Load documents

loader = WebBaseLoader("https://docs.python.org/3/tutorial/")

docs = loader.load()

# 2. Split into chunks

text_splitter = RecursiveCharacterTextSplitter(

    chunk_size=1000,

    chunk_overlap=200

)

splits = text_splitter.split_documents(docs)

# 3. Create embeddings and vector store

vectorstore = Chroma.from_documents(

    documents=splits,

    embedding=OpenAIEmbeddings()

)

# 4. Create retriever

retriever = vectorstore.as_retriever(search_kwargs={"k": 4})

# 5. Create QA chain

qa_chain = RetrievalQA.from_chain_type(

    llm=llm,

    retriever=retriever,

    return_source_documents=True

)

# 6. Query

result = qa_chain.invoke({"query": "What are Python decorators?"})

print(result["result"])

print(f"Sources: {result['source_documents']}")
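
The retrieval step in a pipeline like this ranks chunks by similarity to the query and keeps the top `k`. The sketch below illustrates that ranking with the standard library. It is an assumption-laden toy: `embed` uses word counts instead of a real embedding model, and `retrieve` is a hypothetical function, not how Chroma or any vector store is implemented.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: word counts. Real embeddings are dense model vectors."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[word] * b[word] for word in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, chunks: list[str], k: int = 4) -> list[str]:
    """Return the k chunks most similar to the query, best first."""
    query_vec = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(query_vec, embed(c)), reverse=True)
    return ranked[:k]


chunks = [
    "decorators wrap functions to extend behavior",
    "lists store ordered mutable items",
    "python decorators use the @ syntax",
    "dictionaries map keys to values",
]
top = retrieve("python decorators", chunks, k=2)
print(top)
```

The retrieved chunks are what gets placed into the prompt, so `k` trades answer coverage against prompt length and cost.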

Conversational RAG with memory

from langchain.chains import ConversationalRetrievalChain

# RAG with conversation memory

qa = ConversationalRetrievalChain.from_llm(

    llm=llm,

    retriever=retriever,

    memory=ConversationBufferMemory(

        memory_key="chat_history",

        return_messages=True

    )

)

# Multi-turn RAG

qa.invoke({"question": "What is Python used for?"})

qa.invoke({"question": "Can you elaborate on web development?"})  # Remembers context

Advanced agent patterns

Structured output

from pydantic import BaseModel, Field

# Define schema

class WeatherReport(BaseModel):

    city: str = Field(description="City name")

    temperature: float = Field(description="Temperature in Fahrenheit")

    condition: str = Field(description="Weather condition")

# Get structured response

structured_llm = llm.with_structured_output(WeatherReport)

result = structured_llm.invoke("What's the weather in SF? It's 65F and sunny")

print(result.city, result.temperature, result.condition)

Parallel tool execution

from langchain.agents import create_agent

# When the model requests several independent tool calls in one turn, the agent runs them in parallel

agent = create_agent(

    model=llm,

    tools=[get_weather, search_web, calculator]

)

# The model can request get_weather("Paris") and get_weather("London") in the same turn

result = agent.invoke({

    "messages": [{"role": "user", "content": "Compare weather in Paris and London"}]

})
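
Running independent tool calls in parallel means dispatching them concurrently and collecting the results in request order. A minimal standard-library sketch of that idea follows. It is not LangChain's internal code; `run_tool_calls` and the `(name, argument)` call format are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def get_weather(city: str) -> str:
    """Toy tool: returns a canned weather string."""
    return f"It's sunny in {city}, 72°F"


TOOLS = {"get_weather": get_weather}


def run_tool_calls(calls: list[tuple[str, str]]) -> list[str]:
    """Run each (tool_name, argument) pair in its own thread."""
    if not calls:
        return []
    with ThreadPoolExecutor(max_workers=len(calls)) as pool:
        futures = [pool.submit(TOOLS[name], arg) for name, arg in calls]
        # Iterating the futures list keeps results in request order.
        return [future.result() for future in futures]


results = run_tool_calls([("get_weather", "Paris"), ("get_weather", "London")])
print(results)
```

Threads suit this case because tool calls are typically I/O-bound (HTTP requests, database queries), so the total wait approaches that of the slowest call rather than the sum.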

Streaming agent execution

# Stream agent steps

for step in agent_executor.stream({"input": "Research AI trends"}):

    if "actions" in step:

        print(f"Tool: {step['actions'][0].tool}")

    if "output" in step:

        print(f"Output: {step['output']}")

Common patterns

Multi-document QA

from langchain_community.document_loaders import WebBaseLoader

from langchain.chains.qa_with_sources import load_qa_with_sources_chain

# Load multiple documents (load() takes no arguments; the URLs go to the loader)

loader = WebBaseLoader(["https://docs.python.org", "https://docs.numpy.org"])

docs = loader.load()

# QA with source citations

chain = load_qa_with_sources_chain(llm, chain_type="stuff")

result = chain.invoke({"input_documents": docs, "question": "How to use numpy arrays?"})

print(result["output_text"])  # Includes source citations

Custom tools with error handling

from langchain.tools import tool

@tool

def risky_operation(query: str) -> str:

    """Perform a risky operation that might fail."""

    try:

        # Your operation here (perform_operation is a placeholder for your own function)

        result = perform_operation(query)

        return f"Success: {result}"

    except Exception as e:

        return f"Error: {str(e)}"

# Agent handles errors gracefully

agent = create_agent(model=llm, tools=[risky_operation])
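
When several tools need the same try/except treatment, the pattern can be factored into a decorator. The sketch below is one way to do that with the standard library; `safe_tool` and `divide` are hypothetical names, and the "Success:"/"Error:" prefixes simply mirror the example above.

```python
import functools


def safe_tool(func):
    """Wrap a tool so exceptions come back as text the model can read."""

    @functools.wraps(func)  # keeps the name and docstring of the wrapped tool
    def wrapper(*args, **kwargs):
        try:
            return f"Success: {func(*args, **kwargs)}"
        except Exception as exc:
            return f"Error: {type(exc).__name__}: {exc}"

    return wrapper


@safe_tool
def divide(expression: str) -> float:
    """Divide two numbers given as text, for example '6/3'."""
    left, right = expression.split("/")
    return float(left) / float(right)


print(divide("6/3"))
print(divide("1/0"))
```

Returning the error as a string, rather than raising, lets the model see what went wrong and try a different input. Including the exception type gives it slightly more to work with than the message alone.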

LangSmith observability

import os

# Enable tracing

os.environ["LANGCHAIN_TRACING_V2"] = "true"

os.environ["LANGCHAIN_API_KEY"] = "your-api-key"

os.environ["LANGCHAIN_PROJECT"] = "my-project"

# All chains/agents automatically traced

agent = create_agent(model=llm, tools=[calculator])

result = agent.invoke({"messages": [{"role": "user", "content": "Calculate 123 * 456"}]})

# View traces at smith.langchain.com

Vector stores

Chroma (local)

from langchain_chroma import Chroma

vectorstore = Chroma.from_documents(

    documents=docs,

    embedding=OpenAIEmbeddings(),

    persist_directory="./chroma_db"

)

Pinecone (cloud)

from langchain_pinecone import PineconeVectorStore

vectorstore = PineconeVectorStore.from_documents(

    documents=docs,

    embedding=OpenAIEmbeddings(),

    index_name="my-index"

)

FAISS (similarity search)

from langchain_community.vectorstores import FAISS

vectorstore = FAISS.from_documents(docs, OpenAIEmbeddings())

vectorstore.save_local("faiss_index")

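# Load later (the index is stored with pickle, so only load files you created or trust)

vectorstore = FAISS.load_local("faiss_index", OpenAIEmbeddings(), allow_dangerous_deserialization=True)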

Document loaders

# Web pages

from langchain_community.document_loaders import WebBaseLoader

loader = WebBaseLoader("https://example.com")

# PDFs

from langchain_community.document_loaders import PyPDFLoader

loader = PyPDFLoader("paper.pdf")

# GitHub (reads the token from the GITHUB_PERSONAL_ACCESS_TOKEN environment variable)

from langchain_community.document_loaders import GithubFileLoader

loader = GithubFileLoader(repo="user/repo", branch="main", file_filter=lambda path: path.endswith(".py"))

# CSV

from langchain_community.document_loaders import CSVLoader

loader = CSVLoader("data.csv")

Text splitters

# Recursive (recommended for general text)

from langchain_text_splitters import RecursiveCharacterTextSplitter

splitter = RecursiveCharacterTextSplitter(

    chunk_size=1000,

    chunk_overlap=200,

    separators=["\n\n", "\n", " ", ""]

)

# Code-aware

from langchain_text_splitters import PythonCodeTextSplitter

splitter = PythonCodeTextSplitter(chunk_size=500)

# Semantic (by meaning)

from langchain_experimental.text_splitter import SemanticChunker

splitter = SemanticChunker(OpenAIEmbeddings())
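
To see what `chunk_size` and `chunk_overlap` mean, the sketch below uses a fixed-width sliding window. It is a deliberate simplification and not the algorithm of `RecursiveCharacterTextSplitter`, which also tries to break on separators; `split_with_overlap` is a hypothetical function.

```python
def split_with_overlap(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Cut text into windows of chunk_size that share chunk_overlap characters."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be >= 0 and smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):  # this window reached the end
            break
    return chunks


chunks = split_with_overlap("abcdefghij", chunk_size=4, chunk_overlap=2)
print(chunks)
```

Each chunk repeats the last `chunk_overlap` characters of the previous one, so a sentence cut at a boundary still appears whole in at least one chunk, at the cost of storing and embedding some text twice.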

Best practices

Start simple - Use create_agent() for most cases

Enable streaming - Better UX for long responses

Add error handling - Tools can fail, handle gracefully

Use LangSmith - Essential for debugging agents

Optimize chunk size - 500-1000 chars for RAG

Version prompts - Track changes in production

Cache embeddings - Expensive, cache when possible

Monitor costs - Track token usage with LangSmith
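
As an illustration of the "cache embeddings" advice, the sketch below memoizes embedding calls by a hash of the input text, so repeated texts are embedded once. It uses only the standard library. `CachedEmbedder` and `fake_embed` are hypothetical, the cache lives in memory only, and `fake_embed` stands in for a real (paid) embedding call.

```python
import hashlib


class CachedEmbedder:
    """Memoize an embedding function, keyed by a hash of the input text."""

    def __init__(self, embed_fn):
        self._embed_fn = embed_fn
        self._cache = {}
        self.misses = 0  # number of times the underlying function was called

    def embed(self, text: str) -> list[float]:
        key = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if key not in self._cache:
            self.misses += 1
            self._cache[key] = self._embed_fn(text)
        return self._cache[key]


def fake_embed(text: str) -> list[float]:
    """Stand-in for an embedding API call."""
    return [float(len(text)), float(text.count(" "))]


embedder = CachedEmbedder(fake_embed)
v1 = embedder.embed("hello world")
v2 = embedder.embed("hello world")  # served from the cache
v3 = embedder.embed("another text")
print(embedder.misses)
```

For caching across runs, the same keying scheme can back a file or database store instead of a dictionary.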

Performance benchmarks

| Operation | Latency | Notes |
| --- | --- | --- |
| Simple LLM call | ~1-2s | Depends on provider |
| Agent with 1 tool | ~3-5s | ReAct reasoning overhead |
| RAG retrieval | ~0.5-1s | Vector search + LLM |
| Embedding 1000 docs | ~10-30s | Depends on model |
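
These figures are rough and vary by provider, model, and network, so it is worth measuring your own setup. The helper below is a minimal sketch using the standard library; `measure_latency` is a hypothetical function, and `fake_llm_call` sleeps briefly in place of a real request.

```python
import time
from statistics import median


def measure_latency(fn, *args, repeats: int = 5, **kwargs) -> float:
    """Return the median wall-clock seconds over `repeats` calls to fn."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn(*args, **kwargs)
        samples.append(time.perf_counter() - start)
    return median(samples)


def fake_llm_call(prompt: str) -> str:
    """Stand-in for a network call to a model."""
    time.sleep(0.01)
    return "ok"


latency = measure_latency(fake_llm_call, "hi", repeats=3)
print(f"median latency: {latency:.3f}s")
```

The median is used instead of the mean so that a single slow request, such as a cold start, does not dominate the result.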

LangChain vs LangGraph

| Feature | LangChain | LangGraph |
| --- | --- | --- |
| Best for | Quick agents, RAG | Complex workflows |
| Abstraction level | High | Low |
| Code to start | <10 lines | ~30 lines |
| Control | Simple | Full control |
| Stateful workflows | Limited | Native |
| Cyclic graphs | | Yes |
| Human-in-loop | Basic | Advanced |

Use LangGraph when:

Building stateful workflows with cycles

Needing fine-grained control over execution

Building multi-agent systems

Production apps with complex logic

References

Agents Guide - ReAct, tool calling, streaming

RAG Guide - Document loaders, retrievers, QA chains

Integration Guide - Vector stores, LangSmith, deployment

Resources

GitHub: https://github.com/langchain-ai/langchain ⭐ 119,000+

Docs: https://docs.langchain.com

API Reference: https://reference.langchain.com/python

LangSmith: https://smith.langchain.com (observability)

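Version: 1.0+ for create_agent; the chain, memory, and AgentExecutor examples use the legacy 0.3.x API (packaged as langchain-classic on 1.x)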

License: MIT
