SKILL.md

# Outlines: Structured Text Generation

## When to Use This Skill

Use Outlines when you need to:

- Guarantee valid JSON/XML/code structure during generation
- Use Pydantic models for type-safe outputs
- Support local models (Transformers, llama.cpp, vLLM)
- Maximize inference speed with zero-overhead structured generation
- Generate against JSON schemas automatically
- Control token sampling at the grammar level

GitHub Stars: 8,000+ | From: dottxt.ai (formerly .txt)

## Installation

    # Base installation
    pip install outlines

    # With specific backends
    pip install outlines transformers      # Hugging Face models
    pip install outlines llama-cpp-python  # llama.cpp
    pip install outlines vllm              # vLLM for high-throughput

## Quick Start

### Basic Example: Classification

    import outlines
    from typing import Literal

    # Load model
    model = outlines.models.transformers("microsoft/Phi-3-mini-4k-instruct")

    # Generate with type constraint
    prompt = "Sentiment of 'This product is amazing!': "
    generator = outlines.generate.choice(model, ["positive", "negative", "neutral"])
    sentiment = generator(prompt)

    print(sentiment)  # e.g. "positive" (always one of the three choices)


### With Pydantic Models

    from pydantic import BaseModel
    import outlines

    class User(BaseModel):
        name: str
        age: int
        email: str

    model = outlines.models.transformers("microsoft/Phi-3-mini-4k-instruct")

    # Generate structured output
    prompt = "Extract user: John Doe, 30 years old, john@example.com"
    generator = outlines.generate.json(model, User)
    user = generator(prompt)

    print(user.name)   # e.g. "John Doe"
    print(user.age)    # e.g. 30
    print(user.email)  # e.g. "john@example.com"


## Core Concepts

### 1. Constrained Token Sampling

Outlines uses finite state machines (FSMs) to constrain token generation at the logit level.

**How it works:**

- Convert the schema (JSON Schema or Pydantic model) to a regular expression; a regex constraint is used as-is
- Compile the regular expression into a finite state machine (FSM) over the model's vocabulary
- Mask invalid tokens at each step during generation
- Fast-forward when only one valid token exists

**Benefits:**

- **Low overhead**: The valid tokens for each FSM state are computed ahead of time, so each generation step is a lookup plus a logit mask
- **Speed improvement**: Fast-forward through deterministic paths
- **Guaranteed validity**: Tokens that would break the structure are never sampled, provided generation runs to completion

For example, with a Pydantic model:

    import outlines
    from pydantic import BaseModel

    # Pydantic model -> JSON schema -> regex -> FSM
    class Person(BaseModel):
        name: str
        age: int

    model = outlines.models.transformers("microsoft/Phi-3-mini-4k-instruct")

    # Behind the scenes:
    # 1. Person -> JSON schema
    # 2. JSON schema -> regex
    # 3. Regex -> FSM
    # 4. FSM filters tokens during generation

    generator = outlines.generate.json(model, Person)
    result = generator("Generate person: Alice, 25")
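
The token filtering and fast-forward steps described in this section can be illustrated with a toy example. This is a minimal sketch and not Outlines' implementation: the hand-written DFA in `step`, the tiny `VOCAB`, and the `scores` dictionary standing in for model logits are all hypothetical.

```python
from typing import Optional

DIGITS = "0123456789"
ACCEPT = 6  # state reached after matching [0-9]{3}-[0-9]{2}

def step(state: int, ch: str) -> Optional[int]:
    """Hand-written DFA transition; None means the character is not allowed."""
    if state in (0, 1, 2, 4, 5) and ch in DIGITS:
        return state + 1
    if state == 3 and ch == "-":
        return 4
    return None

# Hypothetical mini-vocabulary; real tokenizers are far larger
VOCAB = ["1", "23", "555", "-", "42", "abc", " "]

def advance(state: int, token: str) -> Optional[int]:
    """Feed a whole token through the DFA, one character at a time."""
    current: Optional[int] = state
    for ch in token:
        current = step(current, ch)
        if current is None:
            return None
    return current

def allowed_tokens(state: int) -> list[str]:
    """Tokens that keep the output on a valid path from this state."""
    return [tok for tok in VOCAB if advance(state, tok) is not None]

def generate(scores: dict[str, float]) -> tuple[str, int]:
    """Greedy decoding restricted to allowed tokens; counts forced steps."""
    state, text, forced = 0, "", 0
    while state != ACCEPT:
        allowed = allowed_tokens(state)
        if len(allowed) == 1:
            token = allowed[0]  # fast-forward: no scoring needed
            forced += 1
        else:
            token = max(allowed, key=lambda tok: scores.get(tok, 0.0))
        text += token
        state = advance(state, token)
    return text, forced

text, forced = generate({"abc": 9.0, "555": 2.0, "42": 0.5})
print(text, forced)  # 555-42 1
```

Although `"abc"` has the highest score, it is never a candidate because it is masked out before selection; the `-` separator is emitted without consulting the scores at all.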


### 2. Structured Generators

Outlines provides specialized generators for different output types.

#### Choice Generator

    # Multiple choice selection
    generator = outlines.generate.choice(
        model,
        ["positive", "negative", "neutral"]
    )

    sentiment = generator("Review: This is great!")
    # Result: One of the three choices


#### JSON Generator

    from pydantic import BaseModel

    class Product(BaseModel):
        name: str
        price: float
        in_stock: bool

    # Generate valid JSON matching schema
    generator = outlines.generate.json(model, Product)
    product = generator("Extract: iPhone 15, $999, available")

    # Guaranteed valid Product instance
    print(type(product))  # <class '__main__.Product'>


#### Regex Generator

    # Generate text matching regex
    generator = outlines.generate.regex(
        model,
        r"[0-9]{3}-[0-9]{3}-[0-9]{4}"  # Phone number pattern
    )

    phone = generator("Generate phone number:")
    # Result: e.g. "555-123-4567" (guaranteed to match pattern)
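
Before handing a pattern to the regex generator, it can help to sanity-check it against a few known-good and known-bad strings using Python's standard `re` module. The sketch below assumes the generated text must match the whole pattern, which is why it uses `re.fullmatch`; `matches_pattern` and the sample strings are made up for illustration.

```python
import re

PHONE_PATTERN = r"[0-9]{3}-[0-9]{3}-[0-9]{4}"

def matches_pattern(text: str, pattern: str = PHONE_PATTERN) -> bool:
    """Return True only if the entire string matches the pattern."""
    return re.fullmatch(pattern, text) is not None

print(matches_pattern("555-123-4567"))       # True
print(matches_pattern("555-123-45678"))      # False: one digit too many
print(matches_pattern("call 555-123-4567"))  # False: extra leading text
```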


#### Integer/Float Generators

    # Generate specific numeric types
    int_generator = outlines.generate.format(model, int)
    age = int_generator("Person's age:")  # Guaranteed integer

    float_generator = outlines.generate.format(model, float)
    price = float_generator("Product price:")  # Guaranteed float


### 3. Model Backends

Outlines supports multiple local and API-based backends.

#### Transformers (Hugging Face)

    import outlines

    # Load from Hugging Face
    model = outlines.models.transformers(
        "microsoft/Phi-3-mini-4k-instruct",
        device="cuda"  # Or "cpu"
    )

    # Use with any generator
    generator = outlines.generate.json(model, YourModel)


#### llama.cpp

    # Load GGUF model
    model = outlines.models.llamacpp(
        "./models/llama-3.1-8b-instruct.Q4_K_M.gguf",
        n_gpu_layers=35
    )

    generator = outlines.generate.json(model, YourModel)


#### vLLM (High Throughput)

    # For production deployments
    model = outlines.models.vllm(
        "meta-llama/Llama-3.1-8B-Instruct",
        tensor_parallel_size=2  # Multi-GPU
    )

    generator = outlines.generate.json(model, YourModel)


#### OpenAI (Limited Support)

    # Basic OpenAI support
    model = outlines.models.openai(
        "gpt-4o-mini",
        api_key="your-api-key"
    )

    # Note: Some features limited with API models
    generator = outlines.generate.json(model, YourModel)


### 4. Pydantic Integration

Outlines has first-class Pydantic support with automatic schema translation.

#### Basic Models

    import outlines
    from pydantic import BaseModel, Field

    class Article(BaseModel):
        title: str = Field(description="Article title")
        author: str = Field(description="Author name")
        word_count: int = Field(description="Number of words", gt=0)
        tags: list[str] = Field(description="List of tags")

    model = outlines.models.transformers("microsoft/Phi-3-mini-4k-instruct")
    generator = outlines.generate.json(model, Article)

    article = generator("Generate article about AI")
    print(article.title)
    print(article.word_count)  # Guaranteed > 0


#### Nested Models

    class Address(BaseModel):
        street: str
        city: str
        country: str

    class Person(BaseModel):
        name: str
        age: int
        address: Address  # Nested model

    generator = outlines.generate.json(model, Person)
    person = generator("Generate person in New York")

    print(person.address.city)  # e.g. "New York"


#### Enums and Literals

    from enum import Enum
    from typing import Literal

    class Status(str, Enum):
        PENDING = "pending"
        APPROVED = "approved"
        REJECTED = "rejected"

    class Application(BaseModel):
        applicant: str
        status: Status  # Must be one of enum values
        priority: Literal["low", "medium", "high"]  # Must be one of literals

    generator = outlines.generate.json(model, Application)
    app = generator("Generate application")

    print(app.status)  # Status.PENDING (or APPROVED/REJECTED)
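
The `str` mixin on `Status` is what lets enum members compare equal to their plain string values, which is convenient when a generated value is later compared or serialized. A small standard-library sketch of that behaviour, with no model involved:

```python
from enum import Enum

class Status(str, Enum):
    PENDING = "pending"
    APPROVED = "approved"
    REJECTED = "rejected"

# Look up a member from the raw string a generator would emit
status = Status("approved")
print(status is Status.APPROVED)  # True
print(status == "approved")       # True, because Status subclasses str
print([s.value for s in Status])  # ['pending', 'approved', 'rejected']

# Values outside the enum raise ValueError
try:
    Status("unknown")
except ValueError as exc:
    print(f"rejected: {exc}")
```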


## Common Patterns

### Pattern 1: Data Extraction

    from pydantic import BaseModel
    import outlines

    class CompanyInfo(BaseModel):
        name: str
        founded_year: int
        industry: str
        employees: int

    model = outlines.models.transformers("microsoft/Phi-3-mini-4k-instruct")
    generator = outlines.generate.json(model, CompanyInfo)

    text = """
    Apple Inc. was founded in 1976 in the technology industry.
    The company employs approximately 164,000 people worldwide.
    """

    prompt = f"Extract company information:\n{text}\n\nCompany:"
    company = generator(prompt)

    print(f"Name: {company.name}")
    print(f"Founded: {company.founded_year}")
    print(f"Industry: {company.industry}")
    print(f"Employees: {company.employees}")


### Pattern 2: Classification

    from typing import Literal
    from pydantic import BaseModel
    import outlines

    model = outlines.models.transformers("microsoft/Phi-3-mini-4k-instruct")

    # Binary classification
    generator = outlines.generate.choice(model, ["spam", "not_spam"])
    result = generator("Email: Buy now! 50% off!")

    # Multi-class classification
    categories = ["technology", "business", "sports", "entertainment"]
    category_gen = outlines.generate.choice(model, categories)
    category = category_gen("Article: Apple announces new iPhone...")

    # With confidence
    class Classification(BaseModel):
        label: Literal["positive", "negative", "neutral"]
        confidence: float

    classifier = outlines.generate.json(model, Classification)
    result = classifier("Review: This product is okay, nothing special")


### Pattern 3: Structured Forms

    from pydantic import BaseModel
    import outlines

    class UserProfile(BaseModel):
        full_name: str
        age: int
        email: str
        phone: str
        country: str
        interests: list[str]

    model = outlines.models.transformers("microsoft/Phi-3-mini-4k-instruct")
    generator = outlines.generate.json(model, UserProfile)

    prompt = """
    Extract user profile from:
    Name: Alice Johnson
    Age: 28
    Email: alice@example.com
    Phone: 555-0123
    Country: USA
    Interests: hiking, photography, cooking
    """

    profile = generator(prompt)
    print(profile.full_name)
    print(profile.interests)  # e.g. ["hiking", "photography", "cooking"]


### Pattern 4: Multi-Entity Extraction

    from typing import Literal
    from pydantic import BaseModel
    import outlines

    class Entity(BaseModel):
        name: str
        type: Literal["PERSON", "ORGANIZATION", "LOCATION"]

    class DocumentEntities(BaseModel):
        entities: list[Entity]

    model = outlines.models.transformers("microsoft/Phi-3-mini-4k-instruct")
    generator = outlines.generate.json(model, DocumentEntities)

    text = "Tim Cook met with Satya Nadella at Microsoft headquarters in Redmond."
    prompt = f"Extract entities from: {text}"

    result = generator(prompt)
    for entity in result.entities:
        print(f"{entity.name} ({entity.type})")


### Pattern 5: Code Generation

    from pydantic import BaseModel
    import outlines

    class PythonFunction(BaseModel):
        function_name: str
        parameters: list[str]
        docstring: str
        body: str

    model = outlines.models.transformers("microsoft/Phi-3-mini-4k-instruct")
    generator = outlines.generate.json(model, PythonFunction)

    prompt = "Generate a Python function to calculate factorial"
    func = generator(prompt)

    print(f"def {func.function_name}({', '.join(func.parameters)}):")
    print(f'    """{func.docstring}"""')
    print(f"    {func.body}")
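
The `print` calls above only indent the first line of `body`, so a multi-line body would not render as valid Python. One way to handle that, sketched here with the standard library, is to indent the whole body with `textwrap.indent` and confirm the result parses with `ast.parse`. The `FunctionSpec` dataclass is a hypothetical stand-in for the generated `PythonFunction` instance, and the factorial body is hand-written sample data, not model output.

```python
import ast
import textwrap
from dataclasses import dataclass

@dataclass
class FunctionSpec:
    """Stand-in with the same fields as PythonFunction."""
    function_name: str
    parameters: list[str]
    docstring: str
    body: str

def render_function(spec: FunctionSpec) -> str:
    """Assemble source text, indenting every line of the body."""
    header = f"def {spec.function_name}({', '.join(spec.parameters)}):"
    doc = textwrap.indent(f'"""{spec.docstring}"""', "    ")
    body = textwrap.indent(textwrap.dedent(spec.body).strip("\n"), "    ")
    return "\n".join([header, doc, body]) + "\n"

def is_valid_python(source: str) -> bool:
    """Check syntax only; this does not run or vet the code."""
    try:
        ast.parse(source)
    except SyntaxError:
        return False
    return True

spec = FunctionSpec(
    function_name="factorial",
    parameters=["n"],
    docstring="Return n! for a non-negative integer n.",
    body="result = 1\nfor i in range(2, n + 1):\n    result *= i\nreturn result",
)
source = render_function(spec)
print(source)
print(is_valid_python(source))  # True
```

A syntax check says nothing about whether the generated function is correct or safe to execute, so it is best treated as a first filter.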


### Pattern 6: Batch Processing

    import outlines
    from pydantic import BaseModel

    def batch_extract(texts: list[str], schema: type[BaseModel]):
        """Extract structured data from multiple texts."""
        model = outlines.models.transformers("microsoft/Phi-3-mini-4k-instruct")
        generator = outlines.generate.json(model, schema)

        results = []
        for text in texts:
            result = generator(f"Extract from: {text}")
            results.append(result)

        return results

    class Person(BaseModel):
        name: str
        age: int

    texts = [
        "John is 30 years old",
        "Alice is 25 years old",
        "Bob is 40 years old"
    ]

    people = batch_extract(texts, Person)
    for person in people:
        print(f"{person.name}: {person.age}")
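
In `batch_extract` above, the model is loaded on every call, which is usually the slow part. A possible variation is to build the generator once and pass it in. The sketch below accepts any callable as the generator so it can be exercised without a model; `extract_all` and `fake_generator` are hypothetical names, and the dict results merely stand in for schema instances.

```python
from typing import Any, Callable, Iterable

def extract_all(
    generator: Callable[[str], Any],
    texts: Iterable[str],
    template: str = "Extract from: {text}",
) -> list[tuple[str, Any]]:
    """Run one pre-built generator over many texts, keeping failures visible."""
    results = []
    for text in texts:
        try:
            results.append((text, generator(template.format(text=text))))
        except Exception as exc:  # record the error and continue with the batch
            results.append((text, exc))
    return results

# Stand-in for a real structured generator (assumption for this sketch)
def fake_generator(prompt: str) -> dict:
    name, _, rest = prompt.removeprefix("Extract from: ").partition(" is ")
    return {"name": name, "age": int(rest.split()[0])}

texts = ["John is 30 years old", "Alice is 25 years old"]
for text, result in extract_all(fake_generator, texts):
    print(text, "->", result)
```

With a real model, the generator would be created once and then passed as the first argument.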


## Backend Configuration

### Transformers

    import outlines

    # Basic usage
    model = outlines.models.transformers("microsoft/Phi-3-mini-4k-instruct")

    # GPU configuration
    model = outlines.models.transformers(
        "microsoft/Phi-3-mini-4k-instruct",
        device="cuda",
        model_kwargs={"torch_dtype": "float16"}
    )

    # Popular models
    model = outlines.models.transformers("meta-llama/Llama-3.1-8B-Instruct")
    model = outlines.models.transformers("mistralai/Mistral-7B-Instruct-v0.3")
    model = outlines.models.transformers("Qwen/Qwen2.5-7B-Instruct")


### llama.cpp

    # Load GGUF model
    model = outlines.models.llamacpp(
        "./models/llama-3.1-8b.Q4_K_M.gguf",
        n_ctx=4096,       # Context window
        n_gpu_layers=35,  # GPU layers
        n_threads=8       # CPU threads
    )

    # Full GPU offload
    model = outlines.models.llamacpp(
        "./models/model.gguf",
        n_gpu_layers=-1  # All layers on GPU
    )


### vLLM (Production)

    # Single GPU
    model = outlines.models.vllm("meta-llama/Llama-3.1-8B-Instruct")

    # Multi-GPU
    model = outlines.models.vllm(
        "meta-llama/Llama-3.1-70B-Instruct",
        tensor_parallel_size=4  # 4 GPUs
    )

    # With quantization
    model = outlines.models.vllm(
        "meta-llama/Llama-3.1-8B-Instruct",
        quantization="awq"  # Or "gptq"
    )


## Best Practices

### 1. Use Specific Types

    # ✅ Good: Specific types
    class Product(BaseModel):
        price: float   # Not str
        quantity: int  # Not str
        in_stock: bool # Not str

    # ❌ Bad: Everything as string
    class Product(BaseModel):
        price: str     # Should be float
        quantity: str  # Should be int


### 2. Add Constraints

    from pydantic import BaseModel, Field

    # ✅ Good: With constraints
    class User(BaseModel):
        age: int = Field(ge=0, le=120)
        email: str = Field(pattern=r"^[\w\.-]+@[\w\.-]+\.\w+$")

    # ❌ Bad: No constraints
    class User(BaseModel):
        age: int
        email: str


### 3. Use Enums for Categories

    from enum import Enum
    from pydantic import BaseModel

    # ✅ Good: Enum for fixed set
    class Priority(str, Enum):
        LOW = "low"
        MEDIUM = "medium"
        HIGH = "high"

    class Task(BaseModel):
        title: str
        priority: Priority

    # ❌ Bad: Free-form string
    class Task(BaseModel):
        title: str
        priority: str  # Can be anything


### 4. Provide Context in Prompts

    # ✅ Good: Clear context
    prompt = """
    Extract product information from the following text.

    Text: iPhone 15 Pro costs $999 and is currently in stock.

    Product:
    """

    # ❌ Bad: Minimal context
    prompt = "iPhone 15 Pro costs $999 and is currently in stock."
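
The clear-context prompt above follows a simple shape: an instruction, the source text, then a cue for the answer. A small helper can keep that shape consistent across calls. `build_extraction_prompt` and its parameters are hypothetical names used only for illustration.

```python
def build_extraction_prompt(text: str, target: str = "Product") -> str:
    """Assemble an instruction, the source text, and an answer cue."""
    return (
        f"Extract {target.lower()} information from the following text.\n\n"
        f"Text: {text.strip()}\n\n"
        f"{target}:"
    )

prompt = build_extraction_prompt(
    "iPhone 15 Pro costs $999 and is currently in stock."
)
print(prompt)
```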


### 5. Handle Optional Fields

    from typing import Optional
    from pydantic import BaseModel

    # ✅ Good: Optional fields for incomplete data
    class Article(BaseModel):
        title: str                    # Required
        author: Optional[str] = None  # Optional
        date: Optional[str] = None    # Optional
        tags: list[str] = []          # Default empty list

    # Can succeed even if author/date missing
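
The effect of optional fields can be seen without a model. The sketch below uses a standard-library dataclass as a stand-in for the Pydantic `Article` model (an assumption made to keep the example dependency-free) and shows that a JSON object lacking `author` and `date` still yields a complete record. The JSON string is sample data, not model output.

```python
import json
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class ArticleRecord:
    title: str                       # Required
    author: Optional[str] = None     # Falls back to None when absent
    date: Optional[str] = None       # Falls back to None when absent
    tags: list[str] = field(default_factory=list)

raw = '{"title": "Structured generation in practice", "tags": ["llm"]}'
article = ArticleRecord(**json.loads(raw))

print(article.title)
print(article.author)  # None
print(article.tags)    # ['llm']
```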
