faiss

Facebook's library for efficient similarity search and clustering of dense vectors. Supports billions of vectors, GPU acceleration, and various index types…

INSTALLATION
npx skills add https://github.com/davila7/claude-code-templates --skill faiss
Run in your project or agent environment. Adjust flags if your CLI version differs.

SKILL.md

FAISS - Efficient Similarity Search

Facebook AI's library for billion-scale vector similarity search.

When to use FAISS

Use FAISS when:

  • Need fast similarity search on large vector datasets (millions/billions)
  • GPU acceleration required
  • Pure vector similarity (no metadata filtering needed)
  • High throughput, low latency critical
  • Offline/batch processing of embeddings

Metrics:

  • 31,700+ GitHub stars
  • Meta/Facebook AI Research
  • Handles billions of vectors
  • C++ with Python bindings

Use alternatives instead:

  • Chroma/Pinecone: Need metadata filtering
  • Weaviate: Need full database features
  • Annoy: Simpler, fewer features

Quick start

Installation

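# CPU only
pip install faiss-cpu

# GPU support (requires CUDA; if no faiss-gpu wheel matches your platform,
# use the conda packages described in the FAISS install guide)
pip install faiss-gpu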

Basic usage

import faiss
import numpy as np

# Create sample data (1000 vectors, 128 dimensions)
d = 128
nb = 1000
vectors = np.random.random((nb, d)).astype('float32')

# Create index
index = faiss.IndexFlatL2(d)  # L2 distance
index.add(vectors)             # Add vectors

# Search
k = 5  # Find 5 nearest neighbors
query = np.random.random((1, d)).astype('float32')
distances, indices = index.search(query, k)

print(f"Nearest neighbors: {indices}")
print(f"Distances: {distances}")

Index types

1. Flat (exact search)

# L2 (Euclidean) distance
index = faiss.IndexFlatL2(d)

# Inner product (cosine similarity if normalized)
index = faiss.IndexFlatIP(d)

# Exact results, but every query scans all vectors, so search is the slowest

2. IVF (inverted file) - Fast approximate

# Create quantizer (assigns each vector to its nearest cluster centroid)
quantizer = faiss.IndexFlatL2(d)

# IVF index with 100 clusters
nlist = 100
index = faiss.IndexIVFFlat(quantizer, d, nlist)

# Train on data (runs k-means to learn the nlist centroids)
index.train(vectors)

# Add vectors
index.add(vectors)

# Search (nprobe = number of clusters to scan per query; default is 1)
index.nprobe = 10
distances, indices = index.search(query, k)

3. HNSW (Hierarchical Navigable Small World) - Best quality/speed

# HNSW index
M = 32  # Number of graph neighbors per vector (higher = better recall, more memory)
index = faiss.IndexHNSWFlat(d, M)

# No training needed
index.add(vectors)

# Search
distances, indices = index.search(query, k)

4. Product Quantization - Memory efficient

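# PQ stores each vector as m * nbits / 8 bytes
# (with d = 128: 8 bytes instead of 512 for float32; d must be divisible by m)
m = 8   # Number of subquantizers
nbits = 8  # Bits per subquantizer code
index = faiss.IndexPQ(d, m, nbits)

# Train and add
index.train(vectors)
index.add(vectors)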

Save and load

# Save index
faiss.write_index(index, "large.index")

# Load index
index = faiss.read_index("large.index")

# Continue using
distances, indices = index.search(query, k)

GPU acceleration

# Single GPU
res = faiss.StandardGpuResources()
index_cpu = faiss.IndexFlatL2(d)
index_gpu = faiss.index_cpu_to_gpu(res, 0, index_cpu)  # GPU 0

# Multi-GPU (replicates the index on every visible GPU by default)
index_gpu = faiss.index_cpu_to_all_gpus(index_cpu)

# Up to 10-100× faster than CPU, depending on index type, batch size, and hardware

LangChain integration

from langchain_community.vectorstores import FAISS
from langchain_openai import OpenAIEmbeddings

# Create FAISS vector store (docs is a list of LangChain Document objects, defined elsewhere)
vectorstore = FAISS.from_documents(docs, OpenAIEmbeddings())

# Save
vectorstore.save_local("faiss_index")

# Load (the docstore is saved as a pickle, so only enable this flag for files you trust)
vectorstore = FAISS.load_local(
    "faiss_index",
    OpenAIEmbeddings(),
    allow_dangerous_deserialization=True
)

# Search
results = vectorstore.similarity_search("query", k=5)

LlamaIndex integration

from llama_index.vector_stores.faiss import FaissVectorStore
import faiss

# Create FAISS index
d = 1536  # Must match the output dimension of the embedding model in use
faiss_index = faiss.IndexFlatL2(d)
vector_store = FaissVectorStore(faiss_index=faiss_index)

Best practices

  • Choose the right index type - Flat for <10K vectors, IVF for 10K-1M, HNSW for the best quality/speed trade-off
  • Normalize for cosine - Use IndexFlatIP with L2-normalized vectors
  • Use GPU for large datasets - Up to 10-100× faster, depending on index type and batch size
  • Save trained indices - Training is expensive
  • Tune nprobe (IVF) and efSearch (HNSW) - Balance speed/accuracy
  • Monitor memory - Use PQ for large datasets
  • Batch queries - Better GPU utilization

Performance

Index Type   Build Time   Search Time   Memory   Accuracy
Flat         Fast         Slow          High     100%
IVF          Medium       Fast          Medium   95-99%
HNSW         Slow         Fastest       High     99%
PQ           Medium       Fast          Low      90-95%

Resources

  • License: MIT