SKILL.md
Storage Format Guide
Database File Structure
┌─────────────────────────────┐
│ Page 1: Header + Schema │ ← First 100 bytes = DB header
├─────────────────────────────┤
│ Page 2..N: B-tree pages │ ← Tables and indexes
│ Overflow pages │
│ Freelist pages │
└─────────────────────────────┘
Page size: power of 2, 512-65536 bytes. Default 4096.
Database Header (First 100 Bytes)
Offset
Size
Field
0
16
Magic: "SQLite format 3\0"
16
2
Page size (big-endian)
18
1
Write format version (1=rollback, 2=WAL)
19
1
Read format version
24
4
Change counter
28
4
Database size in pages
32
4
First freelist trunk page
36
4
Total freelist pages
40
4
Schema cookie
56
4
Text encoding (1=UTF8, 2=UTF16LE, 3=UTF16BE)
All multi-byte integers: big-endian.
Page Types
Flag
Type
Purpose
0x02
Interior index
Index B-tree internal node
0x05
Interior table
Table B-tree internal node
0x0a
Leaf index
Index B-tree leaf
0x0d
Leaf table
Table B-tree leaf
-
Overflow
Payload exceeding cell capacity
-
Freelist
Unused pages (trunk or leaf)
B-tree Structure
Two B-tree types:
- Table B-tree: 64-bit rowid keys, stores row data
- Index B-tree: Arbitrary keys (index columns + rowid)
Interior page: [ptr0] key1 [ptr1] key2 [ptr2] ...
│ │ │
▼ ▼ ▼
child child child
pages pages pages
Leaf page: key1:data key2:data key3:data ...
Page 1 always root of sqlite_schema table.
Cell Format
Table Leaf Cell
[payload_size: varint] [rowid: varint] [payload] [overflow_ptr: u32?]
Table Interior Cell
[left_child_page: u32] [rowid: varint]
Index Cells
Similar but key is arbitrary (columns + rowid), not just rowid.
Record Format (Payload)
[header_size: varint] [type1: varint] [type2: varint] ... [data1] [data2] ...
Serial types:
Type
Meaning
0
NULL
1-4
1/2/3/4 byte signed int
5
6 byte signed int
6
8 byte signed int
7
IEEE 754 float
8
Integer 0
9
Integer 1
≥12 even
BLOB, length=(N-12)/2
≥13 odd
Text, length=(N-13)/2
Overflow Pages
When payload exceeds threshold, excess stored in overflow chain:
[next_page: u32] [data...]
Last page has next_page=0.
Freelist
Linked list of trunk pages, each containing leaf page numbers:
Trunk: [next_trunk: u32] [leaf_count: u32] [leaf_pages: u32...]
Turso Implementation
Key files:
core/storage/sqlite3_ondisk.rs- On-disk format,PageTypeenum
core/storage/btree.rs- B-tree operations (large file)
core/storage/pager.rs- Page management
core/storage/buffer_pool.rs- Page caching
Debugging Storage
# Integrity check
cargo run --bin tursodb test.db "PRAGMA integrity_check;"
# Page count
cargo run --bin tursodb test.db "PRAGMA page_count;"
# Freelist info
cargo run --bin tursodb test.db "PRAGMA freelist_count;"