chdb-datastore

>-

INSTALLATION
npx skills add https://github.com/clickhouse/agent-skills --skill chdb-datastore
Run in your project or agent environment. Adjust flags if your CLI version differs.

SKILL.md

$2b

1. "I have a file/database and want to analyze it with pandas"

   → DataStore.from_file() / from_mysql() / from_s3() etc.

   → See references/connectors.md

2. "I need to join data from different sources"

   → Create DataStores from each source, use .join()

   → See examples/examples.md #3-5

3. "My pandas code is too slow"

   → import chdb.datastore as pd — change one line, keep the rest

4. "I need raw SQL queries"

   → Use the chdb-sql skill instead

Connect to Any Data Source — One Pattern

from datastore import DataStore

# Local file (auto-detects .parquet, .csv, .json, .arrow, .orc, .avro, .tsv, .xml)

ds = DataStore.from_file("sales.parquet")

# Database

ds = DataStore.from_mysql(host="db:3306", database="shop", table="orders", user="root", password="pass")

# Cloud storage

ds = DataStore.from_s3("s3://bucket/data.parquet", nosign=True)

# URI shorthand — auto-detects source type

ds = DataStore.uri("mysql://root:pass@db:3306/shop/orders")

All 16+ sources and URI schemes → connectors.md

After Connecting — Full Pandas API

result = ds[ds["age"] > 25]                                          # filter

result = ds[["name", "city"]]                                        # select columns

result = ds.sort_values("revenue", ascending=False)                  # sort

result = ds.groupby("dept")["salary"].mean()                         # groupby

result = ds.assign(margin=lambda x: x["profit"] / x["revenue"])     # computed column

ds["name"].str.upper()                                               # string accessor

ds["date"].dt.year                                                   # datetime accessor

result = ds1.join(ds2, on="id")                                      # join

result = ds.head(10)                                                 # preview

print(ds.to_sql())                                                   # see generated SQL

209 DataFrame methods supported. Full API → api-reference.md

Cross-Source Join — The Killer Feature

from datastore import DataStore

customers = DataStore.from_mysql(host="db:3306", database="crm", table="customers", user="root", password="pass")

orders = DataStore.from_file("orders.parquet")

result = (orders

    .join(customers, left_on="customer_id", right_on="id")

    .groupby("country")

    .agg({"amount": "sum", "rating": "mean"})

    .sort_values("sum", ascending=False))

print(result)

More join examples → examples.md

Writing Data

source = DataStore.from_mysql(host="db:3306", database="shop", table="orders", user="root", password="pass")

target = DataStore("file", path="summary.parquet", format="Parquet")

target.insert_into("category", "total", "count").select_from(

    source.groupby("category").select("category", "sum(amount) AS total", "count() AS count")

).execute()

Troubleshooting

Problem

Fix

ImportError: No module named 'chdb'

pip install chdb

ImportError: cannot import 'DataStore'

Use from datastore import DataStore or from chdb.datastore import DataStore

Database connection timeout

Include port in host: host="db:3306" not host="db"

Join returns empty result

Check key types match (both int or both string); use .to_sql() to inspect

Unexpected results

Call ds.to_sql() to see the generated SQL and debug

Environment check

Run python scripts/verify_install.py (from skill directory)

References

  • Connectors — All 16+ data source connection methods
  • Examples — 10+ runnable examples with expected output

Note: This skill teaches how to use chdb DataStore.

For raw SQL queries, use the chdb-sql skill.

For contributing to chdb source code, see CLAUDE.md in the project root.

BrowserAct

Let your agent run on any real-world website

Bypass CAPTCHA & anti-bot for free. Start local, scale to cloud.

Explore BrowserAct Skills →

Stop writing automation&scrapers

Install the CLI. Run your first Skill in 30 seconds. Scale when you're ready.

Start free
free · no credit card