SKILL.md

Databricks Apps Development

Name: databricks-apps
Author: databricks

FIRST: Use the parent databricks-core skill for CLI basics, authentication, and profile selection.

Build apps that deploy to the Databricks Apps platform.

Required Reading by Phase

| Phase | READ BEFORE proceeding |
| --- | --- |
| Scaffolding | ⚠️ STOP: complete the Data Access Decision Gate below before scaffolding. Parent databricks-core skill (auth, warehouse discovery); then run databricks apps manifest and databricks apps init with --features and --set (see AppKit section below) |
| Writing SQL queries | SQL Queries Guide |
| Writing UI components | Frontend Guide |
| Using useAnalyticsQuery | AppKit SDK |
| Adding API endpoints | tRPC Guide |
| Using Lakebase (OLTP database) | Lakebase Guide |
| Adding Genie chat / Genie-powered apps | Genie Guide (follow the Genie Agent Workflow below) |
| Using Model Serving (ML inference) | Model Serving Guide |
| Typed data contracts (proto-first design) | Proto-First Guide and Plugin Contracts |
| Managing files in UC Volumes | Files Guide |
| Triggering / monitoring Lakeflow Jobs from the app | Jobs Guide |
| Platform rules (permissions, deployment, limits) | Platform Guide (READ for ALL apps, including AppKit) |
| Non-AppKit app (Streamlit, FastAPI, Flask, Gradio, Next.js, etc.) | Other Frameworks |

Generic Guidelines

App name: ≤26 characters, lowercase letters/numbers/hyphens only (no underscores). dev- prefix adds 4 chars, max 30 total.
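The naming rule can be checked before the CLI is called. The following is a minimal TypeScript sketch: validateAppName and its constants are hypothetical helpers that encode only the rule stated above, and the platform may enforce further constraints that are not covered here.

```typescript
// Hypothetical helper (not part of the Databricks CLI or AppKit).
// The limits below are copied from the naming rule in this guide.
const MAX_APP_NAME_LENGTH = 26;
const DEV_PREFIX = "dev-"; // 4 characters, so a prefixed name is at most 30
const APP_NAME_PATTERN = /^[a-z0-9-]+$/;

// Returns a list of problems; an empty list means the name satisfies the rule.
function validateAppName(name: string): string[] {
  const problems: string[] = [];
  if (name.length === 0) {
    problems.push("name is empty");
    return problems;
  }
  if (name.length > MAX_APP_NAME_LENGTH) {
    problems.push(
      `name has ${name.length} characters; the limit is ${MAX_APP_NAME_LENGTH}`,
    );
  }
  if (!APP_NAME_PATTERN.test(name)) {
    problems.push("name may contain only lowercase letters, numbers, and hyphens");
  }
  return problems;
}

const accepted = validateAppName("sales-dashboard"); // empty list
const rejected = validateAppName("sales_dashboard"); // one problem: underscore
```

Returning a list of problems rather than a boolean lets the caller report every violation in one pass.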

Validation: databricks apps validate --profile <PROFILE> before deploying.

Smoke tests (AppKit only): ALWAYS update tests/smoke.spec.ts selectors BEFORE running validation. Default template checks for "Minimal Databricks App" heading and "hello world" text — these WILL fail in your custom app. See testing guide.

Smoke test selectors: use only Playwright locator APIs — getByRole, getByText, getByPlaceholder, getByLabel. getByLabelText does not exist in Playwright (it is a React Testing Library method) and throws TypeError at runtime. See testing guide or npx playwright codegen.

Smoke test data: keep result sets under the 1 MB analytics-event payload cap. Queries returning thousands of rows cause INVALID_REQUEST: Event exceeds max size of 1048576 bytes and net::ERR_ABORTED, leaving every asserted UI element absent. Use LIMIT or an aggregated query (e.g. COUNT(*) GROUP BY status) — never raw row dumps.

AppKit version: never override the @databricks/appkit or @databricks/appkit-ui version in package.json — databricks apps init sets the correct version. Do not run npm install @databricks/appkit@<version> unless explicitly asked by the user. If you need a different version, re-scaffold with databricks apps init --version <version>.

Authentication: covered by parent databricks-core skill.

AppKit API surface: before writing code that calls AppKit APIs (createApp, plugin shapes, useAnalyticsQuery, etc.), run npx @databricks/appkit docs <section> and use the actual signature. Training data has stale shapes; a single invented signature fails tsc --noEmit during validate. The docs ship with the installed AppKit and are the authoritative source.

TypeScript casts: never use as unknown as <T> double-assertions — appkit lint enforces no-double-type-assertion and one violation fails the entire validate step. Instead: narrow with Zod (z.infer<typeof schema>), use a runtime type guard, or write a typed mapper function. If a query result needs reshaping, type the row schema via queryKey types rather than casting.
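As an illustration of the runtime type guard alternative, here is a minimal sketch that uses no type assertions at all. The StatusCount row shape is hypothetical, modeled on the COUNT(*) GROUP BY status example above; in a real app the row type should come from the generated queryKey types.

```typescript
// Hypothetical row shape for illustration only.
interface StatusCount {
  status: string;
  count: number;
}

// Runtime type guard: narrows `unknown` step by step instead of casting.
// Relies on `in` narrowing, available in TypeScript 4.9 and later.
function isStatusCount(value: unknown): value is StatusCount {
  return (
    typeof value === "object" &&
    value !== null &&
    "status" in value &&
    typeof value.status === "string" &&
    "count" in value &&
    typeof value.count === "number"
  );
}

// Typed mapper: keeps only rows that pass the guard.
function toStatusCounts(rows: unknown[]): StatusCount[] {
  return rows.filter(isStatusCount);
}

const cleaned = toStatusCounts([
  { status: "open", count: 3 },
  { status: "closed", count: "2" }, // dropped: count is not a number
  null, // dropped: not an object
]);
```

Dropping malformed rows silently is a design choice made for brevity; throwing on the first invalid row may be preferable when bad data should surface early.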

Project Structure (after databricks apps init --features analytics)

client/src/App.tsx — main React component (start here)

config/queries/*.sql — SQL query files (queryKey = filename without .sql)

server/server.ts — backend entry (tRPC routers)

tests/smoke.spec.ts — smoke test (⚠️ MUST UPDATE selectors for your app)

client/src/appKitTypes.d.ts — auto-generated types (npm run typegen)
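The "queryKey = filename without .sql" convention can be sketched as a small function. queryKeyFromPath is a hypothetical helper written for illustration; AppKit derives the key itself, so this only shows which key to expect for a given file.

```typescript
// Hypothetical helper: mirrors the convention "queryKey = filename without .sql".
function queryKeyFromPath(filePath: string): string {
  const fileName = filePath.split("/").pop() ?? filePath;
  if (!fileName.endsWith(".sql")) {
    throw new Error(`not a .sql file: ${filePath}`);
  }
  return fileName.slice(0, -".sql".length);
}

// The file name here is an invented example.
const key = queryKeyFromPath("config/queries/orders_by_status.sql");
```

Here key is "orders_by_status", which is the value to pass wherever a queryKey is expected.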

Project Structure (after databricks apps init --features lakebase)

server/server.ts — backend with Lakebase pool + tRPC routes

client/src/App.tsx — React frontend

app.yaml — manifest with database resource declaration

package.json — includes @databricks/lakebase dependency

Note: No config/queries/ directory. Lakebase apps use pool.query() in tRPC, not SQL files.

Data Discovery

Before writing any SQL, use the parent databricks-core skill for data exploration — search information_schema by keyword, then batch discover-schema for the tables you need. Do NOT skip this step.

Development Workflow (FOLLOW THIS ORDER)

Data Access Decision Gate (REQUIRED before scaffolding):

If the app reads from Unity Catalog / lakehouse tables, you MUST show the comparison below to the user and ask them to choose. Do not skip this. Do not choose for them.

| | (A) Lakebase synced tables | (B) Analytics |
| --- | --- | --- |
| Speed | Sub-second responses | Takes a few seconds |
| Best for | Search, lookups, catalogs, real-time data, operational apps | Dashboards, charts, aggregations, KPIs |
| How it works | Data synced from Delta into Lakebase Postgres | Queries run on SQL warehouse at read time |

After showing the table, add a brief recommendation. Default to recommending Lakebase synced tables (A) unless the use case is clearly about aggregations, charts, or dashboards, where a few seconds of latency is acceptable. For lookups, searches, serving data to users, or any other interactive use case, recommend Lakebase synced tables. Always let the user make the final call.

After the user chooses:

(A) Lakebase synced tables → scaffold with --features lakebase. See Lakebase Guide for full workflow.

(B) Analytics → scaffold with --features analytics.

Both → scaffold with --features analytics,lakebase if the app needs both patterns.

If the app does NOT read UC data (pure CRUD, Genie, Model Serving), skip this gate and scaffold with the appropriate --features flag.
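The mapping from the user's choice to the --features value can be written down as a lookup. This is a sketch only: DataAccessChoice and featuresArgs are hypothetical names, and the feature strings are the ones listed above.

```typescript
// Hypothetical names; the feature strings come from the decision gate above.
type DataAccessChoice = "lakebase" | "analytics" | "both";

const FEATURES_BY_CHOICE: Record<DataAccessChoice, string> = {
  lakebase: "lakebase",
  analytics: "analytics",
  both: "analytics,lakebase",
};

// Returns the CLI arguments for the choice the user made.
function featuresArgs(choice: DataAccessChoice): string[] {
  return ["--features", FEATURES_BY_CHOICE[choice]];
}

const args = featuresArgs("both"); // ["--features", "analytics,lakebase"]
```

The choice is always an input to this function, never computed by it, which keeps the final decision with the user as the gate requires.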

Analytics apps (--features analytics):

1. Create SQL files in config/queries/
2. Run npm run typegen and verify that all queries show ✓
3. Read client/src/appKitTypes.d.ts to see generated types
4. THEN write App.tsx using the generated types
5. Update tests/smoke.spec.ts selectors
6. Run databricks apps validate --profile <PROFILE>

DO NOT write UI code before running typegen — types won't exist and you'll waste time on compilation errors.

Lakebase apps (--features lakebase): No SQL files or typegen. See Lakebase Guide for the tRPC pattern: initialize schema at startup, write procedures in server/server.ts, then build the React frontend.

When to Use What

After completing the decision gate above, use this routing table:

Read analytics data → display in chart/table: Use visualization components with queryKey prop

Read analytics data → custom display (KPIs, cards): Use useAnalyticsQuery hook

Read analytics data → need computation before display: Still use useAnalyticsQuery, transform client-side

Read lakehouse data at low latency (lookups, search, catalogs): Use Lakebase synced tables — see Lakebase Guide

Read/write persistent data (users, orders, CRUD state): Use Lakebase pool via tRPC — see Lakebase Guide

Natural language query interface over tables (Genie): Use genie() plugin — see Genie Guide

Call ML model endpoint: Use tRPC — see Model Serving Guide

Trigger or monitor a Lakeflow Job from the app: Use the jobs() plugin — see Jobs Guide

⚠️ NEVER use tRPC to run SELECT queries against the warehouse — always use SQL files in config/queries/

⚠️ NEVER use useAnalyticsQuery for Lakebase data; it queries the SQL warehouse only
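For the "need computation before display" route, the client-side transform can be an ordinary pure function applied to the rows the hook returns. The sketch below assumes a hypothetical StatusCountRow shape and deliberately omits the useAnalyticsQuery call, since its signature should be taken from npx @databricks/appkit docs rather than from memory.

```typescript
// Hypothetical row shape; real types come from client/src/appKitTypes.d.ts.
interface StatusCountRow {
  status: string;
  count: number;
}

interface StatusShare extends StatusCountRow {
  sharePercent: number; // rounded to one decimal place
}

// Pure client-side transform: adds each row's share of the total count.
function withShare(rows: readonly StatusCountRow[]): StatusShare[] {
  const total = rows.reduce((sum, row) => sum + row.count, 0);
  return rows.map((row) => ({
    ...row,
    sharePercent: total === 0 ? 0 : Math.round((row.count / total) * 1000) / 10,
  }));
}

const shares = withShare([
  { status: "open", count: 3 },
  { status: "closed", count: 1 },
]);
```

Keeping the transform separate from the component makes it easy to test without rendering anything.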

Frameworks

AppKit (Recommended)

TypeScript/React framework with type-safe SQL queries and built-in components.

Official Documentation — the source of truth for all API details:

npx @databricks/appkit docs                              # ← ALWAYS start here to see available pages

npx @databricks/appkit docs <query>                      # view a section by name or doc path

npx @databricks/appkit docs --full                       # full index with all API entries

npx @databricks/appkit docs "appkit-ui API reference"    # example: section by name

npx @databricks/appkit docs ./docs/plugins/analytics.md  # example: specific doc file

DO NOT guess doc paths. Run without args first, pick from the index. The <query> argument accepts both section names (from the index) and file paths. Docs are the authority on component props, hook signatures, and server APIs — skill files only cover anti-patterns and gotchas.

App Manifest and Scaffolding

Agent workflow for scaffolding: get the manifest first, then build the init command.

Get the manifest (JSON schema describing plugins and their resources):

databricks apps manifest --profile <PROFILE>

# See plugins available in a specific AppKit version:

databricks apps manifest --version <VERSION> --profile <PROFILE>

# Custom template:

databricks apps manifest --template <GIT_URL> --profile <PROFILE>

The output defines:

Plugins: each has a key (the plugin ID used in --features), a requiredByTemplate flag, and resources.

requiredByTemplate: If true, that plugin is mandatory for this template — do not add it to --features (it is included automatically); you must still supply all of its required resources via --set. If false or absent, the plugin is optional — add it to --features only when the user's prompt indicates they want that capability (e.g. analytics/SQL), and then supply its required resources via --set.

Resources: Each plugin has resources.required and resources.optional (arrays). Each item has resourceKey and fields (object: field name → description/env). Use --set <plugin>.<resourceKey>.<field>=<value> for each required resource field of every plugin you include.

Scaffold (DO NOT use npx; use the CLI only):

databricks apps init --name <NAME> --features <plugin1>,<plugin2> \

  --set <plugin1>.<resourceKey>.<field>=<value> \

  --set <plugin2>.<resourceKey>.<field>=<value> \

  --description "<DESC>" --run none --profile <PROFILE>

# --run none: skip auto-run after scaffolding (review code first)

# With custom template:

databricks apps init --template <GIT_URL> --name <NAME> --features ... --set ... --profile <PROFILE>

Optionally use --version <VERSION> to target a specific AppKit version.

Required: --name, --profile. Name: ≤26 chars, lowercase letters/numbers/hyphens only. Use --features only for optional plugins the user wants (plugins with requiredByTemplate: false or absent); mandatory plugins must not be listed in --features.

Resources: Pass --set for every required resource (each field in resources.required) for (1) all plugins with requiredByTemplate: true, and (2) any optional plugins you added to --features. Add --set for resources.optional only when the user requests them.

Discovery: Use the parent databricks-core skill to resolve IDs (e.g. warehouse: databricks warehouses list --profile <PROFILE> or databricks experimental aitools tools get-default-warehouse --profile <PROFILE>).

DO NOT guess plugin names, resource keys, or property names — always derive them from databricks apps manifest output. Example: if the manifest shows plugin analytics with a required resource resourceKey: "sql-warehouse" and fields: { "id": ... }, include --set analytics.sql-warehouse.id=<ID>.
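The rules for --features and --set can be sketched as a function that turns a parsed manifest into an argument list. The ManifestPlugin and ManifestResource interfaces are assumptions inferred from the description above (in particular, that plugins arrive as an array); check them against the real databricks apps manifest output before relying on this. buildInitArgs and InitOptions are hypothetical names.

```typescript
// ASSUMED manifest shape, inferred from this guide; verify against real output.
interface ManifestResource {
  resourceKey: string;
  fields: Record<string, unknown>;
}
interface ManifestPlugin {
  key: string;
  requiredByTemplate?: boolean;
  resources: { required: ManifestResource[]; optional: ManifestResource[] };
}
interface InitOptions {
  name: string;
  profile: string;
  requestedFeatures: string[]; // optional plugins the user asked for
  values: Record<string, string>; // keyed by "<plugin>.<resourceKey>.<field>"
}

function buildInitArgs(plugins: ManifestPlugin[], options: InitOptions): string[] {
  const known = new Set(plugins.map((p) => p.key));
  for (const key of options.requestedFeatures) {
    if (!known.has(key)) throw new Error(`plugin not in manifest: ${key}`);
  }
  const requested = new Set(options.requestedFeatures);
  // Mandatory plugins are included automatically, so they stay out of --features.
  const features = plugins.filter((p) => !p.requiredByTemplate && requested.has(p.key));
  const included = plugins.filter((p) => p.requiredByTemplate || requested.has(p.key));

  const args = ["apps", "init", "--name", options.name];
  if (features.length > 0) {
    args.push("--features", features.map((p) => p.key).join(","));
  }
  // Every required resource field of every included plugin needs a --set.
  for (const plugin of included) {
    for (const resource of plugin.resources.required) {
      for (const field of Object.keys(resource.fields)) {
        const key = `${plugin.key}.${resource.resourceKey}.${field}`;
        const value = options.values[key];
        if (value === undefined) throw new Error(`missing value for --set ${key}`);
        args.push("--set", `${key}=${value}`);
      }
    }
  }
  args.push("--run", "none", "--profile", options.profile);
  return args;
}
```

With the analytics example above, passing values of { "analytics.sql-warehouse.id": "<ID>" } yields --features analytics followed by --set analytics.sql-warehouse.id=<ID>. A missing value throws instead of emitting an incomplete command.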

READ AppKit Overview for project structure, workflow, and pre-implementation checklist.

Genie Agent Workflow — when the user wants a Genie-powered app, do not start by asking for a Genie Space ID. Instead:

Ask which Unity Catalog tables the app should query (fully qualified: catalog.schema.table).

Ask whether to reuse an existing Genie space or create a new one.

If creating: discover the warehouse, then create the space with databricks genie create-space (see Genie Guide for syntax and serialized space format).

If reusing: discover existing spaces with databricks genie list-spaces --profile <PROFILE> and let the user pick.

Scaffold or wire the space ID into the app — derive --set keys from databricks apps manifest.

Read the Genie Guide for configuration, SSE endpoints, and frontend integration.

Common Scaffolding Mistakes

# ❌ WRONG: name is NOT a positional argument

databricks apps init --features analytics my-app-name

# → "unknown command" error

# ✅ CORRECT: use --name flag

databricks apps init --name my-app-name --features analytics --set "..." --profile <PROFILE>

Directory Naming

databricks apps init creates directories in kebab-case matching the app name.

App names may contain only lowercase letters, numbers, and hyphens (≤26 chars).

Other Frameworks (Streamlit, FastAPI, Flask, Gradio, Dash, Next.js, etc.)

Databricks Apps supports any framework that runs as an HTTP server. LLMs already know these frameworks — the challenge is Databricks platform integration.

READ Other Frameworks Guide BEFORE building any non-AppKit app. It covers port/host configuration, app.yaml and databricks.yml setup, dependency management, networking, and framework-specific gotchas.
