SKILL.md

Architecture decision tree:

Periodic alerts/reports?  → Scheduled Task

Live visual interface?    → Preview Server (dashboard)

One-time analysis?        → Inline (no build needed)

Reusable tool?            → Script in workspace

For medium+ projects, present to user BEFORE writing code:

Data flow — sources → processing → output

Architecture choice and why

Cost estimate — (cost/run) × frequency × 30 = monthly

Known limitations
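The cost line in that summary is plain arithmetic, so it can be computed rather than guessed. A minimal sketch, assuming hypothetical names (`LineItem`, `monthly_cost`, `breakdown`) and placeholder prices; real per-call prices come from the pricing reference, not from this example. Listing each LLM model as its own line item keeps the estimate model-aware.

```python
from dataclasses import dataclass
from typing import Dict, Sequence


@dataclass(frozen=True)
class LineItem:
    """One billable thing a run does (hypothetical structure, for illustration)."""
    name: str
    cost_per_call: float  # credits or dollars; use one unit consistently
    calls_per_run: int


def monthly_cost(items: Sequence[LineItem], runs_per_day: float, days: int = 30) -> float:
    """(cost/run) x frequency x 30, summed over every line item."""
    cost_per_run = sum(i.cost_per_call * i.calls_per_run for i in items)
    return cost_per_run * runs_per_day * days


def breakdown(items: Sequence[LineItem], runs_per_day: float, days: int = 30) -> Dict[str, float]:
    """Monthly cost per line item, so each model or API shows up separately."""
    return {i.name: i.cost_per_call * i.calls_per_run * runs_per_day * days for i in items}


# Placeholder numbers only; they are not real prices.
items = [
    LineItem("price API (one batched call)", cost_per_call=0.5, calls_per_run=1),
    LineItem("LLM summary (budget model)", cost_per_call=2.0, calls_per_run=1),
]
hourly_total = monthly_cost(items, runs_per_day=24)
hourly_lines = breakdown(items, runs_per_day=24)
```

With these placeholders an hourly task costs 2.5 units per run, so 2.5 × 24 × 30 = 1800 units per month.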

Design Gate (required, blocking):

After Phase 1, STOP and present a short phase plan (milestones for DESIGN/BUILD/DEBUG). Ask explicitly: "Approve this plan and proceed to Phase 2 BUILD?" Match the user's language when phrasing the question — never inject a hardcoded non-English string.

If user confirms: proceed to Phase 2.

If user requests changes: revise design and re-confirm.

If no confirmation: do not write/modify code.

Phase 1.5: SCAFFOLD (mandatory for shareable projects)

After the design is confirmed and before writing any code, scaffold the project under the standard layout. This makes the project shareable via the community-publish skill from day one, with no migration later.

Standard project location: output/projects/{slug}/

output/projects/{slug}/
├── project.yaml          # name, version (start 0.1.0), type, description, license, entry, env_required
├── PROJECT.md            # required sections: What / Required env / How to start / Outputs / Troubleshooting
├── .env.example          # every env var the code reads, with placeholder values
├── .gitignore            # at minimum: .env, *.key, *.pem, __pycache__, node_modules
└── src/                  # all code lives here, NOT scattered
    ├── run.py            # type=task; first line MUST be: # -*- task-system: v3 -*-
    ├── server.py         # type=service
    ├── main.py           # type=script
    └── index.html / app.py + frontend  # type=preview

Project type → entry mapping:

| Architecture choice | type | entry path |
| --- | --- | --- |
| Scheduled Task | task | src/run.py |
| Preview Server | preview | src/index.html (static) or src/app.py |
| Background daemon | service | src/server.py |
| One-shot tool | script | src/main.py |
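The layout and the type → entry mapping can be created mechanically. The sketch below is one possible way to do it, not the platform's own scaffolder: `scaffold`, the constant names, the `## ` heading style in PROJECT.md, and the value formats written into project.yaml are all assumptions made for illustration. Only the file names, field names, and the task header line come from the layout described above.

```python
import json
import tempfile
from pathlib import Path

# type -> entry path, taken from the mapping above
ENTRY_BY_TYPE = {
    "task": "src/run.py",
    "preview": "src/index.html",  # static variant; a dynamic preview would use src/app.py
    "service": "src/server.py",
    "script": "src/main.py",
}
GITIGNORE_LINES = [".env", "*.key", "*.pem", "__pycache__", "node_modules"]
PROJECT_MD_SECTIONS = ["What", "Required env", "How to start", "Outputs", "Troubleshooting"]


def scaffold(root: Path, slug: str, project_type: str, description: str = "") -> Path:
    """Create output/projects/{slug}/ under root without overwriting existing files."""
    if project_type not in ENTRY_BY_TYPE:
        raise ValueError(f"unknown project type: {project_type!r}")
    entry = ENTRY_BY_TYPE[project_type]
    project = root / "output" / "projects" / slug
    (project / "src").mkdir(parents=True, exist_ok=True)

    # Field names come from the layout; the value formats here are assumptions.
    project_yaml = "\n".join([
        f"name: {slug}",
        "version: 0.1.0",
        f"type: {project_type}",
        f"description: {json.dumps(description)}",
        'license: ""',
        f"entry: {entry}",
        "env_required: []",
    ]) + "\n"
    project_md = "".join(f"## {section}\n\nTODO\n\n" for section in PROJECT_MD_SECTIONS)
    # Task entry files must start with the task-system header line.
    entry_stub = "# -*- task-system: v3 -*-\n" if project_type == "task" else ""

    files = {
        "project.yaml": project_yaml,
        "PROJECT.md": project_md,
        ".env.example": "# every env var the code reads, with placeholder values\n",
        ".gitignore": "\n".join(GITIGNORE_LINES) + "\n",
        entry: entry_stub,
    }
    for rel, content in files.items():
        path = project / rel
        if not path.exists():  # never clobber work that is already there
            path.write_text(content, encoding="utf-8")
    return project


# Demo in a throwaway directory so nothing real is touched.
with tempfile.TemporaryDirectory() as tmp:
    demo = scaffold(Path(tmp), "btc-alerts", "task", "Hourly BTC price alert")
    created = sorted(p.relative_to(demo).as_posix() for p in demo.rglob("*") if p.is_file())
    first_line = (demo / "src" / "run.py").read_text(encoding="utf-8").splitlines()[0]
```

Skipping files that already exist keeps the helper safe to re-run on a project that is partway through BUILD.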

Skip scaffold only when:

Pure inline analysis with no persistent code

Modifying an existing output/projects/... project (keep its layout)

User explicitly says "just throw a script in /tmp" or similar

During Phase 2 BUILD, maintain the scaffold:

Every new env var read by code → add to .env.example in same edit

Every behavioral change → update PROJECT.md

Never write code outside src/ (configs, fixtures: project root or src/data/)
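The first maintenance rule is easy to forget, so it helps to check it mechanically. A rough sketch, assuming env vars are read with string literals through `os.environ[...]`, `os.environ.get(...)`, or `os.getenv(...)`; names built dynamically will be missed. The function names are hypothetical.

```python
import re
import tempfile
from pathlib import Path

# Matches os.environ["X"], os.environ.get("X") and os.getenv("X") with a literal name.
ENV_READ = re.compile(
    r"""os\.(?:environ\[|environ\.get\(|getenv\()\s*["']([A-Za-z_][A-Za-z0-9_]*)["']"""
)


def env_vars_read(src_dir: Path) -> set:
    """Names of env vars referenced by any .py file under src_dir."""
    names = set()
    for path in src_dir.rglob("*.py"):
        names.update(ENV_READ.findall(path.read_text(encoding="utf-8")))
    return names


def env_vars_documented(env_example: Path) -> set:
    """Names declared as KEY=value lines in .env.example (comments ignored)."""
    names = set()
    for line in env_example.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if line and not line.startswith("#") and "=" in line:
            names.add(line.split("=", 1)[0].strip())
    return names


def missing_from_example(project: Path) -> set:
    """Env vars the code reads that .env.example does not list."""
    return env_vars_read(project / "src") - env_vars_documented(project / ".env.example")


# Demo: the code reads two vars, but only one is documented.
with tempfile.TemporaryDirectory() as tmp:
    project = Path(tmp)
    (project / "src").mkdir()
    (project / "src" / "run.py").write_text(
        'import os\nkey = os.environ["API_KEY"]\ndebug = os.getenv("DEBUG", "0")\n',
        encoding="utf-8",
    )
    (project / ".env.example").write_text("API_KEY=changeme\n", encoding="utf-8")
    missing = missing_from_example(project)
```

A non-empty result means `.env.example` is out of date and should be fixed in the same edit.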

Why this matters: Projects already in standard layout publish in one command. Projects scattered across tasks/, output/scripts/, dashboards/, etc. need tidy_project() migration before they can be shared, and the user often doesn't want to rebuild PROJECT.md from memory.

For existing scattered code: call community-publish skill → tidy_project(any_dir) to reorganize before publishing.

API cost & rate limits:

All external API calls go through sc-proxy, which bills per request and enforces rate limits.

Before designing, **read config/context/references/sc-proxy.md** for pricing table and limits.

Estimate cost: credits_per_request × requests_per_run × runs_per_day × 30

Respect rate limits: e.g. with CoinGecko at 60 req/min, a task polling 10 coins every minute with single calls (10 req/min) is fine; 100 coins (100 req/min) is not

Prefer batch endpoints over N single calls (e.g. coin_price with multiple ids vs N separate calls)

Pure script tasks (no API): ~0 credits/run

LLM cost warning: high-end models can exceed $0.10 per single call. Pricing varies dramatically by model tier; expensive models can be 100x+ the cost of budget models for the same workflow.

Model-aware estimate required: break LLM cost down by model (model_price_per_call × expected_calls_per_run × runs_per_day × 30) instead of using a single generic number.

Dashboard auto-refresh costs credits — default to manual refresh unless user asks otherwise

Spending protection: if projected monthly LLM cost is high, explicitly ask whether to enforce per-caller limits before implementation.

Per-caller tracking (required): every proxied request must include SC-CALLER-ID (e.g. job:{JOB_ID}, preview:{preview_id}, chat:{thread_id}) so usage can be traced and capped. Details in config/context/references/sc-proxy.md § Caller Credit Limit
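Two of the rules above lend themselves to small helpers: checking a polling plan against a rate limit, and tagging requests with a caller ID. This is a sketch under assumptions: the function names are hypothetical, the batch size of 50 is invented for the example, and sending `SC-CALLER-ID` as an HTTP header is a guess; config/context/references/sc-proxy.md is the authority on how the ID is actually passed.

```python
import math


def requests_per_minute(items: int, batch_size: int = 1, polls_per_minute: float = 1.0) -> float:
    """Requests/min needed to poll `items` when one call covers up to `batch_size` of them."""
    return math.ceil(items / batch_size) * polls_per_minute


def within_limit(items: int, limit_per_minute: float, batch_size: int = 1,
                 polls_per_minute: float = 1.0) -> bool:
    """True if the polling plan stays at or under the provider's per-minute limit."""
    return requests_per_minute(items, batch_size, polls_per_minute) <= limit_per_minute


def caller_headers(kind: str, ident: str) -> dict:
    """Build the caller tag in the kind:id form shown above (job, preview or chat)."""
    if kind not in {"job", "preview", "chat"}:
        raise ValueError(f"unexpected caller kind: {kind!r}")
    # Assumption: the ID travels as a request header.
    return {"SC-CALLER-ID": f"{kind}:{ident}"}


ten_coins_ok = within_limit(10, limit_per_minute=60)                    # 10 req/min
hundred_coins_ok = within_limit(100, limit_per_minute=60)               # 100 req/min
hundred_batched_ok = within_limit(100, limit_per_minute=60, batch_size=50)  # 2 req/min
headers = caller_headers("job", "1234")
```

The third case shows why batch endpoints are preferred: the same 100 coins fit comfortably once each call carries many ids.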

Data reliability: Native tools > proxied APIs > direct requests > web scraping > LLM numbers (never).

Iron rule: Scripts fetch data. LLMs analyze text. Final output = script variables + LLM prose.

Task scripts can import skill functions directly:

from core.skill_tools import coingecko, coinglass  # auto-discovers skills/*/exports.py

prices = coingecko.coin_price(coin_ids=["bitcoin"], timestamps=["now"])

Tool names = SKILL.md frontmatter tools: list. See build-patterns.md § Using Skill Functions.

Phase 2: BUILD

Every piece follows this cycle:

Build one small piece → Run it → Verify output → ✅ Next piece / ❌ Fix first

| Built | Verify how | Pass |
| --- | --- | --- |
| Data fetcher | Run, print raw response | Non-empty, recent, plausible |
| API endpoint | curl localhost:{port}/api/... | Correct JSON |
| HTML page | preview_serve + preview_check | ok = true |
| Task script | python3 tasks/{id}/run.py | Numbers match source |
| LLM analysis | Numbers from script vars, not LLM text | Template pattern used |
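The "non-empty, recent, plausible" check for a data fetcher can be written down once and reused. A minimal sketch, assuming each row is a dict with hypothetical keys `id`, `price`, and `fetched_at` (a timezone-aware datetime); the age window and price bounds are placeholders to tune per data source.

```python
from datetime import datetime, timedelta, timezone


def check_fetch(rows, now=None, max_age=timedelta(minutes=10), plausible=(0.0, 10_000_000.0)):
    """Return a list of problems; an empty list means the fetch passes."""
    if not rows:
        return ["empty response"]
    now = now or datetime.now(timezone.utc)
    low, high = plausible
    problems = []
    for row in rows:
        age = now - row["fetched_at"]
        if age > max_age:
            problems.append(f"{row['id']}: stale by {age}")
        if not (low < row["price"] < high):
            problems.append(f"{row['id']}: implausible price {row['price']}")
    return problems


# Demo with a fixed clock so the result is reproducible.
clock = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
fresh = {"id": "bitcoin", "price": 42000.0, "fetched_at": clock - timedelta(minutes=1)}
stale = {"id": "ethereum", "price": 2300.0, "fetched_at": clock - timedelta(hours=2)}
broken = {"id": "dogecoin", "price": -1.0, "fetched_at": clock}

ok_result = check_fetch([fresh], now=clock)
bad_result = check_fetch([fresh, stale, broken], now=clock)
```

Returning problems instead of raising keeps the raw response printable alongside the verdict.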

Verification layering:

Critical (must pass before preview/activate): data correctness, core logic, no crashes

Informational (can fix after delivery): styling, edge case messages, minor UX polish

Anti-patterns:

❌ "Done!" without running anything

❌ Writing 200+ lines then testing for the first time

❌ "It should work"

→ Detailed patterns: **read references/build-patterns.md**

Code Practices

read_file before edit_file — understand what's there

edit_file > write_file for modifications

Check ls before write_file — avoid duplicating existing files

Large files (>300 lines): split into multiple files, or skeleton-first + bash inject

Env vars: read with os.environ["KEY"]. Persist package installs to setup.sh
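Reading `os.environ["KEY"]` fails on the first missing variable with a bare `KeyError`. One optional variation, sketched here with a hypothetical `require_env` helper, is to check every required name up front and report them all at once.

```python
import os


def require_env(*names, environ=None):
    """Return {name: value} for all names, or raise listing every missing one."""
    env = os.environ if environ is None else environ
    missing = [n for n in names if not env.get(n)]
    if missing:
        raise RuntimeError(
            "Missing required env vars: " + ", ".join(missing) + " (see .env.example)"
        )
    return {n: env[n] for n in names}


# Demo against an explicit mapping instead of the real environment.
fake_env = {"API_KEY": "abc123"}
loaded = require_env("API_KEY", environ=fake_env)
try:
    require_env("API_KEY", "WEBHOOK_URL", "CHAT_ID", environ=fake_env)
    error_message = ""
except RuntimeError as exc:
    error_message = str(exc)
```

The `environ` parameter exists only to make the helper testable; scripts would call it without that argument.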

Dashboard UX Defaults ( type=preview )

Decide sensible defaults yourself and render real data on first load. Treat filters as optional refinements users can adjust later, never as prerequisites that gate the initial view. Auto-refresh on a sensible interval only when refreshes cost no credits; when they hit billed APIs, default to manual refresh unless the user asks otherwise. No "Click to load" / "Enter address" / "Select symbol" before anything appears.

Platform Rules

Agent tools are tool calls only — not importable in scripts

Preview paths must be relative (./path not /path)

Hardcode the preview port in code, do not read from env. Each preview runs in its own pod and the env-port contract is not reliable across pods. Pick any free port (e.g. 8765), write it directly into the app, and pass the same number to preview(action="serve", port=...). The two must match exactly.

Concurrent previews need different IDs. If two previews share the same dir, the newer one auto-kills the older one (same-dir replacement rule). When iterating, reuse the same id rather than inventing variants, or use distinct dirs.

Fullstack = one port (backend serves API + static files)

Cron times are UTC — convert from user timezone

Preview serving & publishing → read platform reference config/context/references/preview-guide.md

localhost APIs → read config/context/references/localhost-api.md
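The hardcoded-port and one-port rules can be illustrated with the standard library alone. This is a sketch, not the platform's preview runtime: `make_server`, the `/api/` prefix, the JSON payload, and binding to `0.0.0.0` are assumptions for the example. The same port number would be passed to preview(action="serve", port=...).

```python
import json
from functools import partial
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer

# Hardcoded on purpose: never read from env; must equal the port given to the preview call.
PREVIEW_PORT = 8765


class Handler(SimpleHTTPRequestHandler):
    """Serves JSON under /api/ and static files for everything else, on one port."""

    def do_GET(self):
        if self.path.startswith("/api/"):
            body = json.dumps({"ok": True, "path": self.path}).encode("utf-8")
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            super().do_GET()  # static files from the configured directory


def make_server(static_dir: str, port: int = PREVIEW_PORT) -> ThreadingHTTPServer:
    """Build (but do not start) the server; call .serve_forever() to run it."""
    handler = partial(Handler, directory=static_dir)
    return ThreadingHTTPServer(("0.0.0.0", port), handler)


# Usage (blocks until interrupted):
#   make_server("src").serve_forever()
# Frontend code should then request "./api/status", not "/api/status".
```

The `port` parameter exists so the server can be tested on another port; production code would rely on the default constant.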

Task scripts decide WHEN to invoke the agent, WHAT data/context to pass, WHICH model to use

Pattern: script fetches data → evaluates if noteworthy → calls LLM only when needed → prints result

LLM in scripts — two options (details in references/build-patterns.md):

OpenRouter (via sc-proxy): lightweight, for summarize/translate/format text. Direct API call, no agent overhead.

localhost /chat/stream: full agent with tools. Use only when LLM needs tool access.

Data template rule: Script owns the numbers, LLM owns the words. Final output assembles data from script variables + analysis from LLM. Never let LLM output be the sole source of numbers the user sees.

API costs & rate limits → read platform reference config/context/references/sc-proxy.md
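The task-script pattern and the data template rule fit in a few lines. A minimal sketch, assuming a hypothetical `build_report` function, a 5% threshold, and an injected `summarize` callable standing in for whichever LLM option is used; the stub below is not a real model call.

```python
from typing import Callable, Optional


def build_report(symbol: str, price: float, prev_price: float,
                 summarize: Callable[[str], str], threshold_pct: float = 5.0) -> Optional[str]:
    """Return a report only when the move is noteworthy; otherwise None and no LLM call."""
    change_pct = (price - prev_price) / prev_price * 100.0
    if abs(change_pct) < threshold_pct:
        return None  # nothing noteworthy: skip the LLM, spend nothing

    # The LLM gets context and returns words only.
    prose = summarize(
        f"{symbol} moved {change_pct:+.2f}% since the previous run. "
        "Explain in one sentence, without quoting figures."
    )
    # Every number the user sees is formatted from script variables.
    header = f"{symbol}: ${price:,.2f} ({change_pct:+.2f}% vs previous run)"
    return f"{header}\n{prose}"


# Stub LLM that records how often it is called.
calls = []


def fake_llm(prompt: str) -> str:
    calls.append(prompt)
    return "Momentum picked up after a quiet session."


quiet = build_report("BTC", 101.0, 100.0, fake_llm)  # 1% move: below threshold
loud = build_report("BTC", 110.0, 100.0, fake_llm)   # 10% move: report generated
```

Passing the LLM in as a callable keeps the script testable without network access and makes the "call only when needed" branch easy to verify.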

Phase 3: DEBUG

CHECK LOGS → REPRODUCE → ISOLATE → DIAGNOSE → FIX → VERIFY → REGRESS

CHECK LOGS first — task logs, preview diagnostics, stderr. If logs reveal a clear cause, skip to FIX.

REPRODUCE only when logs are insufficient — see the failure yourself

ISOLATE which layer is broken (data? logic? LLM? output? frontend? backend?)

FIX the root cause, then VERIFY with the same repro steps. Don't just fix — fix and confirm.

Three-Strike Rule: if the same approach has failed twice, STOP before a third attempt → rethink → explain to user → try a different approach.

→ Full debug procedures: **read references/debug-handbook.md**

Quick Checklists

Kickoff: ☐ Clarified intent ☐ Proposed architecture ☐ Estimated cost ☐ User confirmed (required before Phase 2)

Build: ☐ Each component tested ☐ Numbers match source ☐ Errors handled ☐ Preview healthy (web)

Debug: ☐ Logs checked ☐ Reproduced (or skipped — logs sufficient) ☐ Isolated layer ☐ Root cause found ☐ Fix verified ☐ Regressions checked
