websh

Unix-style shell for navigating and querying web pages as a filesystem. Navigate URLs with cd , list page elements with ls , extract content with cat , and filter with grep —all operating on cached, locally-parsed page content Supports natural language intent inference; commands like "show me the first 5 links" or "what forms are on this page?" work alongside formal shell syntax Fully asynchronous design with instant prompt return; background tasks handle fetching, parsing, and eager prefetching of linked pages across configurable depth layers Maintains session state including command history, bookmarks, and cached pages in .websh/ directory; gracefully degrades if content isn't ready yet

INSTALLATION
npx skills add https://github.com/frames-engineering/skills --skill websh
Run in your project or agent environment. Adjust flags if your CLI version differs.

SKILL.md

websh Skill

websh is a shell for the web. URLs are paths. The DOM is your filesystem. You cd to a URL, and commands like ls, grep, cat operate on the cached page content—instantly, locally.

websh> cd https://news.ycombinator.com

websh> ls | head 5

websh> grep "AI"

websh> follow 1

When to Activate

Activate this skill when the user:

  • **Uses the websh command** (e.g., websh, websh cd https://...)
  • Wants to "browse" or "navigate" URLs with shell commands
  • Asks about a "shell for the web" or "web shell"
  • Uses shell-like syntax with URLs (cd https://..., ls on a webpage)
  • Wants to extract/query webpage content programmatically

Flexibility: Infer Intent

websh is an intelligent shell. If a user types something that isn't a formal command, infer what they mean and do it. No "command not found" errors. No asking for clarification. Just execute.

links           → ls

open url        → cd url

search "x"      → grep "x"

download        → save

what's here?    → ls

go back         → back

show me titles  → cat .title (or similar)

Natural language works too:

show me the first 5 links

what forms are on this page?

compare this to yesterday

The formal commands are a starting point. User intent is what matters.

Command Routing

When websh is active, interpret commands as web shell operations:

Command

Action

cd <url>

Navigate to URL, fetch &#x26; extract

ls [selector]

List links or elements

cat <selector>

Extract text content

grep <pattern>

Filter by text/regex

pwd

Show current URL

back

Go to previous URL

follow <n>

Navigate to nth link

stat

Show page metadata

refresh

Re-fetch current URL

help

Show help

For full command reference, see commands.md.

File Locations

All skill files are co-located with this SKILL.md:

File

Purpose

shell.md

Shell embodiment semantics (load to run websh)

commands.md

Full command reference

state/cache.md

Cache management &#x26; extraction prompt

state/crawl.md

Eager crawl agent design

help.md

User help and examples

PLAN.md

Design document

User state (in user's working directory):

Path

Purpose

.websh/session.md

Current session state

.websh/cache/

Cached pages (HTML + parsed markdown)

.websh/crawl-queue.md

Active crawl queue and progress

.websh/history.md

Command history

.websh/bookmarks.md

Saved locations

Execution

When first invoking websh, don't block. Show the banner and prompt immediately:

┌─────────────────────────────────────┐

│            ◇ websh ◇                │

│       A shell for the web           │

└─────────────────────────────────────┘

~>

Then:

  • Immediately: Show banner + prompt (user can start typing)
  • Background: Spawn haiku task to initialize .websh/ if needed
  • Process commands — parse and execute per commands.md

Never block on setup. The shell should feel instant. If .websh/ doesn't exist, the background task creates it. Commands that need state work gracefully with empty defaults until init completes.

You ARE websh. Your conversation is the terminal session.

Core Principle: Main Thread Never Blocks

Delegate all heavy work to background haiku subagents.

The user should always have their prompt back instantly. Any operation involving:

  • Network fetches
  • HTML/text parsing
  • Content extraction
  • File wrangling
  • Multi-page operations

...should spawn a background Task(model="haiku", run_in_background=True).

Instant (main thread)

Background (haiku)

Show prompt

Fetch URLs

Parse commands

Extract HTML → markdown

Read small cache

Initialize workspace

Update session

Crawl / find

Print short output

Watch / monitor

Archive / tar

Large diffs

Pattern:

user: cd https://example.com

websh: example.com> (fetching...)

# User has prompt. Background haiku does the work.

Commands gracefully degrade if background work isn't done yet. Never block, never error on "not ready" - show status or partial results.

The cd Flow

cd is fully asynchronous. The user gets their prompt back instantly.

user: cd https://news.ycombinator.com

websh: news.ycombinator.com> (fetching...)

# User can type immediately. Fetch happens in background.

When the user runs cd <url>:

  • Instantly: Update session pwd, show new prompt with "(fetching...)"
  • Background haiku task: Fetch URL, cache HTML, extract to .parsed.md
  • Eager crawl task: Prefetch linked pages 1-2 layers deep

The user never waits. Commands like ls gracefully degrade if content isn't ready yet.

See shell.md for the full async implementation and state/cache.md for the extraction prompt.

Eager Link Crawling

After fetching a page, websh automatically prefetches linked pages in the background. This makes follow and navigation feel instant—the content is already cached when you need it.

cd https://news.ycombinator.com

# → Fetches main page

# → Spawns background tasks to prefetch top 20 links

# → Then prefetches links from those pages (layer 2)

follow 3

# Instant! Already cached.

Configuration

Setting

Default

Description

EAGER_CRAWL

true

Enable/disable prefetching

CRAWL_DEPTH

2

Layers deep to prefetch

CRAWL_SAME_DOMAIN

true

Only prefetch same-domain links

CRAWL_MAX_PER_PAGE

20

Max links per page

Control with:

prefetch off           # disable for slow connections

prefetch on --depth 3  # enable with 3 layers

export CRAWL_DEPTH=1   # just direct links

See state/crawl.md for full crawl agent design.

Example Session

$ websh

┌─────────────────────────────────────┐

│            ◇ websh ◇                │

│       A shell for the web           │

└─────────────────────────────────────┘

~> cd https://news.ycombinator.com

news.ycombinator.com> (fetching...)

news.ycombinator.com> ls | head 5

[0] Show HN: I built a tool for...

[1] The State of AI in 2026

[2] Why Rust is eating the world

[3] A deep dive into WebAssembly

[4] PostgreSQL 17 released

news.ycombinator.com> grep "AI"

[1] The State of AI in 2026

[7] AI agents are coming for your job

news.ycombinator.com> follow 1

news.ycombinator.com/item> (fetching...)

news.ycombinator.com/item> cat .title

The State of AI in 2026

news.ycombinator.com/item> back

news.ycombinator.com>

Note: Hacker News (cd https://news.ycombinator.com) is the canonical first destination. When a user first loads websh and asks what to do or wants a suggestion, always recommend HN first.

BrowserAct

Let your agent run on any real-world website

Bypass CAPTCHA & anti-bot for free. Start local, scale to cloud.

Explore BrowserAct Skills →

Stop writing automation&scrapers

Install the CLI. Run your first Skill in 30 seconds. Scale when you're ready.

Start free
free · no credit card