cmux-browser

Browser automation for cmux webviews with snapshot-based element targeting and state verification.

- Open surfaces, navigate, and verify URLs before acting; snapshot with --interactive to get fresh element references for clicks, fills, and selections.
- Wait patterns support selectors, text content, URL changes, load states, and custom JavaScript functions with configurable timeouts.
- Recommended workflow: navigate → verify URL → wait for load state → snapshot → act → snapshot again to handle DOM changes.
- Built on WKWebView; high-level commands (click, fill, press, scroll, wait) are supported; Chrome-specific features like viewport emulation and network mocking are not available.

INSTALLATION
npx skills add https://github.com/manaflow-ai/cmux --skill cmux-browser
Run in your project or agent environment. Adjust flags if your CLI version differs.

SKILL.md

# Browser Automation with cmux

Use this skill for browser tasks inside cmux webviews.

## Core Workflow

  • Open or target a browser surface.
  • Verify navigation with get url before waiting or snapshotting.
  • Snapshot (--interactive) to get fresh element refs.
  • Act with refs (click, fill, type, select, press).
  • Wait for state changes.
  • Re-snapshot after DOM/navigation changes.
cmux --json browser open https://example.com

# use returned surface ref, for example: surface:7

cmux browser surface:7 get url

cmux browser surface:7 wait --load-state complete --timeout-ms 15000

cmux browser surface:7 snapshot --interactive

cmux browser surface:7 fill e1 "hello"

cmux --json browser surface:7 click e2 --snapshot-after

cmux browser surface:7 snapshot --interactive

## Surface Targeting

# identify current context

cmux identify --json

# open routed to a specific topology target

cmux browser open https://example.com --workspace workspace:2 --window window:1 --json


Notes:

- CLI output defaults to short refs (`surface:N`, `pane:N`, `workspace:N`, `window:N`).

- UUIDs are still accepted on input; only request UUID output when needed (`--id-format uuids|both`).

- Keep using one `surface:N` per task unless you intentionally switch.
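The short-ref convention in the notes above can be checked before a ref is handed to a command. The following is a minimal sketch, not part of cmux: `is_short_ref` is a hypothetical helper name, and it only tests the `kind:N` shape described in the notes (it does not confirm that the surface, pane, workspace, or window actually exists).

```shell
# Hypothetical helper (not part of cmux): succeed only if $1 looks like a
# short ref of the form kind:N, where kind is one of the four listed above.
is_short_ref() {
  case "$1" in
    surface:*|pane:*|workspace:*|window:*) ;;
    *) return 1 ;;
  esac
  # Everything after the first colon must be a non-empty run of digits.
  case "${1#*:}" in
    ''|*[!0-9]*) return 1 ;;
    *) return 0 ;;
  esac
}
```

A script could call `is_short_ref "$SURFACE" || exit 1` at the top to catch an unset or mistyped ref before any browser command runs.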

## Wait Support

cmux supports wait patterns similar to agent-browser:

cmux browser <surface> wait --selector "#ready" --timeout-ms 10000

cmux browser <surface> wait --text "Success" --timeout-ms 10000

cmux browser <surface> wait --url-contains "/dashboard" --timeout-ms 10000

cmux browser <surface> wait --load-state complete --timeout-ms 15000

cmux browser <surface> wait --function "document.readyState === 'complete'" --timeout-ms 10000
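When a wait times out, the most useful next fact is usually where the surface actually is. The sketch below wraps any of the wait forms above and prints the current URL on failure. `wait_or_report` is a hypothetical name, and the wrapper assumes that `cmux browser ... wait` exits non-zero on timeout, which the source does not state explicitly.

```shell
# Hypothetical wrapper (not part of cmux): run a wait and, if it fails,
# print the surface's current URL to stderr to help diagnose the failure.
# Assumption: `cmux browser <surface> wait ...` exits non-zero on timeout.
wait_or_report() (
  surface=$1
  shift
  if cmux browser "$surface" wait "$@"; then
    exit 0
  fi
  echo "wait failed on $surface ($*); current URL:" >&2
  cmux browser "$surface" get url >&2
  exit 1
)
```

For example, `wait_or_report surface:7 --url-contains "/dashboard" --timeout-ms 10000` behaves like the plain wait when it succeeds and adds the URL to the error output when it does not.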


## Common Flows

### Form Submit

cmux --json browser open https://example.com/signup

cmux browser surface:7 get url

cmux browser surface:7 wait --load-state complete --timeout-ms 15000

cmux browser surface:7 snapshot --interactive

cmux browser surface:7 fill e1 "Jane Doe"

cmux browser surface:7 fill e2 "jane@example.com"

cmux --json browser surface:7 click e3 --snapshot-after

cmux browser surface:7 wait --url-contains "/welcome" --timeout-ms 15000

cmux browser surface:7 snapshot --interactive
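For forms with more than a couple of fields, the repeated `fill` calls can be folded into one helper. This is a sketch under assumptions: `fill_fields` is a hypothetical name, the `ref=value` argument convention is invented here for illustration, and the refs must still come from a fresh `snapshot --interactive` on the same surface.

```shell
# Hypothetical helper (not part of cmux): fill several fields in one call.
# Each argument after the surface is a ref=value pair, e.g. e1="Jane Doe".
fill_fields() (
  surface=$1
  shift
  for pair in "$@"; do
    case "$pair" in
      *=*) ;;
      *) echo "expected ref=value, got: $pair" >&2; exit 2 ;;
    esac
    ref=${pair%%=*}     # text before the first '='
    value=${pair#*=}    # text after the first '='
    cmux browser "$surface" fill "$ref" "$value" || exit 1
  done
)
```

With this in place, the two fills in the flow above could be written as `fill_fields surface:7 e1="Jane Doe" e2="jane@example.com"`, followed by the same click and wait.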


### Clear an Input

cmux browser surface:7 fill e11 "" --snapshot-after --json

cmux browser surface:7 get value e11 --json
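The clear-then-verify pair above can be combined so that a script fails loudly when the field did not actually empty. This sketch uses a hypothetical helper name, `clear_input`, and assumes that `get value` without `--json` prints the field's raw value on stdout; if your cmux version prints something else, the emptiness check needs adjusting.

```shell
# Hypothetical helper (not part of cmux): clear a field, then confirm it is empty.
# Assumption: without --json, `get value` prints the field's raw value on stdout.
clear_input() (
  surface=$1
  ref=$2
  cmux browser "$surface" fill "$ref" "" || exit 1
  value=$(cmux browser "$surface" get value "$ref") || exit 1
  if [ -n "$value" ]; then
    echo "$ref still holds: $value" >&2
    exit 1
  fi
)
```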


### Stable Agent Loop (Recommended)

# navigate -> verify -> wait -> snapshot -> action -> snapshot

cmux browser surface:7 get url

cmux browser surface:7 wait --load-state complete --timeout-ms 15000

cmux browser surface:7 snapshot --interactive

cmux --json browser surface:7 click e5 --snapshot-after

cmux browser surface:7 snapshot --interactive


If `get url` is empty or `about:blank`, navigate first instead of waiting on load state.
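The empty or `about:blank` check can be made explicit in a script so the load-state wait is never issued against a surface that has not navigated. The sketch below uses two hypothetical helper names, `needs_navigation` and `check_then_wait`, and assumes `get url` prints the bare URL on stdout. It deliberately stops and reports instead of navigating, because how to navigate depends on your flow.

```shell
# Hypothetical helper (not part of cmux): succeed if the surface has no real
# page yet. Assumption: `get url` prints the bare URL on stdout. A failing
# `get url` is treated the same as an empty URL.
needs_navigation() (
  url=$(cmux browser "$1" get url 2>/dev/null) || url=""
  [ -z "$url" ] || [ "$url" = "about:blank" ]
)

# Verify the URL first; only then wait for the load state and snapshot.
# Exits with status 2 when the caller should navigate before retrying.
check_then_wait() (
  surface=$1
  if needs_navigation "$surface"; then
    echo "$surface has no page loaded; navigate before waiting" >&2
    exit 2
  fi
  cmux browser "$surface" wait --load-state complete --timeout-ms 15000 || exit 1
  cmux browser "$surface" snapshot --interactive
)
```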

## Deep-Dive References

| Reference | When to Use |
| --- | --- |
| [references/commands.md](https://github.com/manaflow-ai/cmux/blob/HEAD/skills/cmux-browser/references/commands.md) | Full browser command mapping and quick syntax |
| [references/snapshot-refs.md](https://github.com/manaflow-ai/cmux/blob/HEAD/skills/cmux-browser/references/snapshot-refs.md) | Ref lifecycle and stale-ref troubleshooting |
| [references/authentication.md](https://github.com/manaflow-ai/cmux/blob/HEAD/skills/cmux-browser/references/authentication.md) | Login/OAuth/2FA patterns and state save/load |
| [references/authentication.md#saving-authentication-state](https://github.com/manaflow-ai/cmux/blob/HEAD/skills/cmux-browser/references/authentication.md#saving-authentication-state) | Save authenticated state right after login |
| [references/session-management.md](https://github.com/manaflow-ai/cmux/blob/HEAD/skills/cmux-browser/references/session-management.md) | Multi-surface isolation and state persistence patterns |
| [references/video-recording.md](https://github.com/manaflow-ai/cmux/blob/HEAD/skills/cmux-browser/references/video-recording.md) | Current recording status and practical alternatives |
| [references/proxy-support.md](https://github.com/manaflow-ai/cmux/blob/HEAD/skills/cmux-browser/references/proxy-support.md) | Proxy behavior in WKWebView and workarounds |

## Ready-to-Use Templates

| Template | Description |
| --- | --- |
| [templates/form-automation.sh](https://github.com/manaflow-ai/cmux/blob/HEAD/skills/cmux-browser/templates/form-automation.sh) | Snapshot/ref form fill loop |
| [templates/authenticated-session.sh](https://github.com/manaflow-ai/cmux/blob/HEAD/skills/cmux-browser/templates/authenticated-session.sh) | Login once, save/load state |
| [templates/capture-workflow.sh](https://github.com/manaflow-ai/cmux/blob/HEAD/skills/cmux-browser/templates/capture-workflow.sh) | Navigate + capture snapshots/screenshots |

## Limits (WKWebView)

These commands currently return `not_supported` because they rely on Chrome/CDP-only APIs not exposed by WKWebView:

- viewport emulation

- offline emulation

- trace/screencast recording

- network route interception/mocking

- low-level raw input injection

Use supported high-level commands (`click`, `fill`, `press`, `scroll`, `wait`, `snapshot`) instead.
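Scripts ported from Chrome-based tooling may still call one of the unsupported features. One way to keep them running is to detect `not_supported` and skip that step. The sketch below is an assumption-heavy illustration: `run_if_supported` is a hypothetical name, and it assumes the literal string `not_supported` appears somewhere in the command's output, since the source does not show the exact response format.

```shell
# Hypothetical wrapper (not part of cmux): run any command and exit with
# status 3 if its output mentions not_supported.
# Assumption: the literal string "not_supported" appears in stdout or stderr.
run_if_supported() (
  output=$("$@" 2>&1) && status=0 || status=$?
  case "$output" in
    *not_supported*)
      echo "skipped (not supported on WKWebView): $*" >&2
      exit 3
      ;;
  esac
  if [ -n "$output" ]; then
    printf '%s\n' "$output"
  fi
  exit "$status"
)
```

Callers can branch on exit status 3 to fall back to one of the supported high-level commands, and treat any other non-zero status as a real failure. The wrapper merges stderr into stdout, so it suits steps where the two streams do not need to be kept apart.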

## Troubleshooting

### js_error on snapshot --interactive or eval

Some complex pages can reject or break the JavaScript used for rich snapshots and ad-hoc evaluation.

Recovery steps:

cmux browser surface:7 get url

cmux browser surface:7 get text body

cmux browser surface:7 get html body
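The recovery steps above can be folded into a fallback so that an agent loop degrades to plain text instead of stopping. This is a sketch, not cmux behavior: `snapshot_or_text` is a hypothetical name, and it assumes a `js_error` shows up either as a non-zero exit status or as the string `js_error` in the snapshot output.

```shell
# Hypothetical helper (not part of cmux): try a rich snapshot first and fall
# back to the page's plain text when the snapshot fails.
# Assumption: js_error surfaces as a non-zero exit or as "js_error" in output.
snapshot_or_text() (
  surface=$1
  if out=$(cmux browser "$surface" snapshot --interactive 2>&1); then
    case "$out" in
      *js_error*) ;;   # reported as an error despite exit status 0
      *) printf '%s\n' "$out"; exit 0 ;;
    esac
  fi
  echo "snapshot failed on $surface; falling back to plain text" >&2
  cmux browser "$surface" get url >&2
  cmux browser "$surface" get text body
)
```

The fallback output carries no element refs, so it is suitable for reading page state but not for `click` or `fill`.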
