actionbook

Pre-verified page actions and selectors for website automation without runtime discovery.

  • Search a library of documented page interactions by task intent, then retrieve structured DOM details with tested CSS selectors ready for browser commands
  • Browser commands cover navigation, form filling, clicking, text extraction, screenshots, and waiting for page changes
  • Handles login walls by pausing automation and asking the user to complete authentication manually in the same session
  • Daemon mode (Unix/CDP) maintains a persistent WebSocket connection per profile, eliminating per-command connection overhead
  • Falls back to live accessibility tree snapshots when selectors become outdated due to website changes

INSTALLATION
npx skills add https://github.com/actionbook/actionbook --skill actionbook
Run in your project or agent environment. Adjust flags if your CLI version differs.

SKILL.md


The workflow:

  • Start a browser session
  • Navigate to the target page
  • Snapshot to get the page structure with element refs
  • Automate using refs from the snapshot

Run actionbook <command> --help for full usage and examples of any command.

Browser Automation

Every browser command is stateless: pass --session and --tab explicitly. There is no "current tab", so you can run commands on any session/tab in parallel.
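Because each command carries its own --session and --tab, a thin wrapper can make those flags hard to forget. The sketch below only assembles an argument list and does not run anything; browser_argv is a hypothetical helper name, and the subcommands and flags are the ones shown in this document.

```python
from typing import List


def browser_argv(command: str, *args: str, session: str, tab: str) -> List[str]:
    """Build the argv for one stateless `actionbook browser` command.

    Hypothetical helper: it only assembles the list. Pass the result to
    subprocess.run() if you want to execute it.
    """
    if not session or not tab:
        raise ValueError("session and tab must both be given explicitly")
    return ["actionbook", "browser", command, *args,
            "--session", session, "--tab", tab]


# Two commands aimed at different tabs; nothing is shared between them.
print(browser_argv("snapshot", session="s1", tab="t1"))
print(browser_argv("click", "@e7", session="s1", tab="t2"))
```

Keeping the session and tab as required keyword arguments means a missing target fails in your own code rather than at the CLI.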

Start a session

actionbook browser start --set-session-id s1

Both --session and --set-session-id are get-or-create: they reuse a Running session with the given ID, or create one if not found. If --profile is passed and does not match the session's bound profile, the command fails with SESSION_PROFILE_MISMATCH.
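The get-or-create rule can be pictured with a toy in-memory model. This is only an illustration of the semantics described above, not how actionbook stores sessions; the get_or_create function, the exception class, and the "default" profile name are all assumptions made for the sketch.

```python
from typing import Dict, Optional


class SessionProfileMismatch(Exception):
    """Stands in for the SESSION_PROFILE_MISMATCH failure."""


def get_or_create(sessions: Dict[str, str], session_id: str,
                  profile: Optional[str] = None) -> str:
    """Toy model: `sessions` maps a Running session ID to its bound profile.

    Returns "reused" or "created". The "default" profile name is assumed.
    """
    if session_id in sessions:
        bound = sessions[session_id]
        # A profile is only checked when the caller actually passes one.
        if profile is not None and profile != bound:
            raise SessionProfileMismatch(
                f"{session_id} is bound to {bound!r}, not {profile!r}")
        return "reused"
    sessions[session_id] = profile if profile is not None else "default"
    return "created"


running: Dict[str, str] = {}
print(get_or_create(running, "s1", "work"))  # first call creates
print(get_or_create(running, "s1"))          # second call reuses
```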

Core workflow: snapshot, act, wait

actionbook browser goto <url> --session s1 --tab t1

actionbook browser snapshot --session s1 --tab t1          # Get page structure with refs

actionbook browser fill @e3 "text" --session s1 --tab t1   # Use refs from snapshot

actionbook browser click @e7 --session s1 --tab t1

actionbook browser wait navigation --session s1 --tab t1   # Wait for page load

Snapshot refs

snapshot labels every element with a ref (e.g. @e3, @e7). Use these refs as selectors in any command — they are the recommended way to target elements.

Refs are stable across snapshots — if the element stays the same, the ref stays the same. This lets you chain multiple commands without re-snapshotting after every step.

Command categories

All commands support --help for full usage and examples.

| Category | Key commands | Help |
| --- | --- | --- |
| Search | search | actionbook search --help |
| Manual | manual (alias: man) | actionbook manual --help |
| Session | start, close, restart, list-sessions, status | actionbook browser start --help |
| Tab | new-tab, close-tab, list-tabs | actionbook browser new-tab --help |
| Navigation | goto, back, forward, reload | actionbook browser goto --help |
| Observation | snapshot, text, html, value, title, url, viewport, attr, attrs, box, styles, describe, state, inspect-point, screenshot, pdf | actionbook browser snapshot --help |
| Interaction | click, fill, type, press, select, hover, focus, scroll, drag, upload, eval, mouse-move, cursor-position | actionbook browser click --help |
| Wait | wait element, wait navigation, wait network-idle, wait condition | actionbook browser wait element --help |
| Cookies | cookies list, cookies get, cookies set, cookies delete, cookies clear | actionbook browser cookies list --help |
| Storage | local-storage list\|get\|set\|delete\|clear, session-storage ... | actionbook browser local-storage get --help |
| Logs | logs console, logs errors | actionbook browser logs console --help |
| Network | network requests, network request <id>, network har start, network har stop | actionbook browser network requests --help |
| Query | query one\|all\|nth\|count | actionbook browser query --help |
| Batch | batch-new-tab, batch-snapshot, batch-click | actionbook browser batch-new-tab --help |
| Extension | extension status, extension ping, extension install, extension uninstall, extension path | actionbook extension status --help |
| Daemon | daemon restart | actionbook daemon restart --help |

Full command reference: command-reference.md

Cloud providers

Use -p / --provider with browser start to run sessions on a remote browser instead of launching local Chrome. Supported providers: driver, hyperbrowser, browseruse. Each reads its own <PROVIDER>_API_KEY from the shell env.

export HYPERBROWSER_API_KEY="your-key"

actionbook browser start -p hyperbrowser --session s1

actionbook browser goto "https://example.com" --session s1 --tab t1

actionbook browser snapshot --session s1 --tab t1

All browser commands work the same way regardless of mode. browser restart --session <id> mints a fresh remote session while preserving the session_id.

Example: End-to-End

User request: "Find a room next week in SF on Airbnb"

actionbook browser start --set-session-id s1

actionbook browser goto "https://airbnb.com" --session s1 --tab t1

actionbook browser snapshot --session s1 --tab t1

actionbook browser fill @e3 "San Francisco" --session s1 --tab t1

actionbook browser click @e7 --session s1 --tab t1

actionbook browser wait navigation --session s1 --tab t1

Eval Input Sources

browser eval accepts the expression from three mutually-exclusive sources:

  • Positional: actionbook browser eval "expr" ...
  • --file: actionbook browser eval --file script.js ...
  • Stdin: echo 'expr' | actionbook browser eval - ...
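A caller can enforce the "exactly one source" rule before invoking the CLI. The following is a minimal sketch under that assumption: eval_argv is a hypothetical helper, it only builds the argument list, and the session and tab flags elided by "..." above still need to be appended.

```python
from typing import List, Optional


def eval_argv(expr: Optional[str] = None, file: Optional[str] = None,
              stdin: bool = False) -> List[str]:
    """Build argv for `actionbook browser eval` from exactly one input source."""
    chosen = sum([expr is not None, file is not None, stdin])
    if chosen != 1:
        # Same condition the CLI reports as EVAL_ARGS_CONFLICT.
        raise ValueError("provide exactly one of expr, file, or stdin")
    base = ["actionbook", "browser", "eval"]
    if expr is not None:
        return base + [expr]            # positional expression
    if file is not None:
        return base + ["--file", file]  # expression read from a file
    return base + ["-"]                 # expression piped on stdin


print(eval_argv(expr="document.title"))
print(eval_argv(file="script.js"))
print(eval_argv(stdin=True))
```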

Eval Error Handling

browser eval returns structured error codes on failure — branch on error.code instead of parsing the message:

  • EVAL_RUNTIME_ERROR: JS exception. Inspect the expression before retrying.
  • EVAL_CROSS_ORIGIN: cross-origin fetch or CSP block. Proxy the request server-side.
  • EVAL_RESPONSE_NOT_JSON / EVAL_RESPONSE_NOT_OK: read error.details.body_head (first ≤256 chars of the response body) to distinguish 403 / challenge pages / CORS errors. Do not blindly retry.
  • EVAL_TIMEOUT: expression exceeded --timeout. Reduce work or raise the timeout.
  • EVAL_ARGS_CONFLICT: multiple input sources or none. Provide exactly one.
  • EVAL_FILE_NOT_FOUND: --file path unreadable. Verify the path.
  • EVAL_STDIN_TTY: - was given but stdin is a terminal. Pipe the expression.
  • EVAL_STDIN_EMPTY: stdin produced empty input. Verify the upstream pipeline.
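Branching on error.code might look like the sketch below. It assumes the command's JSON result has already been parsed into a dict with an error object holding code and details, as the field names above suggest; the action labels and both function names are hypothetical and are not values produced by the CLI.

```python
from typing import Any, Dict

# Next step per code, paraphrasing the list above.
# The labels on the right are made up for this sketch.
EVAL_ACTIONS = {
    "EVAL_RUNTIME_ERROR": "fix_expression",
    "EVAL_CROSS_ORIGIN": "proxy_server_side",
    "EVAL_RESPONSE_NOT_JSON": "inspect_body_head",
    "EVAL_RESPONSE_NOT_OK": "inspect_body_head",
    "EVAL_TIMEOUT": "reduce_work_or_raise_timeout",
    "EVAL_ARGS_CONFLICT": "provide_one_source",
    "EVAL_FILE_NOT_FOUND": "check_file_path",
    "EVAL_STDIN_TTY": "pipe_expression",
    "EVAL_STDIN_EMPTY": "check_upstream_pipeline",
}


def next_step_for_eval(envelope: Dict[str, Any]) -> str:
    """Map a parsed eval result to a next step without parsing the message."""
    code = (envelope.get("error") or {}).get("code")
    if code is None:
        return "ok"
    return EVAL_ACTIONS.get(code, "unknown_code")


def body_head(envelope: Dict[str, Any]) -> str:
    """Return error.details.body_head, or an empty string if it is absent."""
    details = (envelope.get("error") or {}).get("details") or {}
    return details.get("body_head", "")


failed = {"ok": False, "error": {"code": "EVAL_RESPONSE_NOT_OK",
                                 "details": {"body_head": "<html>403"}}}
print(next_step_for_eval(failed), body_head(failed))
```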

CDP Error Handling

Browser commands that interact with elements, navigate, or communicate via CDP return structured error codes — branch on error.code:

  • CDP_NODE_NOT_FOUND — DOM node is stale. Call snapshot to refresh refs then retry.
  • CDP_NOT_INTERACTABLE — element exists but can't be acted on. Scroll into view, wait for visibility, or dismiss overlays.
  • CDP_NAV_TIMEOUT — navigation timeout. Increase --timeout or verify URL reachability. Retryable.
  • CDP_TARGET_CLOSED — tab navigated away or session torn down mid-command. Start a fresh session. Retryable.
  • CDP_PROTOCOL_ERROR — CDP response malformed. Inspect details.reason and details.cdp_code.
  • CDP_GENERIC — unclassified CDP error (transport/parse). No specific remediation.

CDP_NAV_TIMEOUT and CDP_TARGET_CLOSED are retryable (error.retryable == true). All other CDP codes require caller intervention before retrying. When error.code is a CDP_* code, error.details includes reason and cdp_code when available.
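The retry rule can be expressed as a small decision function. This is a sketch that assumes the error object has been parsed into a dict with code and retryable keys as described above; cdp_decision and its return labels are hypothetical names.

```python
from typing import Any, Dict


def cdp_decision(error: Dict[str, Any]) -> str:
    """Decide what to do with a parsed `error` object from a browser command."""
    code = error.get("code", "")
    if not code.startswith("CDP_"):
        return "not_cdp"
    if error.get("retryable") is True:
        # CDP_NAV_TIMEOUT and CDP_TARGET_CLOSED are the codes documented as retryable.
        return "retry"
    if code == "CDP_NODE_NOT_FOUND":
        # Stale DOM node: refresh refs with a new snapshot first.
        return "resnapshot_then_retry"
    return "intervene"


print(cdp_decision({"code": "CDP_NAV_TIMEOUT", "retryable": True}))
print(cdp_decision({"code": "CDP_NODE_NOT_FOUND", "retryable": False}))
```

Trusting error.retryable instead of hard-coding the two retryable codes keeps the caller correct if the set of retryable codes changes.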

Selectors

Selectors should come from actionbook browser snapshot — not from prior knowledge or memory. Always snapshot first to get current refs, then use those refs to interact with the page.

Login Page Handling

When you hit a login/auth wall (sign-in page, password prompt, MFA/OTP, CAPTCHA, account chooser):

  • Pause automation and keep the current browser session open (same tab/profile/cookies).
  • Ask the user to complete login manually in that same browser window.
  • After user confirms login is done, continue in the same session.
  • If the post-login page is different, run actionbook browser snapshot to get the new page structure before continuing.

Do not switch tools just because a login page appears.

Session Cleanup

browser close is idempotent — closing an unknown or already-closed session returns ok: true with a warning in meta.warnings, not a fatal error. A typo in the session ID or a session that was already torn down is no longer an error condition.

  • Safe to call browser close unconditionally during cleanup without checking session existence first.
  • Read meta.warnings to distinguish a fresh close from an already-gone session. Do not treat a warning inside an ok: true response as a signal that the session is still alive.
  • If another close is already in flight for the same session, the command returns SESSION_CLOSING (fatal).
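Cleanup code can classify the result of browser close along these lines. The sketch assumes the parsed result is a dict with ok, meta.warnings, and error.code, and that any warning on an ok: true close means the session was unknown or already gone; the exact shape of each warning entry is not specified here, so the code only checks whether the list is empty. interpret_close and its labels are hypothetical.

```python
from typing import Any, Dict


def interpret_close(envelope: Dict[str, Any]) -> str:
    """Classify a parsed `browser close` result for cleanup logic."""
    if envelope.get("ok") is True:
        warnings = (envelope.get("meta") or {}).get("warnings") or []
        # Assumption: a warning on a successful close means nothing was left to close.
        return "already_gone" if warnings else "closed"
    code = (envelope.get("error") or {}).get("code")
    if code == "SESSION_CLOSING":
        return "close_in_flight"
    return "failed"


print(interpret_close({"ok": True, "meta": {"warnings": []}}))
print(interpret_close({"ok": False, "error": {"code": "SESSION_CLOSING"}}))
```

Neither "closed" nor "already_gone" means the session is still alive, so cleanup can proceed in both cases.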

HAR Recording

network har start accepts --max-entries N to set the ring-buffer cap (default: 10000). When har stop detects dropped entries (data.dropped > 0), the envelope includes meta.truncated = true and a HAR_TRUNCATED warning in meta.warnings. Read data.max_entries to see the configured cap. Raise --max-entries or stop recording sooner to keep the full trace.
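A truncation check on the har stop result could be sketched as follows. It assumes the result has been parsed into a dict with the data.dropped, data.max_entries, and meta.truncated fields named above; har_truncation is a hypothetical helper and the message text is made up.

```python
from typing import Any, Dict, Optional


def har_truncation(envelope: Dict[str, Any]) -> Optional[str]:
    """Return a short message if the HAR ring buffer dropped entries, else None."""
    data = envelope.get("data") or {}
    meta = envelope.get("meta") or {}
    dropped = data.get("dropped", 0)
    if dropped > 0 or meta.get("truncated") is True:
        cap = data.get("max_entries")
        return f"HAR truncated: {dropped} entries dropped (cap {cap})"
    return None


result = {"data": {"dropped": 12, "max_entries": 10000},
          "meta": {"truncated": True}}
print(har_truncation(result))
```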

References

| Reference | Description |
| --- | --- |
| command-reference.md | Complete command reference with all flags and options |
| authentication.md | Login flows, OAuth, 2FA handling, session persistence |
