agent-browser (agent-browser.dev) is a command-line tool for headless browser automation designed with AI agents in mind. The project combines a fast Rust CLI with a Node.js daemon that runs Playwright, providing a deterministic, agent-friendly workflow for navigation, interaction, and site inspection.
What it is
agent-browser exposes a wide set of browser controls as simple CLI commands: navigation (open), element interaction (click, fill, type), state checks (is visible, is checked), page capture (screenshot, pdf), and an accessibility-driven snapshot that produces refs for deterministic element selection. The repo is licensed under Apache-2.0 and currently shows 3.2k stars and 116 forks on GitHub.
Installation and platforms
Install with npm install -g agent-browser, then run agent-browser install to download Chromium. Source builds are supported and require pnpm plus a Rust toolchain for the native binary. On Linux, a --with-deps option installs system dependencies; alternatively, Playwright's npx playwright install-deps chromium can be used.
Supported platforms provide a native Rust binary with a Node.js fallback:
- macOS ARM64 / x64: Native Rust (fallback: Node.js)
- Linux ARM64 / x64: Native Rust (fallback: Node.js)
- Windows x64: Native Rust (fallback: Node.js)
Core commands and workflows
Commands are designed for concise scripting and agent consumption. Common commands include:
- agent-browser open <url> (aliases: goto, navigate)
- agent-browser click <sel>, fill <sel> <text>, type <sel> <text>
- agent-browser screenshot [path] (--full for full page)
- agent-browser snapshot: returns an accessibility tree and refs (e.g., @e1)
The snapshot + refs workflow is emphasized as deterministic, fast, and AI-friendly: an agent obtains a snapshot, selects refs, then performs actions by referencing those refs (e.g., agent-browser click @e2).
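The "act on a ref" step can be sketched in Python as a small helper that builds the command line for an action on a ref. This is an illustrative sketch rather than anything shipped with agent-browser: the helper name ref_command is hypothetical, and the assumption that a ref is always @e followed by digits is inferred only from the @e1 and @e2 examples above.

```python
import re

# Assumption: refs look like "@e" followed by digits, as in the @e1/@e2 examples.
REF_PATTERN = re.compile(r"@e\d+")


def ref_command(action, ref, *args):
    """Build the argv for an action on a snapshot ref (hypothetical helper)."""
    if not REF_PATTERN.fullmatch(ref):
        raise ValueError(f"not a snapshot ref: {ref!r}")
    return ["agent-browser", action, ref, *args]
```

An agent would first run agent-browser snapshot, choose a ref from the output, and then pass a list such as ref_command("click", "@e2") to subprocess.run. Rejecting anything that is not a ref keeps a stray CSS selector from slipping into a flow that is meant to be ref-only.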
Semantic locators and selectors
Beyond refs, agent-browser supports traditional CSS selectors, XPath/text selectors, and semantic locators (ARIA role, labels, placeholders, alt/title/testid). Examples of semantic commands:
- agent-browser find role button click --name "Submit"
- agent-browser find label "Email" fill "test@test.com"
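For scripts that assemble these commands programmatically, the two example shapes above can be captured in one small builder. The following Python sketch uses a hypothetical function, find_command, and assumes that other locator kinds follow the same argument order as the role and label examples, which the text above does not confirm.

```python
def find_command(locator, value, action, *action_args, name=None):
    """Build argv for `agent-browser find <locator> <value> <action> ...`.

    Mirrors the role and label examples; treating every locator kind the
    same way is an assumption.
    """
    argv = ["agent-browser", "find", locator, value, action, *action_args]
    if name is not None:
        # --name appears in the role example to narrow by accessible name.
        argv += ["--name", name]
    return argv
```

Passing the resulting list straight to subprocess.run avoids shell quoting problems with values such as "test@test.com".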
Agent integration and machine-readable output
A --json option provides machine-readable output for integration with LLM-based agents:
- agent-browser snapshot --json returns a structured snapshot and refs
- agent-browser get text @e1 --json and other commands support JSON output
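A minimal Python sketch of consuming the --json output from an agent process follows. The helper run_json is hypothetical, and the structure of the decoded object is not described above, so the sketch only decodes stdout and leaves interpretation to the caller.

```python
import json
import subprocess


def run_json(args, runner=subprocess.run):
    """Run `agent-browser <args> --json` and decode stdout as JSON.

    `runner` is injectable so the helper can be exercised without a browser.
    The shape of the decoded object is not assumed here; inspect it first.
    """
    argv = ["agent-browser", *args, "--json"]
    result = runner(argv, capture_output=True, text=True, check=True)
    return json.loads(result.stdout)
```

With agent-browser installed, run_json(["snapshot"]) or run_json(["get", "text", "@e1"]) would return the parsed payload. Because check=True is set, a non-zero exit raises subprocess.CalledProcessError instead of being parsed as if it were valid output.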
The repository includes a skills/agent-browser skill and instructions for integrating with Claude Code via a provided skill file.
Sessions, authentication, and network control
agent-browser supports isolated sessions (--session or AGENT_BROWSER_SESSION), where each session maintains its own browser instance, cookies, storage, history, and auth state. Scoped headers can be set per origin to enable authenticated sessions without UI logins:
agent-browser open api.example.com --headers '{"Authorization":"Bearer <token>"}'
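When the token comes from a variable instead of being typed by hand, building the header JSON in code avoids quoting mistakes. The Python sketch below uses a hypothetical helper, authed_open, and selects the session through the AGENT_BROWSER_SESSION environment variable mentioned above; the session name and token values are placeholders.

```python
import json
import os


def authed_open(url, token, session):
    """Return (argv, env) for opening `url` with a bearer token in a named session."""
    headers = json.dumps({"Authorization": f"Bearer {token}"})
    argv = ["agent-browser", "open", url, "--headers", headers]
    # Select the isolated session via the environment instead of a flag.
    env = {**os.environ, "AGENT_BROWSER_SESSION": session}
    return argv, env
```

The pair can be passed to subprocess.run(argv, env=env, check=True). Since argv is a list, no shell is involved and the JSON string reaches the CLI unchanged.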
Network control includes request routing and mocking:
- agent-browser network route <url> --abort
- agent-browser network route <url> --body <json>
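The two routing forms can be wrapped in one builder that either blocks a request or mocks its response body. In this Python sketch the function route_command is hypothetical, and treating --abort and --body as mutually exclusive is an assumption of the sketch, not a documented rule of the CLI.

```python
import json


def route_command(url, *, abort=False, body=None):
    """Build argv for `agent-browser network route`, either blocking or mocking."""
    # Assumption: exactly one of the two behaviours is requested per route.
    if abort == (body is not None):
        raise ValueError("choose exactly one of abort=True or body=...")
    argv = ["agent-browser", "network", "route", url]
    if abort:
        argv.append("--abort")
    else:
        # Serialize the mock payload so the CLI receives valid JSON.
        argv += ["--body", json.dumps(body)]
    return argv
```

For example, route_command("https://example.com/ads.js", abort=True) yields the blocking form, and route_command("https://example.com/api", body={"items": []}) yields the mocking form; both URLs are placeholders.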
Advanced usage: CDP, custom executables, and headed mode
- CDP mode (--cdp <port>) connects to an existing Chrome DevTools Protocol endpoint (Electron, remote Chrome, WebView2).
- Custom browser executables can be supplied via --executable-path or AGENT_BROWSER_EXECUTABLE_PATH, useful for serverless deployments with lightweight Chromium builds.
- --headed shows the browser window for debugging.
Architecture and runtime behavior
agent-browser follows a client-daemon architecture:
- Rust CLI parses commands and communicates with the daemon.
- Node.js daemon manages the Playwright browser instance.
- Fallback: If the native binary is not available, a Node.js-only path runs the CLI logic.
Chromium is used by default, and Playwright-driven support exists for Firefox and WebKit.
Quick start examples
A minimal interaction flow shown in the docs:
- agent-browser open example.com
- agent-browser snapshot -i (interactive elements only)
- agent-browser click @e2
- agent-browser fill @e3 "test@example.com"
- agent-browser screenshot page.png
- agent-browser close
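The same flow can be driven from a script. The Python sketch below lists the quick-start commands as argument lists and runs them in order; quick_start and run_steps are hypothetical names. The refs @e2 and @e3 are copied from the example, but in practice they depend on the page and should be chosen from the snapshot output.

```python
import subprocess


def quick_start(url, email, screenshot_path):
    """Return the quick-start commands in order, one argv list per step."""
    return [
        ["agent-browser", "open", url],
        ["agent-browser", "snapshot", "-i"],  # interactive elements only
        ["agent-browser", "click", "@e2"],    # ref taken from the example
        ["agent-browser", "fill", "@e3", email],
        ["agent-browser", "screenshot", screenshot_path],
        ["agent-browser", "close"],
    ]


def run_steps(steps, runner=subprocess.run):
    """Run steps in order; check=True makes the first failure raise and stop."""
    for argv in steps:
        runner(argv, check=True)
```

Calling run_steps(quick_start("example.com", "test@example.com", "page.png")) reproduces the sequence above when agent-browser is on the PATH. Passing a different runner allows a dry run that only records the commands.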
Repository notes
The project is primarily written in TypeScript and Rust (approximately 58% TypeScript, 39% Rust), and includes sample skills, docs, and a suite of commands and options for debugging, tracing, and state management.
Original source: https://github.com/vercel-labs/agent-browser



