
Browser Automation

Launch a browser, interact with any web app, and verify UI state — all driven by your AI coding agent.

Quick Start

The basic workflow is: launch → inspect → interact → verify → close. Use /verify to kick it off:

/verify that the Sign Up flow works at http://localhost:3000

Your agent handles the rest — it launches a browser, takes screenshots, reads the DOM, performs actions, and reports back.

Slash commands

Shiplight provides slash commands that guide your AI agent through common workflows:

  • /verify — Open a browser and verify UI changes
  • /create_e2e_tests — Walk through your app and create reusable YAML test cases
  • /cloud — Sync tests, manage runs, and work with Shiplight Cloud

Type these directly in your AI coding agent (Claude Code, Cursor, etc.) to trigger the corresponding workflow.

Use Cases

Verify UI Changes

The most common use case. After implementing a feature or fixing a bug, ask your agent to check it in a real browser.

/verify I just added a dark mode toggle to the settings page.
Open http://localhost:3000/settings, click the toggle,
and verify the page switches to dark mode.

To get a video recording of the verification for review:

/verify the checkout flow works end-to-end.
Record the session so I can watch it.

Your agent records the session and gives you a video file and a Playwright trace when done.

Test with Authentication

For apps that require login, your agent logs in once and saves the session. Future sessions restore it automatically — no login flow needed.

First time, log in and save:

/verify http://localhost:3000/dashboard — log in with
test@example.com / password123

Subsequent sessions — the agent restores the saved session automatically:

/verify http://localhost:3000/dashboard loads correctly
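
If a restored session unexpectedly lands on the login page, a stale cookie is a likely cause. The sketch below checks a saved session for expired cookies. It assumes the saved state follows Playwright's storage-state JSON layout (a `cookies` list whose entries carry `name` and an `expires` Unix timestamp, with -1 for session cookies). Whether Shiplight writes exactly this layout is an assumption, and `expired_cookies` is a hypothetical helper, not part of Shiplight.

```python
import time
from typing import Optional


def expired_cookies(storage_state: dict, now: Optional[float] = None) -> list:
    """Return the names of cookies whose expiry time has already passed."""
    now = time.time() if now is None else now
    expired = []
    for cookie in storage_state.get("cookies", []):
        expires = cookie.get("expires", -1)
        # Assumption: -1 marks a session cookie with no fixed expiry.
        if expires != -1 and expires <= now:
            expired.append(cookie["name"])
    return expired


# Hypothetical sample data, not real Shiplight output.
sample_state = {
    "cookies": [
        {"name": "session_id", "expires": 2_000_000_000},
        {"name": "old_token", "expires": 1_000},
        {"name": "csrf", "expires": -1},
    ],
    "origins": [],
}
```

If the helper reports your auth cookie as expired, ask the agent to log in again and save a fresh session.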

Use a Persistent Chrome Profile

Instead of starting fresh every time, you can point your agent at a Chrome profile directory. Everything persists between sessions — login state, cookies, bookmarks, extensions, localStorage.

First time — create a profile and log in:

/verify https://app.example.com — use the Chrome profile
at ./my-chrome-profile. I'll log in manually.

Log in once in the browser window. The profile saves everything.

Subsequent sessions — your login is already there:

/verify https://app.example.com/dashboard —
use the Chrome profile at ./my-chrome-profile

No credentials, no storage state files, no setup — just point to the profile directory and go. This works great for:

  • Apps with Google OAuth / SSO — log in once manually, reuse forever
  • Complex authenticated state — multi-step onboarding, 2FA, org switching
  • Extension testing — combine with path_to_extension to test extensions with real login state

Without a Chrome profile, a fresh temp profile is created and cleaned up when the session closes.

Test Chrome Extensions

Test an unpacked Chrome extension by loading it into the browser at launch.

/verify my Chrome extension at ./my-extension works —
open https://example.com and check that the extension
injected its UI into the page.

Your agent launches Chromium in headed mode with the extension loaded. It can see and interact with any UI the extension injects into the page — content scripts, banners, sidebars, or modified DOM elements.

Combine with a persistent Chrome profile to test extensions with authenticated state:

/verify my extension at ./my-extension works on the dashboard —
use the Chrome profile at ./my-chrome-profile and open
https://app.example.com/dashboard

TIP

Screenshots and video recordings capture the page content only — the browser toolbar (with extension icons) is not included. Verify your extension works by checking its effects: DOM changes, console logs, network requests, or storage state.

Test on an Existing Browser

If you need to control a browser that's already open — with specific tabs, complex state, or extensions you've configured manually — you can attach to it instead of launching a new one.

Simpler alternative

For most cases, a persistent Chrome profile is simpler and doesn't require any setup. Use the relay approach below only when you need to interact with a browser that's already running.

This uses the Shiplight Chrome Extension as a relay between your agent and your browser. You'll need to install the extension and enable the relay server first.

Attach to my browser and check if the payment form
on the current page has any console errors.

Once the relay is set up, your agent discovers attached tabs through it automatically, so the prompt needs no URL or connection details.

See Attach to Existing Browser below for setup instructions.

Direct CDP connection

You can also connect directly via Chrome DevTools Protocol without the extension. Start Chrome with --remote-debugging-port=9222, then:

Attach to my browser at ws://localhost:9222/devtools/browser/...
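
The last part of that WebSocket URL is an ID that changes each time Chrome starts. A Chrome instance started with --remote-debugging-port serves a JSON document at /json/version whose `webSocketDebuggerUrl` field holds the full URL. The sketch below reads it; the function names are hypothetical, and the defaults assume Chrome is running locally on port 9222.

```python
import json
from urllib.request import urlopen


def ws_url_from_version(payload: str) -> str:
    """Extract the browser-level WebSocket URL from a /json/version response body."""
    info = json.loads(payload)
    return info["webSocketDebuggerUrl"]


def fetch_ws_url(port: int = 9222, host: str = "localhost") -> str:
    """Ask a running Chrome for its DevTools WebSocket URL."""
    with urlopen(f"http://{host}:{port}/json/version", timeout=5) as response:
        return ws_url_from_version(response.read().decode("utf-8"))
```

Calling `fetch_ws_url()` while Chrome is running returns the URL to paste into the prompt above.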

Test Responsive and Mobile Layouts

Test how your app looks on different screen sizes and devices.

/verify http://localhost:3000 on mobile (390x844 viewport, touch enabled) —
check that the hamburger menu appears instead of the desktop navigation bar.

Available emulation options:

| Option | Example | Description |
| --- | --- | --- |
| viewport | 390x844, 768x1024 | Screen dimensions |
| is_mobile | | Mobile CSS media queries, meta viewport |
| has_touch | | Touch events |
| user_agent | Custom UA string | Override the browser user agent |
| color_scheme | dark, light | Emulated color scheme |
| locale | ja-JP, fr-FR | Browser locale |
| timezone_id | America/New_York | Emulated timezone |
| geolocation | lat/lng coordinates | Emulated GPS location |
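
As an illustration of how these options fit together, the sketch below parses a viewport string and collects a mobile emulation profile. The option names come from the table above, but the dict layout and the `parse_viewport` helper are assumptions: this page does not show how Shiplight accepts these values internally, and in normal use you simply describe them in your prompt.

```python
def parse_viewport(spec: str) -> dict:
    """Split a 'WIDTHxHEIGHT' string such as '390x844' into integer dimensions."""
    width_text, sep, height_text = spec.lower().partition("x")
    if not sep:
        raise ValueError(f"expected WIDTHxHEIGHT, got {spec!r}")
    width, height = int(width_text), int(height_text)
    if width <= 0 or height <= 0:
        raise ValueError(f"viewport dimensions must be positive, got {spec!r}")
    return {"width": width, "height": height}


# Option names come from the table; the dict layout itself is an assumption.
mobile_emulation = {
    "viewport": parse_viewport("390x844"),
    "is_mobile": True,
    "has_touch": True,
    "color_scheme": "dark",
    "locale": "ja-JP",
    "timezone_id": "America/New_York",
}
```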

Debug Failures

When something goes wrong, your agent can inspect console errors and network requests to diagnose issues.

/verify http://localhost:3000/dashboard — click "Load Data"
and check if there are any console errors or failed network requests.

/verify the submit button on the form page works.
Try submitting and show me any JavaScript errors and failed API calls.

Your agent checks console errors and failed network requests, and reports what it finds.
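
A "failed" network request usually means one that got a 4xx or 5xx response, or no response at all. The sketch below shows that filter. The entry shape (`url` and `status` keys) is hypothetical and chosen for illustration; the actual output of the network log tool may differ.

```python
def failed_requests(entries: list) -> list:
    """Keep entries with no response or an HTTP status of 400 or above."""
    failed = []
    for entry in entries:
        status = entry.get("status")
        if status is None or status >= 400:
            failed.append(entry)
    return failed


# Hypothetical log entries; the real tool output may be shaped differently.
sample_log = [
    {"url": "http://localhost:3000/api/data", "status": 500},
    {"url": "http://localhost:3000/app.js", "status": 200},
    {"url": "http://localhost:3000/api/user", "status": None},
    {"url": "http://localhost:3000/logo.png", "status": 304},
]
```

Redirects and cached responses (3xx) are deliberately not treated as failures here.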

Performing Actions

Your agent can interact with any element on the page:

  • Click, type, select — "Click the Sign In button", "Type 'hello@example.com' in the email field", "Select 'Monthly' from the billing dropdown"
  • Scroll, navigate — "Scroll down to the pricing section", "Go back to the previous page"
  • Files — "Upload the file at /tmp/report.pdf"
  • Keyboard — "Press Enter to submit the form", "Press Escape to close the modal"
  • Tabs — "Switch to the second tab", "Close the current tab"

Browser actions run deterministically: the MCP server makes no LLM calls of its own, so it needs no LLM API keys. Your coding agent (Claude, Cursor, etc.) handles the reasoning; Shiplight just executes the actions.

Tool Reference

Session Management

| Tool | Description |
| --- | --- |
| new_session | Launch a browser with optional emulation and extensions |
| close_session | Close a session (returns video/trace paths if recording) |
| close_all | Close all sessions |
| get_session_state | Get current URL and session type |
| save_storage_state | Save cookies/localStorage for fast session restore |
| attach_to_browser | Connect to an existing browser via relay or CDP |

Page Inspection

| Tool | Description |
| --- | --- |
| navigate | Navigate to a URL |
| get_page_info | Get current page URL and title (lightweight, no screenshot) |
| inspect_page | DOM tree + Set-of-Mark screenshot (element indices for the act tool) |
| get_locators | Extract Playwright locator/xpath for elements |

Actions

| Tool | Description |
| --- | --- |
| act | Execute browser actions (click, type, scroll, etc.) using element indices from inspect_page |

Debugging

| Tool | Description |
| --- | --- |
| get_browser_console_logs | Get browser console output with filtering |
| get_browser_network_logs | Get network requests with status filtering |
| clear_logs | Clear console and network logs |

Reporting

| Tool | Description |
| --- | --- |
| generate_html_report | Generate a self-contained HTML report with screenshots, video, and checklist |
| upload_html_report | Upload an HTML report to Shiplight Cloud for sharing |
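
To see how these tools relate to the launch → inspect → interact → verify → close workflow, the sketch below builds one MCP `tools/call` request (a JSON-RPC 2.0 message) per stage. The pairing of stages to tools is one plausible reading of the tables in this section, not an official sequence. Arguments are left empty because the tool schemas are not documented on this page, so real calls would need more than this.

```python
import json
from typing import Optional


def tool_call(request_id: int, name: str, arguments: Optional[dict] = None) -> str:
    """Build an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments or {}},
    })


# One plausible tool per workflow stage; real calls need arguments not shown here.
WORKFLOW = [
    ("launch", "new_session"),
    ("inspect", "inspect_page"),
    ("interact", "act"),
    ("verify", "get_browser_console_logs"),
    ("close", "close_session"),
]

requests = [tool_call(i, tool) for i, (_, tool) in enumerate(WORKFLOW, start=1)]
```

In practice your coding agent sends these requests for you; the sketch only shows what travels between the agent and the MCP server.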

Configuration

All configuration is done through environment variables. Set them in your project's .env file (auto-discovered by the MCP server on startup), in the MCP server config's env block, or export them in your shell.

Environment Variables

| Variable | Required | Description | Default |
| --- | --- | --- | --- |
| SHIPLIGHT_API_TOKEN | For cloud features | Shiplight API token. Enables cloud sync tools. Get your token from app.shiplight.ai/settings/api-tokens. | |

Attach to Existing Browser (Chrome Extension)

By default, new_session launches a fresh Chromium browser. If you want your coding agent to interact with a browser you already have open — with existing login state, cookies, tabs, etc. — use the Shiplight Chrome extension.

Setup

1. Install the Chrome extension

The extension is bundled with the MCP server. To get the path:

npx @shiplightai/mcp --chrome-extension-path

Then load it in Chrome:

  • Open chrome://extensions
  • Enable Developer mode (toggle in top right)
  • Click Load unpacked and select the path from above

2. Enable the relay server

Add SHIPLIGHT_RELAY_PORT to your project's .env:

SHIPLIGHT_RELAY_PORT=15170

Reconnect the MCP server (e.g. /mcp in Claude Code) — the relay server starts automatically on the specified port.

3. Configure the extension

  • Right-click the Shiplight extension icon → Options
  • Set Relay Server Port to 15170 (must match your .env)
  • Click Save Settings

4. Attach a tab

Navigate to the page you want to control, then click the Shiplight extension icon. The badge shows ON (orange) when the tab is attached and connected.

Badge indicators:

  • ON (orange) — tab attached and connected
  • (yellow) — reconnecting
  • ! (red) — error, check extension options
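
If the badge shows an error, two things worth checking are that the port in your .env is a valid number and that something is listening on it. The sketch below does both. The helper names are hypothetical, the parser handles only simple KEY=VALUE lines, and it assumes the relay listens on the local loopback address, which this page does not confirm.

```python
import socket
from typing import Optional


def relay_port_from_env(env_text: str) -> Optional[int]:
    """Read SHIPLIGHT_RELAY_PORT from .env-style text; return None if unset or invalid."""
    for line in env_text.splitlines():
        line = line.strip()
        if line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        if key.strip() == "SHIPLIGHT_RELAY_PORT":
            value = value.strip().strip("\"'")
            if value.isdecimal() and 1 <= int(value) <= 65535:
                return int(value)
            return None
    return None


def port_is_listening(port: int, host: str = "127.0.0.1", timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, read your .env file, pass its text to `relay_port_from_env`, and call `port_is_listening` on the result after reconnecting the MCP server.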

Released under the MIT License.