Browser Automation
Launch a browser, interact with any web app, and verify UI state — all driven by your AI coding agent.
Quick Start
The basic workflow is: launch → inspect → interact → verify → close. Use /verify to kick it off:
/verify that the Sign Up flow works at http://localhost:3000
Your agent handles the rest: it launches a browser, takes screenshots, reads the DOM, performs actions, and reports back.
Slash commands
Shiplight provides slash commands that guide your AI agent through common workflows:
- /verify: Open a browser and verify UI changes
- /create-tests: Walk through your app and create reusable YAML E2E test cases
- /triage: Reproduce failing tests, diagnose root causes, and fix YAML E2E tests
Type these directly in your AI coding agent (Claude Code, Cursor, etc.) to trigger the corresponding workflow.
Use Cases
Verify UI Changes
The most common use case. After implementing a feature or fixing a bug, ask your agent to check it in a real browser.
/verify I just added a dark mode toggle to the settings page.
Open http://localhost:3000/settings, click the toggle,
and verify the page switches to dark mode.
To get a video recording of the verification for review:
/verify the checkout flow works end-to-end.
Record the session so I can watch it.
Your agent records the session and gives you a video file and a Playwright trace when done.
Test with Authentication
For apps that require login, your agent logs in once and saves the session. Future sessions restore it automatically — no login flow needed.
First time (log in and save the session):
/verify http://localhost:3000/dashboard — log in with
test@example.com / password123
Subsequent sessions (the agent restores the saved session automatically):
/verify http://localhost:3000/dashboard loads correctly
Use a Persistent Chrome Profile
Instead of starting fresh every time, you can point your agent at a Chrome profile directory. Everything persists between sessions — login state, cookies, bookmarks, extensions, localStorage.
First time — create a profile and log in:
/verify https://app.example.com — use the Chrome profile
at ./my-chrome-profile. I'll log in manually.
Log in once in the browser window. The profile saves everything.
Subsequent sessions — your login is already there:
/verify https://app.example.com/dashboard —
use the Chrome profile at ./my-chrome-profile
No credentials, no storage state files, no setup: just point to the profile directory and go. This works well for:
- Apps with Google OAuth / SSO: log in once manually, reuse forever
- Complex authenticated state: multi-step onboarding, 2FA, org switching
- Extension testing: combine with path_to_extension to test extensions with real login state
Without a Chrome profile, a fresh temp profile is created and cleaned up when the session closes.
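The difference between the two modes comes down to who owns the directory. The sketch below is illustrative Python, not Shiplight's implementation: `profile_dir` is a hypothetical helper that keeps a directory you name and deletes one it created itself.

```python
import shutil
import tempfile
from contextlib import contextmanager
from pathlib import Path
from typing import Iterator, Optional


@contextmanager
def profile_dir(path: Optional[str] = None) -> Iterator[Path]:
    """Yield a profile directory; delete it afterwards only if it was temporary.

    Hypothetical helper for illustration, not part of Shiplight.
    """
    if path is not None:
        # Persistent profile: create it on first use and leave it in place.
        directory = Path(path)
        directory.mkdir(parents=True, exist_ok=True)
        yield directory
        return
    # No profile given: use a throwaway directory and clean it up on exit.
    directory = Path(tempfile.mkdtemp(prefix="chrome-profile-"))
    try:
        yield directory
    finally:
        shutil.rmtree(directory, ignore_errors=True)


with profile_dir() as temp_profile:
    print(temp_profile.is_dir())   # True while the session is open
print(temp_profile.exists())       # False once it has closed
```

Passing a path such as ./my-chrome-profile instead takes the first branch, so anything written during the session is still there the next time.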
Test Chrome Extensions
Test an unpacked Chrome extension by loading it into the browser at launch.
/verify my Chrome extension at ./my-extension works —
open https://example.com and check that the extension
injected its UI into the page.
Your agent launches Chromium in headed mode with the extension loaded. It can see and interact with any UI the extension injects into the page: content scripts, banners, sidebars, or modified DOM elements.
Combine with a persistent Chrome profile to test extensions with authenticated state:
/verify my extension at ./my-extension works on the dashboard —
use the Chrome profile at ./my-chrome-profile and open
https://app.example.com/dashboard
TIP
Screenshots and video recordings capture the page content only — the browser toolbar (with extension icons) is not included. Verify your extension works by checking its effects: DOM changes, console logs, network requests, or storage state.
Test on an Existing Browser
If you need to control a browser that's already open — with specific tabs, complex state, or extensions you've configured manually — you can attach to it instead of launching a new one.
Simpler alternative
For most cases, a persistent Chrome profile is simpler and doesn't require any setup. Use the relay approach below only when you need to interact with a browser that's already running.
This uses the Shiplight Chrome Extension as a relay between your agent and your browser. See Attach to Existing Browser for setup — you'll need to install the extension and enable the relay server. A direct CDP connection (no extension) is also supported.
Attach to my browser and check if the payment form
on the current page has any console errors.
Your agent auto-discovers tabs via the extension relay, so no URL or configuration is needed.
Test Responsive and Mobile Layouts
Test how your app looks on different screen sizes and devices.
/verify http://localhost:3000 on mobile (390x844 viewport, touch enabled) —
check that the hamburger menu appears instead of the desktop navigation bar.
Available emulation options:
| Option | Example | Description |
|---|---|---|
| viewport | 390x844, 768x1024 | Screen dimensions |
| is_mobile | - | Mobile CSS media queries, meta viewport |
| has_touch | - | Touch events |
| user_agent | Custom UA string | Override the browser user agent |
| color_scheme | dark, light | Emulated color scheme |
| locale | ja-JP, fr-FR | Browser locale |
| timezone_id | America/New_York | Emulated timezone |
| geolocation | lat/lng coordinates | Emulated GPS location |
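
As a rough illustration of how the mobile prompt above maps onto these options, here is a minimal Python sketch. The option names come from the table; `parse_viewport` and `mobile_emulation` are hypothetical helpers, and the dict shape (width/height keys, boolean flags) is an assumption rather than Shiplight's documented format.

```python
from typing import Any, Dict


def parse_viewport(spec: str) -> Dict[str, int]:
    """Turn a 'WIDTHxHEIGHT' string such as '390x844' into a width/height dict."""
    width_text, sep, height_text = spec.lower().partition("x")
    if not sep:
        raise ValueError(f"expected WIDTHxHEIGHT, got {spec!r}")
    width, height = int(width_text), int(height_text)
    if width <= 0 or height <= 0:
        raise ValueError(f"viewport dimensions must be positive, got {spec!r}")
    return {"width": width, "height": height}


def mobile_emulation(viewport: str = "390x844", **overrides: Any) -> Dict[str, Any]:
    """Bundle the mobile-related options from the table into one dict.

    Hypothetical helper: extra keyword arguments (color_scheme, locale, ...)
    are passed through unchanged.
    """
    options: Dict[str, Any] = {
        "viewport": parse_viewport(viewport),
        "is_mobile": True,
        "has_touch": True,
    }
    options.update(overrides)
    return options


print(mobile_emulation(color_scheme="dark", locale="ja-JP"))
```

In practice you describe these settings in the prompt and the agent fills them in; the sketch only shows which options a phrase like "on mobile (390x844 viewport, touch enabled)" plausibly corresponds to.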
Debug Failures
When something goes wrong, your agent can inspect console errors and network requests to diagnose issues.
/verify http://localhost:3000/dashboard — click "Load Data"
and check if there are any console errors or failed network requests.

/verify the submit button on the form page works.
Try submitting and show me any JavaScript errors and failed API calls.
Your agent checks console errors and failed network requests, and reports what it finds.
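To make that report concrete, the following Python sketch filters captured console messages and network requests down to the failures. The record types `ConsoleMessage` and `NetworkRequest` and the function `summarize_failures` are hypothetical, and treating a missing response or an HTTP status of 400 and above as "failed" is an assumption, not Shiplight's documented rule.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ConsoleMessage:
    level: str  # e.g. "log", "warning", "error"
    text: str


@dataclass
class NetworkRequest:
    method: str
    url: str
    status: Optional[int]  # None when the request never got a response


def summarize_failures(
    console: List[ConsoleMessage], requests: List[NetworkRequest]
) -> List[str]:
    """Return one line per console error and per failed network request."""
    lines = [f"console error: {m.text}" for m in console if m.level == "error"]
    for r in requests:
        if r.status is None:
            lines.append(f"request failed: {r.method} {r.url} (no response)")
        elif r.status >= 400:
            lines.append(f"request failed: {r.method} {r.url} (HTTP {r.status})")
    return lines


captured_console = [
    ConsoleMessage("log", "app ready"),
    ConsoleMessage("error", "TypeError: data is undefined"),
]
captured_requests = [
    NetworkRequest("GET", "/api/data", 500),
    NetworkRequest("GET", "/static/app.js", 200),
]
for line in summarize_failures(captured_console, captured_requests):
    print(line)
```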
Performing Actions
Your agent can interact with any element on the page:
- Click, type, select — "Click the Sign In button", "Type 'hello@example.com' in the email field", "Select 'Monthly' from the billing dropdown"
- Scroll, navigate — "Scroll down to the pricing section", "Go back to the previous page"
- Files — "Upload the file at /tmp/report.pdf"
- Keyboard — "Press Enter to submit the form", "Press Escape to close the modal"
- Tabs — "Switch to the second tab", "Close the current tab"
All browser actions are deterministic — no LLM API keys are needed for the MCP server. Your coding agent (Claude, Cursor, etc.) handles the reasoning; Shiplight just executes the actions.
For the full list of MCP tools, environment variables, and how the server is configured, see MCP Server.