# Browser Automation

<div class="view-markdown-wrapper">
<ViewMarkdown />
</div>

Launch a browser, interact with any web app, and verify UI state — all driven by your AI coding agent.

## Quick Start

The basic workflow is: **launch → inspect → interact → verify → close**. Use `/verify` to kick it off:

```
/verify that the Sign Up flow works at http://localhost:3000
```

Your agent handles the rest — it launches a browser, takes screenshots, reads the DOM, performs actions, and reports back.
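
To make the five stages concrete, here is a rough sketch that lines each one up with a tool from the Tool Reference on this page. The stage-to-tool pairing is an assumption made for illustration (your agent picks tools as it goes and may use others). The snippet only prints the plan; it does not call the MCP server.

```python
# Illustrative only: the tool names come from the Tool Reference on this page,
# but which tool serves which stage is an assumption, not documented behavior.
WORKFLOW = [
    ("launch", "new_session"),
    ("inspect", "inspect_page"),
    ("interact", "act"),
    ("verify", "inspect_page"),  # assumed: re-inspect the page after acting
    ("close", "close_session"),
]


def describe_workflow(steps):
    """Return one readable line per stage, in execution order."""
    return [f"{i}. {stage}: {tool}" for i, (stage, tool) in enumerate(steps, start=1)]


if __name__ == "__main__":
    print("\n".join(describe_workflow(WORKFLOW)))
```

In practice you never write this sequence yourself; the `/verify` prompt is enough.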

::: tip Slash commands
Shiplight provides slash commands that guide your AI agent through common workflows:

- **`/verify`** — Open a browser and verify UI changes
- **`/create-tests`** — Walk through your app and create reusable YAML test cases
- **`/triage`** — Reproduce failing tests, diagnose root causes, and fix YAML tests
- **`/cloud`** — Sync tests, manage runs, and work with Shiplight Cloud

Type these directly in your AI coding agent (Claude Code, Cursor, etc.) to trigger the corresponding workflow.
:::

## Use Cases

### Verify UI Changes

The most common use case. After implementing a feature or fixing a bug, ask your agent to check it in a real browser.

```
/verify I just added a dark mode toggle to the settings page.
Open http://localhost:3000/settings, click the toggle,
and verify the page switches to dark mode.
```

To get a **video recording** of the verification for review:

```
/verify the checkout flow works end-to-end.
Record the session so I can watch it.
```

Your agent records the session and gives you a video file and a Playwright trace when done.

### Test with Authentication

For apps that require login, your agent logs in once and saves the session. Future sessions restore it automatically — no login flow needed.

**First time — login and save:**

```
/verify http://localhost:3000/dashboard — log in with
test@example.com / password123
```

**Subsequent sessions — the agent restores the saved session automatically:**

```
/verify http://localhost:3000/dashboard loads correctly
```
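
If you are curious what a saved session contains, the sketch below lists the cookies in a storage state that have not expired yet. It assumes the saved file follows Playwright's `storageState` JSON layout (a top-level `cookies` list whose entries have `name` and `expires` fields). This page does not specify the format, so treat that as an assumption. The sample data is made up.

```python
import time


def live_cookie_names(state, now=None):
    """Names of cookies that have not expired yet.

    Assumes Playwright's storageState layout: "expires" is a Unix timestamp
    in seconds, or -1 for a session cookie (treated as still valid here).
    """
    now = time.time() if now is None else now
    return [
        cookie["name"]
        for cookie in state.get("cookies", [])
        if cookie.get("expires", -1) == -1 or cookie["expires"] > now
    ]


if __name__ == "__main__":
    # Made-up sample; a real file would be loaded with json.load().
    sample = {
        "cookies": [
            {"name": "session_id", "expires": -1},
            {"name": "stale_token", "expires": 100},
        ],
        "origins": [],
    }
    print(live_cookie_names(sample))
```

If the list comes back empty, the saved session has likely expired and the agent will need to log in again.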

### Use a Persistent Chrome Profile

Instead of starting fresh every time, you can point your agent at a Chrome profile directory. Everything persists between sessions — login state, cookies, bookmarks, extensions, localStorage.

**First time — create a profile and log in:**

```
/verify https://app.example.com — use the Chrome profile
at ./my-chrome-profile. I'll log in manually.
```

Log in once in the browser window. The profile saves everything.

**Subsequent sessions — your login is already there:**

```
/verify https://app.example.com/dashboard —
use the Chrome profile at ./my-chrome-profile
```

No credentials, no storage state files, no setup — just point to the profile directory and go. This works great for:

- **Apps with Google OAuth / SSO** — log in once manually and reuse the session until it expires
- **Complex authenticated state** — multi-step onboarding, 2FA, org switching
- **Extension testing** — combine with `path_to_extension` to test extensions with real login state

Without a Chrome profile, a fresh temporary profile is created and cleaned up when the session closes.

### Test Chrome Extensions

Test an unpacked Chrome extension by loading it into the browser at launch.

```
/verify my Chrome extension at ./my-extension works —
open https://example.com and check that the extension
injected its UI into the page.
```

Your agent launches Chromium in headed mode with the extension loaded. It can see and interact with any UI the extension injects into the page — content scripts, banners, sidebars, or modified DOM elements.

Combine with a persistent Chrome profile to test extensions with authenticated state:

```
/verify my extension at ./my-extension works on the dashboard —
use the Chrome profile at ./my-chrome-profile and open
https://app.example.com/dashboard
```

::: tip
Screenshots and video recordings capture the page content only — the browser toolbar (with extension icons) is not included. Verify your extension works by checking its effects: DOM changes, console logs, network requests, or storage state.
:::

### Test on an Existing Browser

If you need to control a browser that's already open — with specific tabs, complex state, or extensions you've configured manually — you can attach to it instead of launching a new one.

::: tip Simpler alternative
For most cases, a [persistent Chrome profile](#use-a-persistent-chrome-profile) is simpler and doesn't require any setup. Use the relay approach below only when you need to interact with a browser that's already running.
:::

This uses the **Shiplight Chrome Extension** as a relay between your agent and your browser. See [setup instructions](#attach-to-existing-browser-chrome-extension) below — you'll need to install the extension and enable the relay server.

```
Attach to my browser and check if the payment form
on the current page has any console errors.
```

Your agent auto-discovers tabs via the extension relay, so the prompt needs no URL or connection details.

::: tip Direct CDP connection
You can also connect directly via Chrome DevTools Protocol without the extension. Start Chrome with `--remote-debugging-port=9222`, then:

```
Attach to my browser at ws://localhost:9222/devtools/browser/...
```

:::
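
The last part of the `ws://…/devtools/browser/...` address is generated each time Chrome starts. A Chrome instance launched with `--remote-debugging-port` reports the full address in the `webSocketDebuggerUrl` field of its `/json/version` HTTP endpoint. The sketch below reads it from there; the function names are illustrative, and the host and port defaults assume the `9222` example above.

```python
import json
from urllib.request import urlopen


def browser_ws_url(version_json):
    """Pull the browser-level WebSocket URL out of a /json/version payload."""
    return json.loads(version_json)["webSocketDebuggerUrl"]


def fetch_browser_ws_url(port=9222, host="localhost"):
    """Ask a Chrome started with --remote-debugging-port for its endpoint."""
    with urlopen(f"http://{host}:{port}/json/version", timeout=5) as resp:
        return browser_ws_url(resp.read().decode("utf-8"))
```

With Chrome running, `fetch_browser_ws_url()` returns the address to paste into the attach prompt.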

### Test Responsive and Mobile Layouts

Test how your app looks on different screen sizes and devices.

```
/verify http://localhost:3000 on mobile (390x844 viewport, touch enabled) —
check that the hamburger menu appears instead of the desktop navigation bar.
```

Available emulation options:

| Option         | Example               | Description                             |
| -------------- | --------------------- | --------------------------------------- |
| `viewport`     | `390x844`, `768x1024` | Viewport size (width x height)          |
| `is_mobile`    | —                     | Mobile CSS media queries, meta viewport |
| `has_touch`    | —                     | Touch events                            |
| `user_agent`   | Custom UA string      | Override the browser user agent         |
| `color_scheme` | `dark`, `light`       | Emulated color scheme                   |
| `locale`       | `ja-JP`, `fr-FR`      | Browser locale                          |
| `timezone_id`  | `America/New_York`    | Emulated timezone                       |
| `geolocation`  | lat/lng coordinates   | Emulated GPS location                   |
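
As a sketch of how these options fit together, the snippet below parses a `390x844`-style viewport string and bundles it with a few other options from the table. Only the option names come from this page. The dictionary shape and the `parse_viewport` helper are hypothetical, and your agent may pass these values in a different form.

```python
def parse_viewport(text):
    """Turn a 'WIDTHxHEIGHT' string such as '390x844' into integer dimensions."""
    width, sep, height = text.lower().partition("x")
    if not sep or not width.strip().isdigit() or not height.strip().isdigit():
        raise ValueError(f"expected WIDTHxHEIGHT, got {text!r}")
    return {"width": int(width), "height": int(height)}


# Hypothetical argument shape; only the keys are taken from the table above.
MOBILE_PROFILE = {
    "viewport": parse_viewport("390x844"),
    "is_mobile": True,
    "has_touch": True,
    "color_scheme": "dark",
    "locale": "ja-JP",
    "timezone_id": "America/New_York",
}
```

You would normally just describe the device in the prompt, as in the example above the table.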

### Debug Failures

When something goes wrong, your agent can inspect console errors and network requests to diagnose issues.

```
/verify http://localhost:3000/dashboard — click "Load Data"
and check if there are any console errors or failed network requests.
```

```
/verify the submit button on the form page works.
Try submitting and show me any JavaScript errors and failed API calls.
```

Your agent checks console errors and failed network requests, and reports what it finds.

## Performing Actions

Your agent can interact with any element on the page:

- **Click, type, select** — "Click the Sign In button", "Type 'hello@example.com' in the email field", "Select 'Monthly' from the billing dropdown"
- **Scroll, navigate** — "Scroll down to the pricing section", "Go back to the previous page"
- **Files** — "Upload the file at /tmp/report.pdf"
- **Keyboard** — "Press Enter to submit the form", "Press Escape to close the modal"
- **Tabs** — "Switch to the second tab", "Close the current tab"

All browser actions are deterministic — no LLM API keys are needed for the MCP server. Your coding agent (Claude, Cursor, etc.) handles the reasoning; Shiplight just executes the actions.

## Tool Reference

### Session Management

| Tool                 | Description                                              |
| -------------------- | -------------------------------------------------------- |
| `new_session`        | Launch a browser with optional emulation and extensions  |
| `close_session`      | Close a session (returns video/trace paths if recording) |
| `close_all`          | Close all sessions                                       |
| `get_session_state`  | Get current URL and session type                         |
| `save_storage_state` | Save cookies/localStorage for fast session restore       |
| `attach_to_browser`  | Connect to an existing browser via relay or CDP          |

### Page Inspection

| Tool            | Description                                                            |
| --------------- | ---------------------------------------------------------------------- |
| `navigate`      | Navigate to a URL                                                      |
| `get_page_info` | Get current page URL and title (lightweight, no screenshot)            |
| `inspect_page`  | DOM tree + Set-of-Mark screenshot (element indices for the `act` tool) |
| `get_locators`  | Extract Playwright locators or XPath expressions for elements         |

### Actions

| Tool  | Description                                                                                   |
| ----- | --------------------------------------------------------------------------------------------- |
| `act` | Execute browser actions (click, type, scroll, etc.) using element indices from `inspect_page` |

### Debugging

| Tool                       | Description                                |
| -------------------------- | ------------------------------------------ |
| `get_browser_console_logs` | Get browser console output with filtering  |
| `get_browser_network_logs` | Get network requests with status filtering |
| `clear_logs`               | Clear console and network logs             |

### Reporting

| Tool                   | Description                                                                  |
| ---------------------- | ---------------------------------------------------------------------------- |
| `generate_html_report` | Generate a self-contained HTML report with screenshots, video, and checklist |
| `upload_html_report`   | Upload an HTML report to Shiplight Cloud for sharing                         |

## Configuration

All configuration is done through environment variables. Set them in your project's `.env` file (auto-discovered by the MCP server on startup), in the MCP server config's `env` block, or export them in your shell.

### Environment Variables

| Variable              | Required           | Description                                                                                                                                               | Default |
| --------------------- | ------------------ | --------------------------------------------------------------------------------------------------------------------------------------------------------- | ------- |
| `SHIPLIGHT_API_TOKEN` | For cloud features | Shiplight API token — enables cloud sync tools. Get your token from [app.shiplight.ai/settings/api-tokens](https://app.shiplight.ai/settings/api-tokens). | —       |
| `SHIPLIGHT_RELAY_PORT` | For browser attach | Port for the extension relay server. Setting it enables the relay used to [attach to an existing browser](#attach-to-existing-browser-chrome-extension).   | —       |

### Attach to Existing Browser (Chrome Extension)

By default, `new_session` launches a fresh Chromium browser. If you want your coding agent to interact with a browser you already have open — with existing login state, cookies, tabs, etc. — use the Shiplight Chrome extension.

#### Setup

**1. Install the Chrome extension**

The extension is bundled with the MCP server. To get the path:

```bash
npx @shiplightai/mcp --chrome-extension-path
```

Then load it in Chrome:

- Open `chrome://extensions`
- Enable **Developer mode** (toggle in top right)
- Click **Load unpacked** and select the path from above

**2. Enable the relay server**

Add `SHIPLIGHT_RELAY_PORT` to your project's `.env`:

```
SHIPLIGHT_RELAY_PORT=15170
```

Reconnect the MCP server (e.g. `/mcp` in Claude Code) — the relay server starts automatically on the specified port.
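
A quick way to sanity-check this step is to read the port from `.env` and see whether anything accepts TCP connections on it. This is a minimal sketch: it assumes the relay listens on localhost, the helper names are illustrative, and the `.env` parsing handles only plain `KEY=VALUE` lines.

```python
import socket


def read_relay_port(env_text):
    """Find SHIPLIGHT_RELAY_PORT in .env-style text; None if it is absent."""
    for line in env_text.splitlines():
        key, sep, value = line.strip().partition("=")
        if sep and key.strip() == "SHIPLIGHT_RELAY_PORT":
            return int(value.strip().strip("'\""))
    return None


def is_listening(port, host="localhost", timeout=1.0):
    """True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, pass the contents of your `.env` to `read_relay_port`, then hand the resulting port to `is_listening` once the MCP server has reconnected.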

**3. Configure the extension**

- Right-click the Shiplight extension icon → **Options**
- Set **Relay Server Port** to `15170` (must match your `.env`)
- Click **Save Settings**

**4. Attach a tab**

Navigate to the page you want to control, then click the Shiplight extension icon. The badge shows **ON** (orange) when the tab is attached and connected.

Badge indicators:

- **ON** (orange) — tab attached and connected
- **…** (yellow) — reconnecting
- **!** (red) — error, check extension options
