# Browser Automation

<div class="view-markdown-wrapper">
<ViewMarkdown />
</div>

Launch a browser, interact with any web app, and verify UI state, all driven by your AI coding agent.

## What Your Agent Can Do

- Open a browser and navigate to your app
- Inspect page state via screenshots and DOM structure
- Interact with elements: click, type, select, scroll, upload files
- Assert UI state with natural language (e.g., "the login form should be visible")
- Capture console errors and network requests

**Example prompt:**

```
Add a "Forgot Password?" link below the login form. After implementing,
use Shiplight to verify your implementation in the browser.
```

## Session Tools

| Tool                 | Description                                                            |
| -------------------- | ---------------------------------------------------------------------- |
| `new_session`        | Create a browser session with optional device emulation and auto-login |
| `close_session`      | Close a browser session                                                |
| `close_all`          | Close all browser sessions                                             |
| `get_session_state`  | Get current URL and session info                                       |
| `save_storage_state` | Save cookies/localStorage for fast session restore                     |

## Page Inspection

| Tool              | Description                                                         |
| ----------------- | ------------------------------------------------------------------- |
| `navigate`        | Navigate to a URL                                                   |
| `get_page_info`   | Get current page URL and title                                      |
| `get_dom`         | Get the DOM tree with indices for interactive elements              |
| `take_screenshot` | Take a Set-of-Mark screenshot labeled with the same element indices |
| `get_locator`     | Get a Playwright locator or XPath for an element                    |
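
Your agent normally calls these tools for you, but it can help to see what a call looks like on the wire. The sketch below builds MCP `tools/call` requests (JSON-RPC 2.0) for a few of the tools listed above. The tool names come from the tables. The `url` argument name, the example URL, and the `tool_call` helper are assumptions made for illustration, not documented Shiplight parameters.

```python
import json
from itertools import count
from typing import Optional

# Simple incrementing request ids, as JSON-RPC requires a unique id per request.
_ids = count(1)


def tool_call(name: str, arguments: Optional[dict] = None) -> dict:
    """Build an MCP `tools/call` JSON-RPC 2.0 request for the named tool."""
    return {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments or {}},
    }


# A plausible inspection flow. The `url` argument name is an assumption;
# check the tool's input schema (via MCP `tools/list`) for the real parameters.
requests = [
    tool_call("new_session"),
    tool_call("navigate", {"url": "http://localhost:3000/login"}),
    tool_call("get_dom"),
    tool_call("take_screenshot"),
    tool_call("close_session"),
]

for request in requests:
    print(json.dumps(request))
```

An MCP client would send each request over the server's transport and read the matching response by `id`. The actual argument schemas are best discovered from the server itself rather than taken from this sketch.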

## Performing Actions

Shiplight can interact with any element on the page using natural language. Examples:

- "Click the Sign In button"
- "Type 'hello@example.com' in the email field"
- "Select 'Monthly' from the billing dropdown"
- "Upload the file at /tmp/report.pdf"
- "Scroll down to the pricing section"
- "Press Enter to submit the form"
- "Go back to the previous page"

### AI-Powered Assertions & Extraction

Shiplight uses a secondary AI model to reason about the page for verification and data extraction. Examples:

- "Verify the error message is not visible"
- "Check that the user's name appears in the top right corner"
- "Assert the form submission was successful"
- "Extract the order total into a variable"
- "Wait until the loading spinner disappears"

::: tip
Basic interactions (clicks, typing, scrolling) work without API keys. AI-powered assertions and extraction require `GOOGLE_API_KEY` or `ANTHROPIC_API_KEY`.
:::

## Debugging Tools

| Tool                       | Description                                |
| -------------------------- | ------------------------------------------ |
| `get_browser_console_logs` | Get browser console output with filtering  |
| `get_browser_network_logs` | Get network requests with status filtering |
| `clear_logs`               | Clear console and network logs             |

## Configuration

All configuration is done through environment variables in your MCP server config.

### Environment Variables

| Variable            | Required                                                    | Description                                                      | Default |
| ------------------- | ----------------------------------------------------------- | ---------------------------------------------------------------- | ------- |
| `GOOGLE_API_KEY`    | For AI-powered actions (either this or `ANTHROPIC_API_KEY`) | [Google AI API key](https://aistudio.google.com/app/apikey)      | —       |
| `ANTHROPIC_API_KEY` | For AI-powered actions (either this or `GOOGLE_API_KEY`)    | [Anthropic API key](https://console.anthropic.com/settings/keys) | —       |
| `WEB_AGENT_MODEL`   | If using AI tools                                           | AI model for the web agent (see AI Model Options)                | —       |
| `PWDEBUG`           | No                                                          | Set to `console` to enable Playwright debug logging              | —       |

### AI Model Options

| Provider  | API Key             | Supported Models                                           |
| --------- | ------------------- | ---------------------------------------------------------- |
| Google    | `GOOGLE_API_KEY`    | `gemini-2.5-pro`, `gemini-3-pro-preview`                   |
| Anthropic | `ANTHROPIC_API_KEY` | `claude-haiku-4-5`, `claude-sonnet-4-6`, `claude-opus-4-6` |
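
A mismatched model and API key is an easy mistake to make, so a quick pre-flight check can be useful. The following is a minimal sketch, not part of Shiplight: `check_ai_config` is a hypothetical helper, and it assumes that the key required is the one belonging to the provider of the model named in `WEB_AGENT_MODEL`. The variable names and model lists are copied from the tables above. It only matters if you use the AI-powered tools; basic interactions need none of these variables.

```python
import os

# Provider -> (API key variable, supported models), copied from the tables above.
PROVIDERS = {
    "Google": ("GOOGLE_API_KEY", {"gemini-2.5-pro", "gemini-3-pro-preview"}),
    "Anthropic": (
        "ANTHROPIC_API_KEY",
        {"claude-haiku-4-5", "claude-sonnet-4-6", "claude-opus-4-6"},
    ),
}


def check_ai_config(env):
    """Return a list of problems; an empty list means the AI settings look consistent."""
    model = env.get("WEB_AGENT_MODEL")
    if not model:
        return ["WEB_AGENT_MODEL is not set"]
    for provider, (key_var, models) in PROVIDERS.items():
        if model in models:
            # Assumption: the model's own provider key is the one that must be present.
            if not env.get(key_var):
                return [f"{model} needs {key_var} ({provider}), which is not set"]
            return []
    return [f"{model} is not in the supported model list"]


if __name__ == "__main__":
    for problem in check_ai_config(os.environ):
        print("config problem:", problem)
```

Run it with the same environment you pass to the MCP server. No output means the model and key are consistent under the assumption stated above.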