Browser Automation
Launch a browser, interact with any web app, and verify UI state — all driven by your AI coding agent.
What Your Agent Can Do
- Open a browser and navigate to your app
- Inspect page state via screenshots and DOM structure
- Interact with elements: click, type, select, scroll, upload files
- Assert UI state with natural language (e.g., "the login form should be visible")
- Capture console errors and network requests
Example prompt:
Add a "Forgot Password?" link below the login form. After implementing,
use Shiplight to verify your implementation in the browser.
Session Tools
| Tool | Description |
|---|---|
| new_session | Create a browser session with optional device emulation and auto-login |
| close_session | Close a browser session |
| close_all | Close all browser sessions |
| get_session_state | Get current URL and session info |
| save_storage_state | Save cookies/localStorage for fast session restore |
Page Inspection
| Tool | Description |
|---|---|
| navigate | Navigate to a URL |
| get_page_info | Get current page URL and title |
| get_dom | DOM tree with interactive element indices |
| take_screenshot | Set-of-Mark screenshot matching element indices |
| get_locator | Extract Playwright locator/xpath for an element |
Performing Actions
Shiplight can interact with any element on the page using natural language. Examples:
- "Click the Sign In button"
- "Type 'hello@example.com' in the email field"
- "Select 'Monthly' from the billing dropdown"
- "Upload the file at /tmp/report.pdf"
- "Scroll down to the pricing section"
- "Press Enter to submit the form"
- "Go back to the previous page"
AI-Powered Assertions & Extraction
Shiplight uses a secondary AI model to reason about the page for verification and data extraction. Examples:
- "Verify the error message is not visible"
- "Check that the user's name appears in the top right corner"
- "Assert the form submission was successful"
- "Extract the order total into a variable"
- "Wait until the loading spinner disappears"
TIP
Basic interactions (clicks, typing, scrolling) work without API keys. AI-powered assertions and extraction require GOOGLE_API_KEY or ANTHROPIC_API_KEY.
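The rule in the tip can be checked before a run. The following is a minimal sketch, assuming the keys are read from the process environment. The helper name `ai_features_available` is hypothetical and is not part of Shiplight.

```python
import os
from typing import Mapping, Optional

# Environment variables named in the tip above; either one enables AI-powered tools.
AI_KEY_VARS = ("GOOGLE_API_KEY", "ANTHROPIC_API_KEY")


def ai_features_available(env: Optional[Mapping[str, str]] = None) -> bool:
    """Return True if at least one AI provider key is set and non-empty.

    Hypothetical helper for illustration only; not a Shiplight API.
    """
    env = os.environ if env is None else env
    return any(env.get(name, "").strip() for name in AI_KEY_VARS)


if __name__ == "__main__":
    if ai_features_available():
        print("AI-powered assertions and extraction can be used.")
    else:
        print("Only basic interactions (clicks, typing, scrolling) are available.")
```

Passing a plain dictionary instead of the real environment makes the check easy to exercise in isolation.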
Debugging Tools
| Tool | Description |
|---|---|
| get_browser_console_logs | Get browser console output with filtering |
| get_browser_network_logs | Get network requests with status filtering |
| clear_logs | Clear console and network logs |
Configuration
All configuration is done through environment variables in your MCP server config.
Environment Variables
| Variable | Required | Description | Default |
|---|---|---|---|
| GOOGLE_API_KEY | For AI-powered actions (one of these two) | Google AI API key | — |
| ANTHROPIC_API_KEY | For AI-powered actions (one of these two) | Anthropic API key | — |
| WEB_AGENT_MODEL | If using AI tools | AI model for the web agent | — |
| PWDEBUG | No | Set to console to enable Playwright debug logging | — |
AI Model Options
| Provider | API Key | Supported Models |
|---|---|---|
| Google | GOOGLE_API_KEY | gemini-2.5-pro, gemini-3-pro-preview |
| Anthropic | ANTHROPIC_API_KEY | claude-haiku-4-5, claude-sonnet-4-6, claude-opus-4-6 |
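The tables above imply a few consistency rules for the AI settings. The sketch below checks them. It assumes that WEB_AGENT_MODEL must be one of the models listed for a provider whose key is set, which the tables suggest but do not state outright. The function name `check_ai_config` is hypothetical.

```python
from typing import List, Mapping

# Supported models per provider key, copied from the AI Model Options table.
SUPPORTED_MODELS = {
    "GOOGLE_API_KEY": ("gemini-2.5-pro", "gemini-3-pro-preview"),
    "ANTHROPIC_API_KEY": ("claude-haiku-4-5", "claude-sonnet-4-6", "claude-opus-4-6"),
}


def check_ai_config(env: Mapping[str, str]) -> List[str]:
    """Return a list of problems; an empty list means the settings look consistent.

    Hypothetical helper for illustration only; not a Shiplight API.
    """
    # Which provider keys are present and non-empty?
    present = [key for key in SUPPORTED_MODELS if env.get(key)]
    if not present:
        return ["Set GOOGLE_API_KEY or ANTHROPIC_API_KEY to use AI-powered tools."]

    model = env.get("WEB_AGENT_MODEL")
    if not model:
        return ["WEB_AGENT_MODEL is not set."]

    # Assumption: the model should belong to a provider whose key is configured.
    if not any(model in SUPPORTED_MODELS[key] for key in present):
        return [f"{model} is not listed for the configured key(s): {', '.join(present)}."]
    return []
```

For example, `check_ai_config({"ANTHROPIC_API_KEY": "...", "WEB_AGENT_MODEL": "claude-sonnet-4-6"})` returns an empty list, while pairing a Google key with a Claude model returns one problem.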