What is Shiplight AI?

Shiplight AI provides autonomous AI agents for QA testing. It connects to your AI coding agent (Claude Code, Cursor, Windsurf) via MCP, enabling your agent to validate UI changes in a real browser, write test cases in natural language, and automatically maintain tests as your app evolves.

How does Shiplight AI work with MCP?

Shiplight connects to MCP-compatible AI coding assistants like Claude Code, Cursor, and Windsurf. After your agent implements a feature, it can open a browser, verify changes work, and create automated test cases — all without leaving the coding workflow.

Can I write tests without code?

Yes. Shiplight supports natural language YAML test flows where you write test steps in plain English. The cloud platform also offers a no-code visual test editor with recording, drag-and-drop, and AI-powered test generation.

Does Shiplight AI support CI/CD integration?

Yes. Shiplight integrates with GitHub Actions and other CI pipelines. You can trigger test runs automatically on pull requests, schedule recurring test runs, and get AI-powered test failure summaries.

YAML Test Format

Shiplight tests are intent-driven — every step is a natural language description of what should happen. The AI reads the page and figures out the rest. For speed, steps can be enriched with action: or js: caches that replay deterministically (<1s), with automatic AI fallback when locators go stale (self-healing).

Full Spec & Examples

For the complete language specification, see the YAML Test Language Spec. For ready-to-run examples, see the examples repo.

No lock-in: YAML tests can be run directly with the Shiplight CLI (shiplightai), or transpiled to standard Playwright test files that run independently — fully compatible, no runtime dependency. You can eject at any time.

Basic Structure

yaml

goal: Description of what this test verifies
base_url: https://your-app.com

statements:
  - URL: /starting-page
  - intent: Step described in natural language
  - intent: Another step
  - VERIFY: Expected outcome

teardown:
  - intent: Clean up step

| Field | Required | Description |
| --- | --- | --- |
| `goal` | Yes | Test description (used as the Playwright test name) |
| `base_url` | No | Base URL for the app under test. Can also be set via `use: { baseURL }` in `playwright.config.ts` |
| `statements` | Yes | List of test steps |
| `teardown` | No | Steps that always run after the test (like `finally`) |
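
To illustrate how the four fields relate, here is a minimal TypeScript sketch of a parsed test file and a runner in which teardown behaves like a finally block. Only the field names come from the table above; `TestFile`, `runTest`, and the `executeStep` callback are hypothetical names for illustration, not Shiplight APIs.

```typescript
// Hypothetical model of a parsed YAML test file. Only the field names
// (goal, base_url, statements, teardown) come from the documentation.
type Statement = Record<string, unknown>;

interface TestFile {
  goal: string;            // required: used as the test name
  base_url?: string;       // optional: may be set in playwright.config.ts instead
  statements: Statement[]; // required: the test steps
  teardown?: Statement[];  // optional: always runs after the test
}

// Illustrative runner: executeStep stands in for whatever executes one step.
async function runTest(
  test: TestFile,
  executeStep: (step: Statement) => Promise<void>,
): Promise<void> {
  try {
    for (const step of test.statements) {
      await executeStep(step);
    }
  } finally {
    // Teardown runs whether the statements passed or threw.
    for (const step of test.teardown ?? []) {
      await executeStep(step);
    }
  }
}
```

In this sketch a failing step still propagates its error to the caller, but only after the teardown steps have run.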

Basic Test

Apart from the URL navigation shorthand, every step here is a natural language instruction. The AI resolves each one at runtime by looking at the page and performing the right action.

yaml

goal: Verify user can create a new project
base_url: https://app.example.com

statements:
  - URL: /projects
  - intent: Click the "New Project" button
  - intent: Enter "My Test Project" in the project name field
  - intent: Select "Public" from the visibility dropdown
  - intent: Click "Create"
  - VERIFY: Project page shows title 'My Test Project'
teardown:
  - intent: Delete the created project

Enriched Test

After exploring the UI with Shiplight MCP tools, the coding agent enriches natural language steps with action caches for deterministic, fast replay:

yaml

goal: Verify user can create a new project
base_url: https://app.example.com

statements:
  - URL: /projects
  - STEP: Create project
    statements:
      - intent: Click the New Project button
        action: click
        locator: "getByRole('button', { name: 'New Project' })"
      - intent: Enter project name
        action: input_text
        text: "My Test Project"
        locator: "getByRole('textbox', { name: 'Project name' })"
      - intent: Click Create
        action: click
        locator: "getByRole('button', { name: 'Create' })"
  - VERIFY: Project page shows title 'My Test Project'
teardown:
  - intent: Delete the created project

- ACTION statements (action: or js:, <1s each): fast deterministic replay, with automatic AI fallback if the locator fails (self-healing).
- VERIFY statements: AI-powered natural language assertions. Can include js: to speed up simple checks, with automatic fallback to AI verification.
- DRAFT statements (natural language, ~5-10s each): the AI reads the page and figures out what to do. Used for steps not yet enriched with action caches.

Locators Are a Cache

Locators are a performance cache, not a hard dependency. When the UI changes and a locator becomes stale, Shiplight's agentic layer auto-heals by falling back to the natural language description to find the right element.

However, when the UI changes permanently (e.g., a button was renamed or moved), the cached locator fails on every run, and each run pays the cost of the slower AI fallback. On Shiplight Cloud, the platform updates the cached locator after a successful self-heal, so future runs replay at full speed again without manual intervention. This self-updating behavior is a key benefit of having a Shiplight Cloud account.
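
The cache-then-heal behavior described above can be sketched roughly as follows. This is an illustration of the pattern only: `runStep`, `runCached`, `healWithAI`, and the `updateCache` flag are hypothetical names, and the assumption that healing yields a fresh locator is mine, not a documented Shiplight interface.

```typescript
// Sketch of the cache-then-heal pattern. All names are hypothetical.
interface CachedStep {
  intent: string;   // natural language description (the source of truth)
  locator?: string; // cached locator, which may be stale
}

interface StepResult {
  usedFallback: boolean;
  step: CachedStep; // possibly updated with a fresh locator
}

async function runStep(
  step: CachedStep,
  runCached: (locator: string) => Promise<void>,
  healWithAI: (intent: string) => Promise<string>, // assumed to return a working locator
  updateCache: boolean, // true models the self-updating behavior described for Shiplight Cloud
): Promise<StepResult> {
  if (step.locator !== undefined) {
    try {
      await runCached(step.locator); // fast path
      return { usedFallback: false, step };
    } catch {
      // Stale locator: fall through to the intent-based path.
    }
  }
  const healed = await healWithAI(step.intent);
  const next = updateCache ? { ...step, locator: healed } : step;
  return { usedFallback: true, step: next };
}
```

With `updateCache` set to false, a permanently changed element takes the slow path on every run; with it set to true, only the first run after the change does.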

Statement Types

| Type | Syntax | Description |
| --- | --- | --- |
| ACTION (`action:`) | `- intent: Enter email` + `action: input_text` | Fast replay with AI self-healing fallback |
| ACTION (`js:`) | `- intent: Click login` + `js: "await ..."` | Fast replay (Playwright code) with AI self-healing |
| VERIFY | `- VERIFY: page shows welcome message` | AI assertion, optional `js:` cache |
| DRAFT | `- intent: Click the login button` | AI resolves at runtime (~5-10s) |
| URL | `- URL: /path` | Navigation shorthand |
| CODE | `- CODE: await request.get(...)` | Inline Playwright code |
| STEP | `- STEP: Login` + `statements: [...]` | Group related actions |
| IF/ELSE | `- IF: cookie banner is visible` + `THEN: [...]` | Conditional execution |
| WHILE | `- WHILE: more items to load` + `DO: [...]` | Repeat until condition |
| Function | `- call: "file#export"` + `args: [...]` | Call custom TypeScript function |
| Template | `- template: ./path.yaml` | Inline reusable statement flow |
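
One way to read the table is as a set of rules keyed on which fields a statement carries. The sketch below encodes that reading in TypeScript; `classify` is a hypothetical helper, and the order of the checks (for example, treating a statement with both `intent` and `call` as a Function) is an assumption rather than documented behavior.

```typescript
// Hypothetical classifier: maps a parsed YAML statement to the type names
// used in the table. The precedence of the checks is an assumption.
type Statement = Record<string, unknown>;

type StatementType =
  | "ACTION" | "VERIFY" | "DRAFT" | "URL" | "CODE"
  | "STEP" | "IF/ELSE" | "WHILE" | "Function" | "Template";

function classify(stmt: Statement): StatementType {
  if ("VERIFY" in stmt) return "VERIFY";
  if ("URL" in stmt) return "URL";
  if ("CODE" in stmt) return "CODE";
  if ("STEP" in stmt) return "STEP";
  if ("IF" in stmt) return "IF/ELSE";
  if ("WHILE" in stmt) return "WHILE";
  if ("template" in stmt) return "Template";
  if ("call" in stmt) return "Function";
  if ("intent" in stmt) {
    // An intent with an action: or js: cache is an ACTION; without one it is a DRAFT.
    return "action" in stmt || "js" in stmt ? "ACTION" : "DRAFT";
  }
  throw new Error("Unrecognized statement: " + JSON.stringify(stmt));
}
```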

VERIFY

Asserts a condition using AI. Use the VERIFY: shorthand (unquoted key):

yaml

statements:
  - VERIFY: The success message is displayed
  - VERIFY: The order total is $49.99
    js: "await expect(page.getByTestId('order-total')).toHaveText('$49.99')"

The js: cache speeds up simple checks. If the js: assertion fails, Shiplight automatically falls back to AI verification using the natural language statement.

ACTION

Fast deterministic replay (<1s) with AI self-healing fallback. Two forms are available:

Structured action: form — the preferred format for most actions:

yaml

statements:
  - intent: Type email address
    action: input_text
    text: "user@example.com"
    locator: "getByLabel('Email')"

js: form — for complex interactions that need full Playwright code:

yaml

statements:
  - intent: Drag the card to the Done column
    js: |
      const card = page.getByText('My Task');
      const target = page.getByTestId('column-done');
      await card.dragTo(target);

In both forms, intent describes what the step should accomplish in natural language. The action: or js: field is a cache for fast replay. When the cache fails (e.g., a locator becomes stale), Shiplight's agentic layer falls back to the intent to self-heal.

STEP (grouping)

Groups related statements under a label.

yaml

statements:
  - STEP: Fill in the registration form
    statements:
      - intent: Type "John" in the first name field
      - intent: Type "Doe" in the last name field
      - intent: Type "john@example.com" in the email field

Frames

For elements inside iframes, use frame_path with action: form:

yaml

- intent: Click Hello inside iframe
  action: click
  frame_path:
    - "iframe#main"
  locator: "getByText('Hello')"
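
In Playwright terms, a frame path plausibly corresponds to a chain of `frameLocator()` calls ahead of the element locator. The sketch below builds that expression as a string; `buildTarget` is a hypothetical helper, and the one-entry-per-`frameLocator()` mapping is an assumption about how the YAML could be translated, not a statement of what Shiplight generates.

```typescript
// Assumption: each frame_path entry becomes one frameLocator() call, and the
// locator string is appended as-is. buildTarget is a hypothetical helper.
function buildTarget(framePath: string[], locator: string): string {
  const frames = framePath
    .map((selector) => `.frameLocator(${JSON.stringify(selector)})`)
    .join("");
  return `page${frames}.${locator}`;
}

const target = buildTarget(["iframe#main"], "getByText('Hello')");
// target is: page.frameLocator("iframe#main").getByText('Hello')
```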

Conditional Logic

Handle optional UI elements with IF/ELSE:

yaml

statements:
  - IF: cookie consent dialog is visible
    THEN:
      - intent: Click "Accept All"
  - IF: user is logged in
    THEN:
      - intent: Click the logout button
    ELSE:
      - intent: Click the login button
      - intent: Enter credentials and submit

Conditions are evaluated by the AI at runtime using the current page state. JavaScript conditions are also supported with the js: prefix:

yaml

- IF: "js: testContext.retryCount < 3"
  THEN:
    - intent: Click the retry button

WARNING

js: conditions have no AI fallback — if the JavaScript fails, the condition fails. Avoid brittle DOM checks (e.g., document.querySelector('.some-class')). Use js: only for simple, reliable checks like URL matching or counters. For UI state checks, prefer natural language conditions.
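
The difference between the two kinds of condition can be sketched as below. `evaluateCondition` and `askAI` are hypothetical names. In this sketch a JavaScript error simply propagates; how Shiplight reports a failing js: condition is not specified here.

```typescript
// Hypothetical evaluator showing why js: conditions have no AI fallback.
// askAI stands in for the agent; neither name is a Shiplight API.
type TestContext = Record<string, unknown>;

function evaluateCondition(
  condition: string,
  testContext: TestContext,
  askAI: (question: string) => boolean,
): boolean {
  const prefix = "js:";
  if (condition.startsWith(prefix)) {
    const expression = condition.slice(prefix.length).trim();
    // Evaluated directly: if this throws, nothing retries it with the AI.
    const fn = new Function("testContext", `return (${expression});`);
    return Boolean(fn(testContext));
  }
  // Natural language conditions are judged against the current page by the AI.
  return askAI(condition);
}
```

A counter check such as `js: testContext.retryCount < 3` depends only on the variable store, which is why it is a safer use of js: than a DOM query.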

Loops

Repeat actions until a condition is met with WHILE:

yaml

statements:
  - WHILE: the "Load More" button is visible
    DO:
      - intent: Click the "Load More" button
      - intent: Wait for new items to appear
    timeout_ms: 30000
  - VERIFY: all items are loaded

JavaScript conditions work in loops too (same js: caveat applies):

yaml

- WHILE: "js: testContext.itemCount < 10"
  DO:
    - intent: Click "Load More"
    - CODE: "testContext.itemCount = (testContext.itemCount || 0) + 1"
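
As a mental model for WHILE with timeout_ms, the loop can be pictured like this. `runWhile` is a hypothetical helper, and throwing when the timeout is exceeded is an assumption; the behavior on timeout is not described in this document.

```typescript
// Sketch of WHILE semantics with a timeout. Names are hypothetical and the
// choice to throw on timeout is an assumption, not documented behavior.
async function runWhile(
  condition: () => Promise<boolean>,
  body: () => Promise<void>,
  timeoutMs: number,
  now: () => number = Date.now, // injectable clock, handy for testing
): Promise<number> {
  const deadline = now() + timeoutMs;
  let iterations = 0;
  while (await condition()) {
    if (now() > deadline) {
      throw new Error(`WHILE loop exceeded ${timeoutMs} ms`);
    }
    await body();
    iterations += 1;
  }
  return iterations;
}
```

The counter-based example above maps onto this directly: the condition reads `testContext.itemCount` and the body increments it.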

Extensions

Custom Test Name

Override the Playwright test name (defaults to goal):

yaml

name: Login with valid credentials
goal: Verify login flow works
base_url: https://example.com
statements:
  - URL: /
  - ...

Playwright Fixtures

Pass options to test.use():

yaml

goal: Mobile French layout
base_url: https://example.com

use:
  viewport:
    width: 375
    height: 812
  locale: fr-FR
statements:
  - URL: /
  - ...

Variables

Use {{VAR_NAME}} to reference variables at runtime. Variables come from two sources: the project's pre-defined variables in playwright.config.ts, and values saved during the test run (e.g., via save_variable or Extract actions).

yaml

statements:
  - intent: Type username
    action: input_text
    text: "{{TEST_USER}}"
    locator: "getByLabel('Username')"

Define variables in playwright.config.ts:

// playwright.config.ts
import { defineConfig } from "@playwright/test";

export default defineConfig({
  projects: [
    {
      name: "default",
      use: {
        variables: {
          TEST_USER: process.env.TEST_USER || "admin",
          TEST_PASS: { value: process.env.TEST_PASS || "secret", sensitive: true },
        },
      },
    },
  ],
});

Variables marked sensitive: true are masked in logs and reports.
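
A rough picture of {{VAR_NAME}} resolution and masking, assuming variables are either plain strings or `{ value, sensitive }` objects as in the config above. `resolve` and `resolveForLog` are hypothetical helpers, and the `****` mask is an arbitrary choice for illustration.

```typescript
// Hypothetical sketch of {{VAR}} resolution and log masking. The Variable
// shape mirrors the config example; the functions are not Shiplight APIs.
type Variable = string | { value: string; sensitive?: boolean };

function lookup(vars: Record<string, Variable>, name: string): Variable | undefined {
  return Object.prototype.hasOwnProperty.call(vars, name) ? vars[name] : undefined;
}

// Replace each {{NAME}} with its value; unknown names are left untouched.
function resolve(text: string, vars: Record<string, Variable>): string {
  return text.replace(/\{\{(\w+)\}\}/g, (match: string, name: string) => {
    const v = lookup(vars, name);
    if (v === undefined) return match;
    return typeof v === "string" ? v : v.value;
  });
}

// Same substitution, but sensitive values are masked for logs and reports.
function resolveForLog(text: string, vars: Record<string, Variable>): string {
  return text.replace(/\{\{(\w+)\}\}/g, (match: string, name: string) => {
    const v = lookup(vars, name);
    if (v === undefined) return match;
    if (typeof v === "string") return v;
    return v.sensitive ? "****" : v.value;
  });
}
```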

Templates

Extract reusable flows into template files and include them with template:.

Template file (`templates/login.yaml`):

yaml

params:
  - username
  - password

statements:
  - intent: Enter username
    action: input_text
    locator: "getByLabel('Username')"
    text: "<<username>>"
  - intent: Enter password
    action: input_text
    locator: "getByLabel('Password')"
    text: "<<password>>"
  - intent: Click login
    js: "await page.getByRole('button', { name: 'Log in' }).first().click({ timeout: 5000 })"

Using the template:

yaml

goal: Purchase flow
base_url: https://example.com

statements:
  - URL: /
  - template: ../templates/login.yaml
    params:
      username: "{{TEST_USER}}"
      password: "{{TEST_PASS}}"
  - intent: Navigate to the checkout page
  - VERIFY: Order summary is displayed

Template params (<<username>>) are substituted at transpile time. Environment variables ({{TEST_USER}}) pass through to the generated code for runtime resolution.

Templates can be nested (max depth: 5) and circular references are detected.
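
The two substitution phases and the nesting checks can be sketched as follows. This is a simplified model, not the Shiplight transpiler: statements are plain strings, nested templates receive the same arguments as their parent, and `expand` and its exact depth accounting are assumptions.

```typescript
// Hypothetical sketch of template expansion: <<param>> is substituted at
// transpile time, while {{VAR}} is left in place for runtime resolution.
interface Template {
  params: string[];
  lines: string[];     // simplified: statements as plain strings
  includes?: string[]; // names of nested templates
}

const MAX_DEPTH = 5; // nesting limit quoted in the text

function expand(
  name: string,
  templates: Record<string, Template>,
  args: Record<string, string>,
  stack: string[] = [],
): string[] {
  if (stack.includes(name)) {
    throw new Error(`Circular template reference: ${[...stack, name].join(" -> ")}`);
  }
  if (stack.length >= MAX_DEPTH) {
    throw new Error(`Template nesting deeper than ${MAX_DEPTH}`);
  }
  const template = templates[name];
  if (template === undefined) throw new Error(`Unknown template: ${name}`);

  const out: string[] = [];
  for (const line of template.lines) {
    out.push(
      line.replace(/<<(\w+)>>/g, (match: string, param: string) => {
        const known =
          template.params.includes(param) &&
          Object.prototype.hasOwnProperty.call(args, param);
        return known ? args[param] : match;
      }),
    );
  }
  for (const child of template.includes ?? []) {
    out.push(...expand(child, templates, args, [...stack, name]));
  }
  return out;
}
```

Passing `"{{TEST_USER}}"` as the `username` argument yields a line that still contains `{{TEST_USER}}`, which is then resolved when the test runs.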

Custom Functions

Call TypeScript functions from YAML using the call field with file#export syntax:

yaml

statements:
  - intent: Seed test data
    call: "../helpers/seed.ts#create_test_user"
    args: [page, testContext, "test@example.com"]

Inside your function, use testContext to read and write runtime variables:

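// helpers/seed.ts
import type { Page } from "@playwright/test";

// testContext is typed loosely here; its real type is not shown in this document.
export async function create_test_user(
  page: Page,
  testContext: Record<string, any>,
  email: string,
) {
  // Read a variable
  const baseUrl = testContext.BASE_URL;

  // Write a variable (available to subsequent YAML steps as {{userId}})
  testContext.userId = "user-123";
}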

Each value in args maps directly to a parameter in the function signature. System objects (page, request, testContext) are passed as-is, strings are quoted, and numbers stay numeric.
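
The mapping rule for args can be pictured as a small rendering step. The sketch below is an assumption about how such a call might be generated: `renderArg` and `renderCall` are hypothetical helpers, and only the three system object names come from the text above.

```typescript
// Hypothetical sketch of how args could be rendered into a generated call.
// The system object names come from the documentation; the rest is illustrative.
const SYSTEM_OBJECTS = new Set(["page", "request", "testContext"]);

function renderArg(arg: string | number): string {
  if (typeof arg === "number") return String(arg); // numbers stay numeric
  if (SYSTEM_OBJECTS.has(arg)) return arg;         // system objects pass through as identifiers
  return JSON.stringify(arg);                      // any other string is quoted
}

function renderCall(fnName: string, args: Array<string | number>): string {
  return `await ${fnName}(${args.map(renderArg).join(", ")})`;
}

const call = renderCall("create_test_user", ["page", "testContext", "test@example.com"]);
// call is: await create_test_user(page, testContext, "test@example.com")
```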

The Enrichment Workflow

1. Draft: The agent writes tests in natural language (DRAFT statements).
2. Explore: The agent uses inspect_page and act to walk through the UI.
3. Collect: The agent uses get_locators to capture element locators and Playwright code.
4. Enrich: The agent replaces DRAFT steps with ACTION statements (intent: + js: or action:).
5. Result: Enriched steps replay deterministically in under 1s each, instead of the ~5-10s an AI-resolved DRAFT step takes (roughly 10x faster).

DRAFT and ACTION statements can be mixed in the same test. The agent starts with all natural language, then selectively enriches the most-used flows.
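
As a back-of-the-envelope check on the speedup, the per-step figures quoted earlier (about 5-10s for a DRAFT step, under 1s for an ACTION step) can be plugged into a small estimate. `estimateSeconds` is a hypothetical helper; it ignores navigation, VERIFY steps, and any self-healing fallback time.

```typescript
// Rough per-step timings quoted in this document; treat them as estimates.
const DRAFT_SECONDS = { min: 5, max: 10 }; // AI-resolved step
const ACTION_SECONDS = { min: 0, max: 1 }; // cached replay, under 1s

function estimateSeconds(
  draftSteps: number,
  actionSteps: number,
): { min: number; max: number } {
  return {
    min: draftSteps * DRAFT_SECONDS.min + actionSteps * ACTION_SECONDS.min,
    max: draftSteps * DRAFT_SECONDS.max + actionSteps * ACTION_SECONDS.max,
  };
}

const allDraft = estimateSeconds(3, 0); // { min: 15, max: 30 }
const enriched = estimateSeconds(0, 3); // { min: 0, max: 3 }
```

For a three-step flow, that is 15-30s of step time unenriched versus at most about 3s once every step has a cache.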

Shiplight Fixture

The shiplightai package provides a Playwright fixture that extends the standard test object with additional capabilities. These are configured via the use: block in your YAML test or in playwright.config.ts.

Everything above this section is the YAML language — it defines what your test does. This section covers the fixture — the runtime that executes the test, providing authentication, Chrome extension support, and AI agent integration.

Authentication (`auth`)

Automatically log in before the test runs. Point to a TypeScript module that exports a login() function returning a storage state file path:

yaml

goal: Verify dashboard after login
base_url: https://app.example.com

use:
  auth: ./auth.login.ts
  args:
    username: "{{TEST_USER}}"
    password: "{{TEST_PASS}}"
statements:
  - URL: /dashboard
  - VERIFY: Dashboard is displayed

The login() function performs the login and returns the path to a storage state file, which the test then runs with.

Chrome Extension Testing (`extensionDir`)

Load an unpacked Chrome extension into the browser:

yaml

goal: Verify extension injects banner
base_url: https://example.com

use:
  extensionDir: ./my-extension
statements:
  - URL: /
  - VERIFY: Extension banner is visible at the top of the page

The fixture launches a persistent Chromium context with --load-extension in headed mode (headless Chrome cannot load extensions).

Persistent Chrome Profile (`userDataDir`)

Reuse a Chrome profile directory across test runs. Works with or without extensions — useful for Google OAuth, cached sessions, or any state that lives in the Chrome profile:

yaml

goal: Verify app with persistent login
base_url: https://app.example.com

use:
  userDataDir: ./chrome-profile
statements:
  - URL: /
  - VERIFY: User is already logged in

userDataDir can also be combined with extensionDir:

yaml

goal: Verify extension with cached state
base_url: https://example.com

use:
  extensionDir: ./my-extension
  userDataDir: ./chrome-profile
statements:
  - URL: /
  - VERIFY: Extension shows saved settings

Extension Storage State (`extensionStorageState`)

Inject cookies into a persistent extension context (since Playwright's storageState option doesn't work with persistent contexts):

yaml

goal: Test extension on authenticated page
base_url: https://app.example.com

use:
  extensionDir: ./my-extension
  extensionStorageState: ./auth/storage-state.json
statements:
  - URL: /
  - VERIFY: User is logged in and extension is active
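
Conceptually, this amounts to reading the cookies out of a storage state file and adding them to the already-open context. The sketch below shows that idea without depending on Playwright: `CookieTarget` is a minimal stand-in for a Playwright `BrowserContext` (which has an `addCookies` method), and `injectCookies` is a hypothetical helper, not the fixture's actual implementation.

```typescript
// Cookie and storage state shapes as written by Playwright's storageState().
interface Cookie {
  name: string;
  value: string;
  domain: string;
  path: string;
  expires?: number;
  httpOnly?: boolean;
  secure?: boolean;
  sameSite?: "Strict" | "Lax" | "None";
}

interface StorageState {
  cookies?: Cookie[];
  origins?: Array<{ origin: string; localStorage: Array<{ name: string; value: string }> }>;
}

// Minimal structural stand-in for a Playwright BrowserContext, declared here
// so the sketch has no dependencies.
interface CookieTarget {
  addCookies(cookies: Cookie[]): Promise<void>;
}

// Hypothetical helper: copy cookies from a storage state JSON string into an
// already-open (persistent) context. Returns how many cookies were added.
async function injectCookies(
  context: CookieTarget,
  storageStateJson: string,
): Promise<number> {
  const state = JSON.parse(storageStateJson) as StorageState;
  const cookies = state.cookies ?? [];
  if (cookies.length > 0) {
    await context.addCookies(cookies);
  }
  return cookies.length;
}
```

Note that this sketch handles cookies only, matching the description above; localStorage entries in the file are ignored.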

Test Context (`testContext`)

The testContext fixture provides a shared variable store accessible from YAML steps, custom functions, and inline code. It supports property-style access:

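// In a custom function (testContext is typed loosely for illustration)
import type { Page } from "@playwright/test";

export async function setup_user(page: Page, testContext: Record<string, any>) {
  testContext.userId = "user-123"; // write
  const email = testContext.userEmail; // read
}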

Variables set on testContext are available in YAML as {{variableName}}, and vice versa. The agent's $variableName resolves from the same store.

Agent (`agent`)

The agent fixture provides the AI for actions like VERIFY, intent: resolution, and ai_extract. It shares the same variable store as testContext and is configured automatically from environment variables:

| Variable | Description |
| --- | --- |
| `GOOGLE_API_KEY` or `ANTHROPIC_API_KEY` | At least one is required; the model is auto-detected |
| `WEB_AGENT_MODEL` | Overrides model selection |

AI features (DRAFT statements, VERIFY, ai_extract) require an API key. ACTION statements with js: or action: run without one.
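
A preflight check along these lines can make a missing key fail early with a clear message. The sketch is hypothetical: `detectAgentConfig` is not a Shiplight function, and checking `GOOGLE_API_KEY` before `ANTHROPIC_API_KEY` is an arbitrary choice, since the precedence when both are set is not stated here.

```typescript
// Hypothetical preflight check for the environment variables in the table.
type Provider = "google" | "anthropic";

interface AgentConfig {
  provider: Provider;
  model?: string; // set only when WEB_AGENT_MODEL is present
}

function detectAgentConfig(env: Record<string, string | undefined>): AgentConfig {
  // Assumption: GOOGLE_API_KEY is checked first; real precedence is undocumented here.
  let provider: Provider;
  if (env.GOOGLE_API_KEY) {
    provider = "google";
  } else if (env.ANTHROPIC_API_KEY) {
    provider = "anthropic";
  } else {
    throw new Error("Set GOOGLE_API_KEY or ANTHROPIC_API_KEY to use AI features");
  }
  return env.WEB_AGENT_MODEL ? { provider, model: env.WEB_AGENT_MODEL } : { provider };
}
```

In a Node.js script this would typically be called as `detectAgentConfig(process.env)` before running tests that contain DRAFT or VERIFY statements.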

For full details on the agent's capabilities, see the Agent SDK documentation.
