
YAML E2E Test Format

Why YAML E2E Tests?

🔄 Self-healing — no brittle selectors. Steps are cached for speed, but when a locator breaks the AI re-reads the page and resolves the intent automatically. No flaky tests, no selector maintenance.

💬 Readable in natural language. Every step describes what should happen, not how. Agents can author and maintain the tests, while humans can review them like specs and adjust details when needed.

YAML E2E tests are end-to-end browser tests authored in YAML instead of Playwright code. They are designed for coding agents to create and maintain, with humans in control through readable diffs, local runs, and the visual debugger. Shiplight tests are intent-driven — every step is a natural language description of what should happen. The AI reads the page and figures out the rest. For speed, steps can be enriched with action: or js: caches that replay deterministically (<1s), with automatic AI fallback when locators go stale (self-healing).

Full Spec & Examples

For the complete language specification, see the YAML E2E Test Language Spec. For ready-to-run examples, see the examples repo.

No lock-in: YAML E2E tests can be run directly with the Shiplight CLI (shiplightai), or transpiled to standard Playwright test files that run independently — fully compatible, no runtime dependency. You can eject at any time.

Basic Structure

yaml
goal: Description of what this test verifies
base_url: https://your-app.com

statements:
  - URL: /starting-page
  - intent: Step described in natural language
  - intent: Another step
  - VERIFY: Expected outcome

teardown:
  - intent: Clean up step

| Field | Required | Description |
| --- | --- | --- |
| goal | Yes | Test description (used as the Playwright test name) |
| base_url | No | Base URL for the app under test. Can also be set via use: { baseURL } in playwright.config.ts |
| statements | Yes | List of test steps |
| teardown | No | Steps that always run after the test (like finally) |
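
To make the "like finally" behavior of teardown concrete, here is a minimal TypeScript sketch. It is not Shiplight's runner: TestFile, RunStep, and runTest are hypothetical names, and statements are simplified to plain strings.

```typescript
// Hypothetical model of the top-level fields (names mirror the YAML keys).
interface TestFile {
  goal: string;          // required
  base_url?: string;     // optional
  statements: string[];  // required; simplified to plain strings for this sketch
  teardown?: string[];   // optional
}

// Stand-in for whatever actually executes a step.
type RunStep = (step: string) => Promise<void>;

// Run the statements in order, then always run teardown,
// even if one of the statements throws.
async function runTest(test: TestFile, runStep: RunStep): Promise<void> {
  try {
    for (const step of test.statements) {
      await runStep(step);
    }
  } finally {
    for (const step of test.teardown ?? []) {
      await runStep(step);
    }
  }
}
```

In this sketch a failing statement skips the remaining statements but not the teardown steps, and the original error still propagates to the caller.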

Basic Test

Every step is a natural language instruction. The AI resolves each one at runtime by looking at the page and performing the right action.

yaml
goal: Verify user can create a new project
base_url: https://app.example.com

statements:
  - URL: /projects
  - intent: Click the "New Project" button
  - intent: Enter "My Test Project" in the project name field
  - intent: Select "Public" from the visibility dropdown
  - intent: Click "Create"
  - VERIFY: Project page shows title 'My Test Project'
teardown:
  - intent: Delete the created project

Enriched Test

After exploring the UI with Shiplight MCP tools, the coding agent enriches natural language steps with action caches for deterministic, fast replay:

yaml
goal: Verify user can create a new project
base_url: https://app.example.com

statements:
  - URL: /projects
  - STEP: Create project
    statements:
      - intent: Click the New Project button
        action: click
        locator: "getByRole('button', { name: 'New Project' })"
      - intent: Enter "My Test Project" in the project name field
        action: input_text
        text: "My Test Project"
        locator: "getByRole('textbox', { name: 'Project name' })"
      - intent: Click Create
        action: click
        locator: "getByRole('button', { name: 'Create' })"
  - VERIFY: Project page shows title 'My Test Project'
teardown:
  - intent: Delete the created project

Tests mix three kinds of statements:

  • ACTION statements (action: or js:, <1s each): fast deterministic replay, with automatic AI fallback if the locator fails (self-healing)
  • VERIFY statements: AI-powered natural language assertions. Can include js: to speed up simple checks, with automatic fallback to AI verification
  • DRAFT statements (natural language, ~5-10s each): the AI reads the page and figures out what to do. Used for steps not yet enriched with action caches

Locators Are a Cache

Locators are a performance cache, not a hard dependency. When the UI changes and a locator becomes stale, Shiplight's agentic layer auto-heals by falling back to the natural language description to find the right element.

However, when the change is permanent (e.g., a button was renamed or moved), the cached locator fails on every run, so each run pays for the slower AI fallback. On Shiplight Cloud, the platform updates the cached locator after a successful self-heal, so future runs replay at full speed again without manual intervention. This self-adjusting behavior is a key benefit of having a Shiplight Cloud account.
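
The cache-then-fallback idea can be sketched in a few lines of TypeScript. This is an illustration under stated assumptions, not Shiplight's implementation: CachedStep, Executor, and runStepWithCache are hypothetical names, and the updateCache flag stands in for the difference between leaving a stale locator in place and writing the healed one back.

```typescript
// Hypothetical shapes; none of these names come from the Shiplight API.
interface CachedStep {
  intent: string;    // natural language description of the step
  locator?: string;  // cached locator; may be stale or missing
}

interface Executor {
  // Fast path: act on a concrete locator; rejects if nothing matches.
  replay(locator: string): Promise<void>;
  // Slow path: resolve the intent against the live page and
  // return the locator that worked.
  resolveWithAI(intent: string): Promise<string>;
}

// Try the cached locator first; fall back to the intent when it fails.
async function runStepWithCache(
  step: CachedStep,
  exec: Executor,
  updateCache: boolean,
): Promise<"cache" | "healed"> {
  if (step.locator !== undefined) {
    try {
      await exec.replay(step.locator);
      return "cache";
    } catch {
      // Stale locator: fall through to the slow path.
    }
  }
  const healedLocator = await exec.resolveWithAI(step.intent);
  if (updateCache) {
    step.locator = healedLocator; // the next run takes the fast path again
  }
  return "healed";
}
```

With updateCache set to false, a permanently changed element is healed on every run; with it set to true, only the first run after the change takes the slow path.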

Statement Types

| Type | Syntax | Description |
| --- | --- | --- |
| ACTION (action:) | - intent: Enter email + action: input_text | Fast replay with AI self-healing fallback |
| ACTION (js:) | - intent: Click login + js: "await ..." | Fast replay (Playwright code) with AI self-healing |
| VERIFY | - VERIFY: page shows welcome message | AI assertion, optional js: cache |
| DRAFT | - intent: Click the login button | AI resolves at runtime (~5-10s) |
| URL | - URL: /path | Navigation shorthand |
| Code | - description: ... + js: await request.get(...) | Inline Playwright code (no self-healing) |
| STEP | - STEP: Login + statements: [...] | Group related actions |
| IF/ELSE | - IF: cookie banner is visible + THEN: [...] | Conditional execution |
| WHILE | - WHILE: more items to load + DO: [...] | Repeat until condition |
| Function | - call: "file#export" + args: [...] | Call custom TypeScript function |
| Template | - template: ./path.yaml | Inline reusable statement flow |

VERIFY

Asserts a condition using AI. Use the VERIFY: shorthand (unquoted key):

yaml
statements:
  - VERIFY: The success message is displayed
  - VERIFY: The order total is $49.99
    js: "await expect(page.getByTestId('order-total')).toHaveText('$49.99')"

The js: cache speeds up simple checks. If the js: assertion fails, it automatically falls back to AI verification using the natural language statement.

ACTION

Fast deterministic replay (<1s) with AI self-healing fallback. Use the structured action: form for all supported actions:

yaml
statements:
  - intent: Type email address
    action: input_text
    text: "{{USER_EMAIL}}"
    locator: "getByLabel('Email')"

intent describes what the step should accomplish in natural language; the action: and locator: fields are a cache for fast replay. When the cache fails (e.g., a locator becomes stale), Shiplight's agentic layer falls back to the intent to self-heal.

For complex interactions that don't map to a supported action (e.g., drag-and-drop), use the description: + js: code step — but note raw JS does not self-heal:

yaml
statements:
  - description: Drag the card to the Done column
    js: |
      const card = page.getByText('My Task');
      const target = page.getByTestId('column-done');
      await card.dragTo(target);

STEP (grouping)

Groups related statements under a label.

yaml
statements:
  - STEP: Fill in the registration form
    statements:
      - intent: Type "John" in the first name field
      - intent: Type "Doe" in the last name field
      - intent: Type "john@example.com" in the email field

Frames

For elements inside iframes, use frame_path with the action: form:

yaml
- intent: Click Hello inside iframe
  action: click
  frame_path:
    - "iframe#main"
  locator: "getByText('Hello')"

Conditional Logic

Handle optional UI elements with IF/ELSE:

yaml
statements:
  - IF: cookie consent dialog is visible
    THEN:
      - intent: Click "Accept All"
  - IF: user is logged in
    THEN:
      - intent: Click the logout button
    ELSE:
      - intent: Click the login button
      - intent: Enter credentials and submit

Conditions are evaluated by the AI at runtime using the current page state. JavaScript conditions are also supported with the js: prefix:

yaml
- IF: "js: testContext.retryCount < 3"
  THEN:
    - intent: Click the retry button

WARNING

js: conditions have no AI fallback — if the JavaScript fails, the condition fails. Avoid brittle DOM checks (e.g., document.querySelector('.some-class')). Use js: only for simple, reliable checks like URL matching or counters. For UI state checks, prefer natural language conditions.
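
One way to picture the two condition forms: a condition string counts as JavaScript only when it carries the js: prefix, and as natural language otherwise. The TypeScript below is an illustrative sketch, not Shiplight's parser; Condition and classifyCondition are hypothetical names.

```typescript
// Hypothetical representation of a parsed IF/WHILE condition.
type Condition =
  | { kind: "js"; code: string }        // evaluated as code, no AI fallback
  | { kind: "natural"; text: string };  // evaluated by the AI against the page

// Split a raw condition string on the "js:" prefix.
function classifyCondition(raw: string): Condition {
  const prefix = "js:";
  const trimmed = raw.trim();
  if (trimmed.startsWith(prefix)) {
    return { kind: "js", code: trimmed.slice(prefix.length).trim() };
  }
  return { kind: "natural", text: trimmed };
}
```

For example, classifyCondition("js: testContext.retryCount < 3") yields the js kind with code "testContext.retryCount < 3", while "cookie consent dialog is visible" stays a natural language condition.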

Loops

Repeat actions until a condition is met with WHILE:

yaml
statements:
  - WHILE: '"Load More" button is visible'
    DO:
      - intent: Click the "Load More" button
      - intent: Wait for new items to appear
    timeout_ms: 30000
  - VERIFY: all items are loaded

JavaScript conditions work in loops too (same js: caveat applies):

yaml
- WHILE: "js: testContext.itemCount < 10"
  DO:
    - intent: Click "Load More"
    - description: Increment the item counter
      js: "testContext.itemCount = (testContext.itemCount || 0) + 1"
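
The loop semantics can be sketched in TypeScript as follows. This is a minimal model, not the Shiplight runtime: whileWithTimeout is a hypothetical name, and treating an exceeded timeout_ms as an error (rather than a silent exit) is an assumption.

```typescript
// Repeat `body` while `condition` holds.
// Assumption: exceeding the timeout raises an error instead of ending the loop quietly.
async function whileWithTimeout(
  condition: () => Promise<boolean>,
  body: () => Promise<void>,
  timeoutMs: number,
): Promise<number> {
  const deadline = Date.now() + timeoutMs;
  let iterations = 0;
  while (await condition()) {
    if (Date.now() >= deadline) {
      throw new Error(`WHILE loop exceeded ${timeoutMs} ms`);
    }
    await body();
    iterations += 1;
  }
  return iterations; // number of times the body ran
}
```

The condition is re-evaluated before every pass, so a loop whose condition is already false runs the body zero times.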

Extensions

Custom Test Name

Override the Playwright test name (defaults to goal):

yaml
name: Login with valid credentials
goal: Verify login flow works
base_url: https://example.com
statements:
  - URL: /
  - ...

Tags

Add Playwright tags for filtering with --grep:

yaml
goal: Login test
base_url: https://example.com

tags:
  - smoke
  - auth
statements:
  - URL: /
  - ...

Run: npx shiplight test --grep @smoke

Playwright Fixtures

Pass options to test.use():

yaml
goal: Mobile French layout
base_url: https://example.com

use:
  viewport:
    width: 375
    height: 812
  locale: fr-FR
statements:
  - URL: /
  - ...

Variables

Use {{VAR_NAME}} to reference variables at runtime. Variables come from two sources: the project's pre-defined variables in playwright.config.ts, or values saved during the test run (e.g., via save_variable or Extract actions).

yaml
statements:
  - intent: Type username
    action: input_text
    text: "{{TEST_USER}}"
    locator: "getByLabel('Username')"

Define variables in playwright.config.ts:

ts
// playwright.config.ts
export default defineConfig({
  projects: [
    {
      name: "default",
      use: {
        variables: {
          TEST_USER: process.env.TEST_USER || "admin",
          TEST_PASS: { value: process.env.TEST_PASS || "secret", sensitive: true },
        },
      },
    },
  ],
});

Variables marked sensitive: true are masked in logs and reports.
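
As a rough mental model of placeholder resolution and masking, consider the TypeScript sketch below. It is not Shiplight's code: VariableDef, resolveVariables, and maskForLog are hypothetical names, and both leaving unknown placeholders untouched and masking with "***" are assumptions.

```typescript
// Mirrors the two forms used in the config: a plain string,
// or an object with a value and an optional sensitive flag.
type VariableDef = string | { value: string; sensitive?: boolean };

// Replace each {{NAME}} with its value.
// Assumption: unknown names are left as-is rather than raising an error.
function resolveVariables(text: string, vars: Record<string, VariableDef>): string {
  return text.replace(/\{\{(\w+)\}\}/g, (match: string, name: string) => {
    const def = vars[name];
    if (def === undefined) return match;
    return typeof def === "string" ? def : def.value;
  });
}

// Hide the values of sensitive variables before text reaches a log or report.
function maskForLog(text: string, vars: Record<string, VariableDef>): string {
  let masked = text;
  for (const name of Object.keys(vars)) {
    const def = vars[name];
    if (def !== undefined && typeof def !== "string" && def.sensitive === true && def.value.length > 0) {
      masked = masked.split(def.value).join("***");
    }
  }
  return masked;
}
```

Resolution and masking are kept separate here so that the step still receives the real value while only the logged copy is redacted.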

Templates

Extract reusable flows into template files and include them with template:.

Template file (templates/login.yaml):

yaml
params:
  - username
  - password

statements:
  - intent: Enter username
    action: input_text
    locator: "getByLabel('Username')"
    text: "<<username>>"
  - intent: Enter password
    action: input_text
    locator: "getByLabel('Password')"
    text: "<<password>>"
  - intent: Click login
    action: click
    locator: "getByRole('button', { name: 'Log in' }).first()"

Using the template:

yaml
goal: Purchase flow
base_url: https://example.com

statements:
  - URL: /
  - template: ../templates/login.yaml
    params:
      username: "{{TEST_USER}}"
      password: "{{TEST_PASS}}"
  - intent: Navigate to the checkout page
  - VERIFY: Order summary is displayed

Template params (<<username>>) are substituted at transpile time. Runtime variables ({{TEST_USER}}) pass through to the generated code and are resolved at runtime.

Templates can be nested (max depth: 5) and circular references are detected.
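
The two behaviors described here, transpile-time substitution and nesting checks, can be sketched in TypeScript. This is an illustration only: substituteParams and checkNesting are hypothetical names, and the way depth is counted (the including chain may hold at most five templates) is an assumption.

```typescript
// Replace <<name>> with the supplied param value. {{VAR}} placeholders
// inside values are left alone so they can be resolved at runtime.
function substituteParams(text: string, params: Record<string, string>): string {
  return text.replace(/<<(\w+)>>/g, (_match: string, name: string) => {
    const value = params[name];
    if (value === undefined) {
      throw new Error(`Missing template param: ${name}`);
    }
    return value;
  });
}

const MAX_DEPTH = 5; // limit stated in the text above

// `includes` maps a template path to the templates it includes.
// Throws on a circular reference or when nesting goes deeper than MAX_DEPTH.
function checkNesting(
  path: string,
  includes: Record<string, string[]>,
  stack: string[] = [],
): void {
  if (stack.indexOf(path) !== -1) {
    throw new Error(`Circular template reference: ${stack.concat(path).join(" -> ")}`);
  }
  if (stack.length >= MAX_DEPTH) {
    throw new Error(`Template nesting exceeds depth ${MAX_DEPTH}`);
  }
  for (const child of includes[path] ?? []) {
    checkNesting(child, includes, stack.concat(path));
  }
}
```

Substituting username with "{{TEST_USER}}" turns text: "<<username>>" into text: "{{TEST_USER}}", which is what lets the runtime variable survive into the generated code.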

Custom Functions

Call TypeScript functions from YAML using the call field with file#export syntax:

yaml
statements:
  - intent: Seed test data
    call: "../helpers/seed.ts#create_test_user"
    args: [page, testContext, "test@example.com"]

Inside your function, use testContext to read and write runtime variables:

ts
// helpers/seed.ts
export async function create_test_user(page, testContext, email: string) {
  // Read a variable
  const baseUrl = testContext.BASE_URL;

  // Write a variable (available to subsequent YAML steps as {{userId}})
  testContext.userId = "user-123";
}

Each value in args maps directly to a parameter in the function signature. System objects (page, request, testContext) are passed as-is, strings are quoted, and numbers stay numeric.
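
The mapping rule for args can be pictured with a small TypeScript sketch. It is not the actual transpiler: renderArg and renderCall are hypothetical names, and the shape of the generated call is an assumption.

```typescript
// Names that the text above lists as system objects.
const SYSTEM_OBJECTS = ["page", "request", "testContext"];

type Arg = string | number;

// Render one YAML arg as a source-code fragment.
function renderArg(arg: Arg): string {
  if (typeof arg === "number") return String(arg);   // numbers stay numeric
  if (SYSTEM_OBJECTS.indexOf(arg) !== -1) return arg; // system objects pass as identifiers
  return JSON.stringify(arg);                         // everything else becomes a quoted string
}

// Render a whole call expression (hypothetical output shape).
function renderCall(fn: string, args: Arg[]): string {
  return `await ${fn}(${args.map(renderArg).join(", ")});`;
}
```

Under this sketch, the args from the example above render as create_test_user(page, testContext, "test@example.com"). One consequence of matching by name is that a literal string equal to a system object name cannot be passed as text.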

The Enrichment Workflow

  1. Draft: The agent writes tests in natural language (DRAFT statements)
  2. Explore: The agent uses inspect_page and act to walk through the UI
  3. Collect: The agent uses get_locators to capture element locators and Playwright code
  4. Enrich: The agent replaces DRAFT steps with ACTION statements (intent: + action:/locator:)
  5. Result: Enriched steps replay deterministically in under 1s each instead of ~5-10s, so tests run roughly 10x faster

DRAFT and ACTION statements can be mixed in the same test. The agent starts with all natural language, then selectively enriches the most-used flows.
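
When deciding what to enrich next, it can help to list which steps are still DRAFT. The TypeScript sketch below classifies statements by the keys they carry, following the statement types described earlier. Statement, classify, and listDrafts are hypothetical names, and the key-based rules are a simplification rather than the official grammar.

```typescript
// Simplified view of a single YAML statement (only the keys this sketch needs).
interface Statement {
  intent?: string;
  action?: string;
  locator?: string;
  js?: string;
  call?: string;
  VERIFY?: string;
  description?: string;
}

type Kind = "ACTION" | "DRAFT" | "VERIFY" | "CODE" | "FUNCTION" | "OTHER";

// Classify a statement by which keys are present.
function classify(s: Statement): Kind {
  if (s.VERIFY !== undefined) return "VERIFY";
  if (s.call !== undefined) return "FUNCTION";
  if (s.intent !== undefined) {
    // An intent with a cache (action: or js:) is ACTION; a bare intent is DRAFT.
    return s.action !== undefined || s.js !== undefined ? "ACTION" : "DRAFT";
  }
  if (s.description !== undefined && s.js !== undefined) return "CODE";
  return "OTHER";
}

// Intents that still rely on runtime AI resolution.
function listDrafts(statements: Statement[]): string[] {
  return statements
    .filter((s) => classify(s) === "DRAFT")
    .map((s) => s.intent ?? "");
}
```

Running listDrafts over a parsed test gives the remaining candidates for enrichment, which are also the steps that dominate run time.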

Shiplight Fixture

The shiplightai package provides a Playwright fixture that extends the standard test object with additional capabilities. These are configured via the use: block in your YAML E2E test or in playwright.config.ts.

Everything above this section is the YAML language — it defines what your test does. This section covers the fixture — the runtime that executes the test, providing authentication, Chrome extension support, and AI agent integration.

Authentication (auth)

Automatically log in before the test runs. Point to a TypeScript module that exports a login() function returning a storage state file path:

yaml
goal: Verify dashboard after login
base_url: https://app.example.com

use:
  auth: ./auth.login.ts
  args:
    username: "{{TEST_USER}}"
    password: "{{TEST_PASS}}"
statements:
  - URL: /dashboard
  - VERIFY: Dashboard is displayed

The login() function performs the login and returns the path to a storage state file, which the test then loads.

Chrome Extension Testing (extensionDir)

Load an unpacked Chrome extension into the browser:

yaml
goal: Verify extension injects banner
base_url: https://example.com

use:
  extensionDir: ./my-extension
statements:
  - URL: /
  - VERIFY: Extension banner is visible at the top of the page

The fixture launches a persistent Chromium context with --load-extension in headed mode (headless Chrome cannot load extensions).

Persistent Chrome Profile (userDataDir)

Reuse a Chrome profile directory across test runs. Works with or without extensions — useful for Google OAuth, cached sessions, or any state that lives in the Chrome profile:

yaml
goal: Verify app with persistent login
base_url: https://app.example.com

use:
  userDataDir: ./chrome-profile
statements:
  - URL: /
  - VERIFY: User is already logged in

Can also be combined with extensionDir:

yaml
goal: Verify extension with cached state
base_url: https://example.com

use:
  extensionDir: ./my-extension
  userDataDir: ./chrome-profile
statements:
  - URL: /
  - VERIFY: Extension shows saved settings

Extension Storage State (extensionStorageState)

Inject cookies into a persistent extension context (since Playwright's storageState option doesn't work with persistent contexts):

yaml
goal: Test extension on authenticated page
base_url: https://app.example.com

use:
  extensionDir: ./my-extension
  extensionStorageState: ./auth/storage-state.json
statements:
  - URL: /
  - VERIFY: User is logged in and extension is active

Test Context (testContext)

The testContext fixture provides a shared variable store accessible from YAML steps, custom functions, and inline code. It supports property-style access:

ts
// In a custom function
export async function setup_user(page, testContext) {
  testContext.userId = "user-123"; // write
  const email = testContext.userEmail; // read
}

Variables set on testContext are available in YAML as {{variableName}}, and vice versa. The agent's $variableName resolves from the same store.

Agent (agent)

The agent fixture provides the AI for actions like VERIFY, intent: resolution, and ai_extract. It shares the same variable store as testContext. Configured automatically from environment variables:

| Variable | Description |
| --- | --- |
| GOOGLE_API_KEY or ANTHROPIC_API_KEY | At least one required; model is auto-detected |
| WEB_AGENT_MODEL | Override model selection |

AI features (DRAFT statements, VERIFY, ai_extract) require an API key. ACTION statements with js: or action: run without one.
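
A pre-flight check for these variables might look like the TypeScript sketch below. It is not part of shiplightai: detectAgentConfig and AgentConfig are hypothetical names, and preferring GOOGLE_API_KEY when both keys are set is an assumption, since the precedence is not documented here.

```typescript
// Environment passed in explicitly so the function stays easy to test.
type Env = Record<string, string | undefined>;

interface AgentConfig {
  provider: "google" | "anthropic";
  modelOverride: string | undefined; // value of WEB_AGENT_MODEL, if set
}

// Return the detected provider, or null when no API key is present.
// Assumption: GOOGLE_API_KEY wins when both keys are set.
function detectAgentConfig(env: Env): AgentConfig | null {
  const modelOverride = env.WEB_AGENT_MODEL || undefined;
  if (env.GOOGLE_API_KEY) {
    return { provider: "google", modelOverride };
  }
  if (env.ANTHROPIC_API_KEY) {
    return { provider: "anthropic", modelOverride };
  }
  return null;
}
```

A null result means that only fully cached ACTION statements can run; DRAFT statements, VERIFY, and ai_extract would need a key.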

For full details on the agent's capabilities, see the Agent SDK documentation.

Released under the MIT License.