YAML E2E Test Format
Why YAML E2E Tests?
🔄 Self-healing — no brittle selectors. Steps are cached for speed, but when a locator breaks the AI re-reads the page and resolves the intent automatically. No flaky tests, no selector maintenance.
💬 Readable in natural language. Every step describes what should happen, not how. Agents can author and maintain the tests, while humans can review them like specs and adjust details when needed.
YAML E2E tests are end-to-end browser tests authored in YAML instead of Playwright code. They are designed for coding agents to create and maintain, with humans in control through readable diffs, local runs, and the visual debugger. Shiplight tests are intent-driven — every step is a natural language description of what should happen. The AI reads the page and figures out the rest. For speed, steps can be enriched with action: or js: caches that replay deterministically (<1s), with automatic AI fallback when locators go stale (self-healing).
Full Spec & Examples
For the complete language specification, see the YAML E2E Test Language Spec. For ready-to-run examples, see the examples repo.
No lock-in: YAML E2E tests can be run directly with the Shiplight CLI (shiplightai), or transpiled to standard Playwright test files that run independently — fully compatible, no runtime dependency. You can eject at any time.
Basic Structure
goal: Description of what this test verifies
base_url: https://your-app.com
statements:
  - URL: /starting-page
  - intent: Step described in natural language
  - intent: Another step
  - VERIFY: Expected outcome
teardown:
  - intent: Clean up step

| Field | Required | Description |
|---|---|---|
| goal | Yes | Test description (used as the Playwright test name) |
| base_url | No | Base URL for the app under test. Can also be set via use: { baseURL } in playwright.config.ts |
| statements | Yes | List of test steps |
| teardown | No | Steps that always run after the test (like finally) |
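The field table can also be read as a small type definition. The following is a minimal sketch, not Shiplight's actual parser: `YamlTest` and `validateTest` are hypothetical names introduced for illustration, and individual statement entries are left untyped.

```typescript
// Hypothetical shape of a parsed test file, derived from the field table.
interface YamlTest {
  goal: string;          // required: used as the Playwright test name
  base_url?: string;     // optional: may be set in playwright.config.ts instead
  statements: unknown[]; // required: list of test steps
  teardown?: unknown[];  // optional: steps that always run after the test
}

// Returns a list of problems; an empty list means the required fields are present.
function validateTest(doc: Record<string, unknown>): string[] {
  const problems: string[] = [];
  if (typeof doc["goal"] !== "string" || doc["goal"].length === 0) {
    problems.push("goal is required");
  }
  if (!Array.isArray(doc["statements"])) {
    problems.push("statements is required and must be a list");
  }
  if (doc["teardown"] !== undefined && !Array.isArray(doc["teardown"])) {
    problems.push("teardown must be a list when present");
  }
  return problems;
}

// A document that passes the check can be treated as a YamlTest.
const example: YamlTest = { goal: "Verify login", statements: [] };
```

A check like this would run on the parsed YAML before any step executes, so a missing goal or statements list is reported up front.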
Basic Test
Every line is a natural language instruction. The AI resolves each one at runtime by looking at the page and performing the right action.
goal: Verify user can create a new project
base_url: https://app.example.com
statements:
  - URL: /projects
  - intent: Click the "New Project" button
  - intent: Enter "My Test Project" in the project name field
  - intent: Select "Public" from the visibility dropdown
  - intent: Click "Create"
  - VERIFY: Project page shows title 'My Test Project'
teardown:
  - intent: Delete the created project

Enriched Test
After exploring the UI with Shiplight MCP tools, the coding agent enriches natural language steps with action caches for deterministic, fast replay:
goal: Verify user can create a new project
base_url: https://app.example.com
statements:
  - URL: /projects
  - STEP: Create project
    statements:
      - intent: Click the New Project button
        action: click
        locator: "getByRole('button', { name: 'New Project' })"
      - intent: Enter "My Test Project" in the project name field
        action: input_text
        text: "My Test Project"
        locator: "getByRole('textbox', { name: 'Project name' })"
      - intent: Click Create
        action: click
        locator: "getByRole('button', { name: 'Create' })"
  - VERIFY: Project page shows title 'My Test Project'
teardown:
  - intent: Delete the created project

- ACTION statements (action: or js:, <1s each): fast deterministic replay, with automatic AI fallback if the locator fails (self-healing)
- VERIFY statements: AI-powered natural language assertions. Can include js: to speed up simple checks, with automatic fallback to AI verification
- DRAFT statements (natural language, ~5-10s each): the AI reads the page and figures out what to do. Used for steps not yet enriched with action caches
Locators Are a Cache
Locators are a performance cache, not a hard dependency. When the UI changes and a locator becomes stale, Shiplight's agentic layer auto-heals by falling back to the natural language description to find the right element.
However, when a locator is permanently changed (e.g., a button was renamed or moved), the cached locator will fail on every run. When running on Shiplight Cloud, the platform self-updates the cached locator after a successful self-heal — so future runs replay at full speed again without manual intervention. This self-adjusting behavior is a key benefit of having a Shiplight Cloud account.
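The cache-then-fallback behavior described above can be sketched as a small function. This is an illustrative model only, not Shiplight's implementation: `runStep`, `replay`, and `resolveWithAI` are hypothetical names, and the callbacks are synchronous stand-ins for what would really be asynchronous browser and model calls.

```typescript
interface CachedStep {
  intent: string;   // natural language description, always present
  locator?: string; // cached locator, may be missing or stale
}

interface StepResult {
  via: "cache" | "ai"; // which path handled the step
  locator: string;     // locator that worked on this run
}

// Try the cached locator first; if it is absent or fails, resolve the intent again.
function runStep(
  step: CachedStep,
  replay: (locator: string) => boolean,      // hypothetical: true if the cached action succeeded
  resolveWithAI: (intent: string) => string, // hypothetical: returns a freshly resolved locator
): StepResult {
  if (step.locator !== undefined && replay(step.locator)) {
    return { via: "cache", locator: step.locator };
  }
  return { via: "ai", locator: resolveWithAI(step.intent) };
}
```

When the result comes back with `via: "ai"`, the returned locator is the value a platform could write back into the cache so that later runs take the fast path again.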
Statement Types
| Type | Syntax | Description |
|---|---|---|
| ACTION (action:) | - intent: Enter email + action: input_text | Fast replay with AI self-healing fallback |
| ACTION (js:) | - intent: Click login + js: "await ..." | Fast replay (Playwright code) with AI self-healing |
| VERIFY | - VERIFY: page shows welcome message | AI assertion, optional js: cache |
| DRAFT | - intent: Click the login button | AI resolves at runtime (~5-10s) |
| URL | - URL: /path | Navigation shorthand |
| Code | - description: ... + js: await request.get(...) | Inline Playwright code (no self-healing) |
| STEP | - STEP: Login + statements: [...] | Group related actions |
| IF/ELSE | - IF: cookie banner is visible + THEN: [...] | Conditional execution |
| WHILE | - WHILE: more items to load + DO: [...] | Repeat until condition |
| Function | - call: "file#export" + args: [...] | Call custom TypeScript function |
| Template | - template: ./path.yaml | Inline reusable statement flow |
VERIFY
Asserts a condition using AI. Use the VERIFY: shorthand (unquoted key):
statements:
  - VERIFY: The success message is displayed
  - VERIFY: The order total is $49.99
    js: "await expect(page.getByTestId('order-total')).toHaveText('$49.99')"

The js: cache speeds up simple checks. If the js: assertion fails, it automatically falls back to AI verification using the natural language statement.
ACTION
Fast deterministic replay (<1s) with AI self-healing fallback. Use the structured action: form for all supported actions:
statements:
  - intent: Type email address
    action: input_text
    text: "{{USER_EMAIL}}"
    locator: "getByLabel('Email')"

intent describes what the step should accomplish in natural language; the action:/locator: field is a cache for fast replay. When the cache fails (e.g., a locator becomes stale), Shiplight's agentic layer falls back to the intent to self-heal.
For complex interactions that don't map to a supported action (e.g., drag-and-drop), use the description: + js: code step — but note raw JS does not self-heal:
statements:
  - description: Drag the card to the Done column
    js: |
      const card = page.getByText('My Task');
      const target = page.getByTestId('column-done');
      await card.dragTo(target);

STEP (grouping)
Groups related statements under a label.
statements:
  - STEP: Fill in the registration form
    statements:
      - intent: Type "John" in the first name field
      - intent: Type "Doe" in the last name field
      - intent: Type "john@example.com" in the email field

Frames
For elements inside iframes, use frame_path with action: form:
- intent: Click Hello inside iframe
  action: click
  frame_path:
    - "iframe#main"
  locator: "getByText('Hello')"

Conditional Logic
Handle optional UI elements with IF/ELSE:
statements:
  - IF: cookie consent dialog is visible
    THEN:
      - intent: Click "Accept All"
  - IF: user is logged in
    THEN:
      - intent: Click the logout button
    ELSE:
      - intent: Click the login button
      - intent: Enter credentials and submit

Conditions are evaluated by the AI at runtime using the current page state. JavaScript conditions are also supported with the js: prefix:
- IF: "js: testContext.retryCount < 3"
  THEN:
    - intent: Click the retry button

WARNING
js: conditions have no AI fallback — if the JavaScript fails, the condition fails. Avoid brittle DOM checks (e.g., document.querySelector('.some-class')). Use js: only for simple, reliable checks like URL matching or counters. For UI state checks, prefer natural language conditions.
Loops
Repeat actions until a condition is met with WHILE:
statements:
- WHILE: "Load More" button is visible
DO:
- intent: Click the "Load More" button
- intent: Wait for new items to appear
timeout_ms: 30000
- VERIFY: all items are loaded

JavaScript conditions work in loops too (same js: caveat applies):
- WHILE: "js: testContext.itemCount < 10"
  DO:
    - intent: Click "Load More"
    - description: Increment the item counter
      js: "testContext.itemCount = (testContext.itemCount || 0) + 1"

Extensions
Custom Test Name
Override the Playwright test name (defaults to goal):
name: Login with valid credentials
goal: Verify login flow works
base_url: https://example.com
statements:
  - URL: /
  - ...

Tags
Add Playwright tags for filtering with --grep:
goal: Login test
base_url: https://example.com
tags:
  - smoke
  - auth
statements:
  - URL: /
  - ...

Run: npx shiplight test --grep @smoke
Playwright Fixtures
Pass options to test.use():
goal: Mobile French layout
base_url: https://example.com
use:
  viewport:
    width: 375
    height: 812
  locale: fr-FR
statements:
  - URL: /
  - ...

Variables
Use {{VAR_NAME}} to reference variables at runtime. Variables come from two sources: the project's pre-defined variables in playwright.config.ts, or values saved during the test run (e.g., via save_variable or Extract actions).
statements:
  - intent: Type username
    action: input_text
    text: "{{TEST_USER}}"
    locator: "getByLabel('Username')"

Define variables in playwright.config.ts:
// playwright.config.ts
export default defineConfig({
  projects: [
    {
      name: "default",
      use: {
        variables: {
          TEST_USER: process.env.TEST_USER || "admin",
          TEST_PASS: { value: process.env.TEST_PASS || "secret", sensitive: true },
        },
      },
    },
  ],
});

Variables marked sensitive: true are masked in logs and reports.
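To make the two variable forms concrete, here is a minimal sketch of how `{{VAR_NAME}}` references might be resolved and how sensitive values might be masked before logging. It is an assumption-laden illustration rather than Shiplight's code: `interpolate`, `maskForLog`, the `***` mask, and the choice to leave unknown names untouched are all hypothetical.

```typescript
// A variable is either a plain string or an object carrying a sensitive flag.
type VariableDef = string | { value: string; sensitive?: boolean };

function valueOf(def: VariableDef): string {
  return typeof def === "string" ? def : def.value;
}

// Replace each {{NAME}} with its value; unknown names are left as written (assumption).
function interpolate(text: string, vars: Record<string, VariableDef>): string {
  return text.replace(/\{\{(\w+)\}\}/g, (match: string, name: string) => {
    const def = vars[name];
    return def === undefined ? match : valueOf(def);
  });
}

// Replace every occurrence of a sensitive value with a fixed mask before logging.
function maskForLog(text: string, vars: Record<string, VariableDef>): string {
  let out = text;
  for (const name of Object.keys(vars)) {
    const def = vars[name];
    if (typeof def !== "string" && def.sensitive === true && def.value.length > 0) {
      out = out.split(def.value).join("***");
    }
  }
  return out;
}
```

With the configuration shown earlier, a step whose text is `{{TEST_PASS}}` would type the real value into the page, while a log line containing that value would show the mask instead.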
Templates
Extract reusable flows into template files and include them with template:.
Template file (templates/login.yaml):
params:
  - username
  - password
statements:
  - intent: Enter username
    action: input_text
    locator: "getByLabel('Username')"
    text: "<<username>>"
  - intent: Enter password
    action: input_text
    locator: "getByLabel('Password')"
    text: "<<password>>"
  - intent: Click login
    action: click
    locator: "getByRole('button', { name: 'Log in' }).first()"

Using the template:
goal: Purchase flow
base_url: https://example.com
statements:
  - URL: /
  - template: ../templates/login.yaml
    params:
      username: "{{TEST_USER}}"
      password: "{{TEST_PASS}}"
  - intent: Navigate to the checkout page
  - VERIFY: Order summary is displayed

Template params (<<username>>) are substituted at transpile time. Environment variables ({{TEST_USER}}) pass through to the generated code for runtime resolution.
Templates can be nested (max depth: 5) and circular references are detected.
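The template rules (parameter substitution at transpile time, a nesting limit of 5, and circular-reference detection) can be sketched as a recursive expansion. This is a simplified model under stated assumptions: `TemplateDef`, its `includes` field, and `expandTemplate` are hypothetical, templates are held in memory rather than read from files, and counting depth from the first template is a guess at how the limit is applied.

```typescript
// Nesting limit taken from the text; how depth is counted is an assumption.
const MAX_TEMPLATE_DEPTH = 5;

// Hypothetical in-memory template: its own lines plus names of templates it includes.
interface TemplateDef {
  lines: string[];
  includes?: string[];
}

// Replace <<name>> placeholders; {{VAR}} references are deliberately left alone.
function substituteParams(line: string, params: Record<string, string>): string {
  return line.replace(/<<(\w+)>>/g, (match: string, name: string) =>
    Object.prototype.hasOwnProperty.call(params, name) ? params[name] : match,
  );
}

function expandTemplate(
  name: string,
  registry: Record<string, TemplateDef>,
  params: Record<string, string>,
  stack: string[] = [], // templates currently being expanded, outermost first
): string[] {
  if (stack.indexOf(name) !== -1) {
    throw new Error("circular template reference: " + stack.concat(name).join(" -> "));
  }
  if (stack.length >= MAX_TEMPLATE_DEPTH) {
    throw new Error("template nesting exceeds depth " + MAX_TEMPLATE_DEPTH);
  }
  const template = registry[name];
  if (template === undefined) {
    throw new Error("unknown template: " + name);
  }
  let out = template.lines.map((line) => substituteParams(line, params));
  for (const child of template.includes || []) {
    out = out.concat(expandTemplate(child, registry, params, stack.concat(name)));
  }
  return out;
}
```

Because only `<<...>>` placeholders are rewritten, a param value such as `{{TEST_USER}}` survives expansion unchanged and is still available for runtime resolution.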
Custom Functions
Call TypeScript functions from YAML using the call field with file#export syntax:
statements:
  - intent: Seed test data
    call: "../helpers/seed.ts#create_test_user"
    args: [page, testContext, "test@example.com"]

Inside your function, use testContext to read and write runtime variables:
// helpers/seed.ts
export async function create_test_user(page, testContext, email: string) {
  // Read a variable
  const baseUrl = testContext.BASE_URL;
  // Write a variable (available to subsequent YAML steps as {{userId}})
  testContext.userId = "user-123";
}

Each value in args maps directly to a parameter in the function signature. System objects (page, request, testContext) are passed as-is, strings are quoted, and numbers stay numeric.
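The mapping rule for args can be illustrated with a small rendering function. This is a sketch of the stated rule only: `renderArg` and `renderCall` are hypothetical names, and the exact shape of the generated call (including the leading `await`) is an assumption, not the transpiler's confirmed output.

```typescript
// Names that refer to runtime objects rather than string values (from the text).
const SYSTEM_OBJECTS = ["page", "request", "testContext"];

// Turn one YAML arg into source text for the generated call.
function renderArg(arg: string | number): string {
  if (typeof arg === "number") {
    return String(arg); // numbers stay numeric
  }
  if (SYSTEM_OBJECTS.indexOf(arg) !== -1) {
    return arg; // system objects are passed as identifiers
  }
  return JSON.stringify(arg); // other strings become quoted literals
}

// Hypothetical output format for the call statement.
function renderCall(fn: string, args: Array<string | number>): string {
  return "await " + fn + "(" + args.map(renderArg).join(", ") + ");";
}

// renderCall("create_test_user", ["page", "testContext", "test@example.com"])
// returns: await create_test_user(page, testContext, "test@example.com");
```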
The Enrichment Workflow
- Draft: The agent writes tests in natural language (DRAFT statements)
- Explore: The agent uses inspect_page and act to walk through the UI
- Collect: The agent uses get_locators to capture element locators and Playwright code
- Enrich: The agent replaces DRAFT steps with ACTION statements (intent: + action:/locator:)
- Result: Tests run about 10x faster with deterministic replay (<1s per cached step versus ~5-10s per DRAFT step)
DRAFT and ACTION statements can be mixed in the same test. The agent starts with all natural language, then selectively enriches the most-used flows.
Shiplight Fixture
The shiplightai package provides a Playwright fixture that extends the standard test object with additional capabilities. These are configured via the use: block in your YAML E2E test or in playwright.config.ts.
Everything above this section is the YAML language — it defines what your test does. This section covers the fixture — the runtime that executes the test, providing authentication, Chrome extension support, and AI agent integration.
Authentication (auth)
Automatically log in before the test runs. Point to a TypeScript module that exports a login() function returning a storage state file path:
goal: Verify dashboard after login
base_url: https://app.example.com
use:
auth: ./auth.login.ts
args:
username: "{{TEST_USER}}"
password: "{{TEST_PASS}}"
statements:
- URL: /dashboard
- VERIFY: Dashboard is displayed

The login() function performs the login and returns the path to a storage state file, which the test then loads before it runs.
Chrome Extension Testing (extensionDir)
Load an unpacked Chrome extension into the browser:
goal: Verify extension injects banner
base_url: https://example.com
use:
  extensionDir: ./my-extension
statements:
  - URL: /
  - VERIFY: Extension banner is visible at the top of the page

The fixture launches a persistent Chromium context with --load-extension in headed mode (headless Chrome cannot load extensions).
Persistent Chrome Profile (userDataDir)
Reuse a Chrome profile directory across test runs. Works with or without extensions — useful for Google OAuth, cached sessions, or any state that lives in the Chrome profile:
goal: Verify app with persistent login
base_url: https://app.example.com
use:
  userDataDir: ./chrome-profile
statements:
  - URL: /
  - VERIFY: User is already logged in

Can also be combined with extensionDir:
goal: Verify extension with cached state
base_url: https://example.com
use:
  extensionDir: ./my-extension
  userDataDir: ./chrome-profile
statements:
  - URL: /
  - VERIFY: Extension shows saved settings

Extension Storage State (extensionStorageState)
Inject cookies into a persistent extension context (since Playwright's storageState option doesn't work with persistent contexts):
goal: Test extension on authenticated page
base_url: https://app.example.com
use:
  extensionDir: ./my-extension
  extensionStorageState: ./auth/storage-state.json
statements:
  - URL: /
  - VERIFY: User is logged in and extension is active

Test Context (testContext)
The testContext fixture provides a shared variable store accessible from YAML steps, custom functions, and inline code. It supports property-style access:
// In a custom function
export async function setup_user(page, testContext) {
  testContext.userId = "user-123"; // write
  const email = testContext.userEmail; // read
}

Variables set on testContext are available in YAML as {{variableName}}, and vice versa. The agent's $variableName resolves from the same store.
Agent (agent)
The agent fixture provides the AI for actions like VERIFY, intent: resolution, and ai_extract. It shares the same variable store as testContext. Configured automatically from environment variables:
| Variable | Description |
|---|---|
| GOOGLE_API_KEY or ANTHROPIC_API_KEY | At least one required; the model is auto-detected |
| WEB_AGENT_MODEL | Override model selection |
AI features (DRAFT statements, VERIFY, ai_extract) require an API key. ACTION statements with js: or action: run without one.
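As a rough illustration of configuration from environment variables, the sketch below picks a provider from whichever key is present. It is not the fixture's real logic: `detectAgentConfig` is a hypothetical name, the order in which the keys are checked when both are set is an arbitrary assumption, and how WEB_AGENT_MODEL interacts with provider choice is not specified in this document.

```typescript
type Provider = "google" | "anthropic";

interface AgentConfig {
  provider: Provider;
  modelOverride?: string; // value of WEB_AGENT_MODEL, if set
}

// Returns null when no API key is present, in which case only cached
// ACTION statements (js: or action:) could run.
function detectAgentConfig(env: Record<string, string | undefined>): AgentConfig | null {
  const modelOverride = env["WEB_AGENT_MODEL"];
  // Assumption: Google is checked first; the real precedence is not documented here.
  if (env["GOOGLE_API_KEY"]) {
    return { provider: "google", modelOverride };
  }
  if (env["ANTHROPIC_API_KEY"]) {
    return { provider: "anthropic", modelOverride };
  }
  return null;
}
```

In a Node process this would be called as `detectAgentConfig(process.env)` before deciding whether DRAFT and VERIFY steps can run.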
For full details on the agent's capabilities, see the Agent SDK documentation.