Automation Guide  ·  AI Testing

How to Automate This?

Playwright examples for testing non-deterministic AI UIs across TypeScript, .NET, Python, and JavaScript.

Start each test by navigating to the deployed URL, for example with page.goto (or GotoAsync in .NET) in Playwright.

The core challenge: AI responses are non-deterministic. The same prompt can produce a different (but often equally valid) answer on each run, so exact string matching can fail even when the AI is working correctly. Instead, use a fuzzy or semantic comparison to assert that the response is close enough to an expected value.
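To make "close enough" concrete, here is a minimal sketch of one common fuzzy measure, the Dice coefficient over character pairs (bigrams). It is a hand-rolled illustration, not the source of any library; the names bigramCounts and diceSimilarity are hypothetical, and lowercasing the input is a choice made for this sketch.

```typescript
// Illustrative sketch only: a hand-rolled bigram Dice coefficient.
// bigramCounts and diceSimilarity are hypothetical helper names.

function bigramCounts(text: string): { counts: Record<string, number>; total: number } {
  // Assumption for this sketch: ignore case and whitespace before comparing.
  const compact = text.toLowerCase().replace(/\s+/g, '');
  const counts: Record<string, number> = {};
  let total = 0;
  for (let i = 0; i < compact.length - 1; i++) {
    const pair = compact.slice(i, i + 2);
    counts[pair] = (counts[pair] ?? 0) + 1;
    total++;
  }
  return { counts, total };
}

function diceSimilarity(a: string, b: string): number {
  const left = bigramCounts(a);
  const right = bigramCounts(b);
  if (left.total === 0 || right.total === 0) {
    // Fewer than two characters on one side: fall back to a plain equality check.
    return a.trim().toLowerCase() === b.trim().toLowerCase() ? 1 : 0;
  }
  let shared = 0;
  for (const pair of Object.keys(left.counts)) {
    // Count each shared bigram at most as often as it appears on both sides.
    shared += Math.min(left.counts[pair], right.counts[pair] ?? 0);
  }
  // Dice coefficient: twice the overlap divided by the combined bigram count.
  return (2 * shared) / (left.total + right.total);
}

console.log(diceSimilarity('night', 'nacht')); // 0.25 (only "ht" is shared)
console.log(diceSimilarity('Run Prompt', 'run prompt')); // 1
```

A measure like this compares spelling, not meaning: two paraphrases that use different vocabulary can score low even when both are valid answers, so the threshold usually needs tuning against real responses.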

The examples below show how to navigate to this demo, click the button to get a response, and assert that the output is meaningfully similar to a reference answer — rather than requiring an exact match.

Playwright examples

Install: npm install --save-dev string-similarity @types/string-similarity

// automate-ai-demo.spec.ts
import { test, expect } from '@playwright/test';
import stringSimilarity from 'string-similarity';

test('AI response is semantically similar to expected output', async ({ page }) => {
  // 1. Navigate to the demo
  await page.goto('https://buttered-spuds.github.io/ai-non-determinism-demo/');

  // 2. The prompt is pre-filled; click Run Prompt to submit it
  await page.getByRole('button', { name: 'Run Prompt', exact: true }).click();

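  // 3. Wait for the response text to render, then capture it
  const responseCard = page.getByRole('article').first();
  await expect(responseCard).toBeVisible();
  const responseQuote = responseCard.getByRole('blockquote', { name: 'AI response' });
  await expect(responseQuote).toHaveText(/\S/);
  // Trim whitespace first so the anchored regex can strip the surrounding quotes
  const responseText = ((await responseQuote.textContent()) ?? '').trim().replace(/^"|"$/g, '');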

  // 4. Use a fuzzy comparison instead of an exact match
  //    (string-similarity compares character pairs, not meaning)
  const expected = 'Automated testing saves time by catching bugs early and reducing manual effort.';
  const similarity = stringSimilarity.compareTwoStrings(responseText, expected);

  // compareTwoStrings returns 0–1; assert the response is meaningfully similar
  expect(similarity, `Response was not similar enough. Got: "${responseText}"`).toBeGreaterThan(0.5);
});

Tips for testing AI-powered UIs

  • Avoid exact-text assertions on AI output. Responses vary by design, so use fuzzy or semantic comparison to check that the response is close enough, not identical.
  • Assert on structure, not content. Check that a response card appeared, that it is non-empty, and that it does not show an error state.
  • Use statistical thresholds. Run the prompt multiple times and assert that the pass rate meets your quality bar (e.g. ≥ 80%).
  • Separate deterministic from non-deterministic checks. UI structure (buttons, layout, labels) can be asserted exactly; AI content cannot.
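
The statistical-threshold tip can be sketched as two small helpers. This is one possible shape, not part of Playwright or string-similarity: passRate and meetsQualityBar are hypothetical names, and the scores are made-up sample values standing in for similarity scores collected over repeated runs.

```typescript
// Hypothetical helpers for pass-rate assertions over repeated runs.

/** Fraction of runs whose similarity score is above the per-run threshold. */
function passRate(scores: number[], perRunThreshold: number): number {
  if (scores.length === 0) {
    throw new Error('passRate needs at least one score');
  }
  const passed = scores.filter((score) => score > perRunThreshold).length;
  return passed / scores.length;
}

/** True when enough runs passed to meet the quality bar (e.g. 0.8 for 80%). */
function meetsQualityBar(scores: number[], perRunThreshold: number, minPassRate: number): boolean {
  return passRate(scores, perRunThreshold) >= minPassRate;
}

// Made-up sample: five runs, four of which score above 0.5.
const sampleScores = [0.72, 0.64, 0.41, 0.58, 0.8];
console.log(passRate(sampleScores, 0.5)); // 0.8
console.log(meetsQualityBar(sampleScores, 0.5, 0.8)); // true
```

In a Playwright test, one way to use this would be to loop over the prompt several times, push each similarity score into an array, and finish with a single assertion on meetsQualityBar, so that one weak response does not fail the whole test.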