Why Can't You Just Assert It? — AI Output Simulator

🔁 The Nondeterminism Simulator

Click Run Prompt repeatedly and watch the outputs change — even though the prompt never does.

Your Prompt

💬

Temperature 0.20

0 — Deterministic 🥶 Deterministic 1 — Creative

Lower temperature → more similar outputs. Higher temperature → more varied, sometimes surprising responses.

Hit Run Prompt to see the AI respond…

⚖️ Traditional vs AI Testing

See why a hard assertion that works perfectly in traditional software simply can't work for AI.

📋 Traditional Assertion

# Works perfectly for deterministic software
def test_summarise():
    output = summarise_benefits()
    assert output == "Automated testing saves time by catching bugs early and reducing manual effort."

# Result when run against an AI model:
FAILED — AssertionError
# Even though the model gave a perfectly valid answer!

The model returned a correct, high-quality answer — but it used different words. A hard assert == fails every time.

❌ This approach breaks on every valid AI response that differs by a single word.

🤖 AI-Aware Assertion

# AI outputs need semantic, not lexical, comparison
def test_summarise_ai():
    output = ask_model("Summarise the benefits of automated testing in one sentence.")

    assert semantic_similarity(output, REFERENCE) > 0.85
    assert mentions_any(output, ["testing", "bugs", "quality", "automation"])
    assert word_count(output) < 50
    assert not contains_hallucination(output)

# Result:
PASSED ✅ — all four semantically valid outputs pass

Instead of exact matching, AI tests assert on semantic similarity, keyword presence, format constraints, and factual validation.

✅ This approach passes for any valid response — and catches genuinely bad ones.

🧠 Hallucination Demo

A question with a definite correct answer — but the AI sometimes gets the details subtly wrong. This is why AI outputs need validation, not just format checks.

💬 Who invented the World Wide Web, and in what year?

Hit Ask the AI to see responses…

📋 Test Strategy Cheat Sheet

A quick-reference summary of the strategies that make AI testing tractable. Click any row to expand it.