Useless Unit Tests: 5 Patterns That Never Fail

Tom Piaggio, Co-Founder at Autonoma

A useless unit test is any test that cannot falsify its subject: it passes regardless of whether the code under test behaves correctly. The five recurring shapes are tests that assert internal state instead of observable behavior, tests that assert on mocks they themselves configured, snapshot tests re-blessed on every change, tests with assertion paths that can never go red (try/catch-swallowed expects, assertions inside callbacks that never fire, unconditional expect(true).toBe(true) equivalents), and the tautological test in its purest form: asserting that the function output equals itself by calling the same implementation to compute the expected value.

These tests exist on every team that has been shipping for more than a year. They are not a junior-developer problem or a process failure. They pass code review. They green the CI. They show up in coverage reports as covered lines. And they verify nothing at all.

The teams this matters most for are not the teams without tests. They are the AI-forward engineering teams at Series A and Series B that ship with Cursor, Claude, and Copilot every day. They have hundreds of tests. Many of them are green. The problem is false confidence, not absence. One engineering team we spoke with put it plainly: "the test passes but the bug ships, and nobody questions it because the suite looked fine."

We built Autonoma as the independent behavioral layer that sits outside unit tests entirely. Our agents read your codebase, plan test scenarios from your actual routes and components, and execute them against your running application. That signal is fundamentally different from a unit test because it is independent of the code (and the AI) that produced the test: a useless unit test cannot catch what it refuses to assert, and an independent behavioral check does not inherit that constraint. But before we get there, it is worth naming the shapes precisely, because teams using vibe-coding workflows are generating these at scale and most have no vocabulary to diagnose them.

The 5 shapes of a useless unit test

Each shape below is self-contained. Every one passes code review. Every one is green. None of them verifies observable behavior.

Shape 1: asserting internal state instead of observable behavior

The test reaches inside the module and checks a private field or an internal variable directly, rather than calling the function and checking what it returns to callers.

// Shape 1: Asserting internal state instead of observable behavior.
//
// The test reaches into the cart's internal `items` array and asserts on it
// directly. It never calls the public interface (`getTotal()`), so a bug in
// `getTotal()` cannot make this test go red.

class ShoppingCart {
  // "private-ish" implementation detail. Tests should not depend on this.
  items: Array<{ price: number; qty: number }> = [];

  add(price: number, qty: number): void {
    this.items.push({ price, qty });
  }

  // The observable behavior the cart actually promises.
  getTotal(): number {
    return this.items.reduce((sum, item) => sum + item.price * item.qty, 0);
  }
}

describe('ShoppingCart (Shape 1: internal-state assertion)', () => {
  it('useless: asserts on the internal items array, never on behavior', () => {
    const cart = new ShoppingCart();
    cart.add(10, 2);
    cart.add(5, 1);

    // We only check the bookkeeping array, not what the user actually sees.
    expect(cart.items).toHaveLength(2);
    expect(cart.items[0]).toEqual({ price: 10, qty: 2 });

    // HOW TO SEE IT FAIL TO FAIL:
    // Change getTotal() to `return 0;` (or any wrong value) and rerun.
    // This test stays green because getTotal() is never exercised.
  });
});

Why it passes code review: it looks thorough. It reaches deep into the implementation. The reviewer sees an assertion and moves on. The problem is that internal state is an implementation detail, not a contract. If you refactor the internals while keeping behavior identical, the test breaks for the wrong reason. Worse, the public behavior remains untested: getTotal() can return the wrong value while the internal array looks exactly as the test expects.
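For contrast, here is a minimal sketch of a version of this test that can fail. It assumes the same ShoppingCart as above with `items` made private, and it uses a small hypothetical `assertEqual` helper as a stand-in for Jest's `expect(...).toBe(...)` so the snippet runs without a test runner. The expected total is worked out by hand (10 * 2 + 5 * 1 = 25), not read back from the cart.

```typescript
// Hypothetical stand-in for Jest's expect(actual).toBe(expected), so this
// sketch runs on its own. In a real suite you would use Jest's expect.
function assertEqual<T>(actual: T, expected: T, label: string): void {
  if (actual !== expected) {
    throw new Error(`${label}: expected ${String(expected)}, got ${String(actual)}`);
  }
}

class ShoppingCart {
  // Private: tests can no longer depend on the bookkeeping array.
  private items: Array<{ price: number; qty: number }> = [];

  add(price: number, qty: number): void {
    this.items.push({ price, qty });
  }

  getTotal(): number {
    return this.items.reduce((sum, item) => sum + item.price * item.qty, 0);
  }
}

// The test takes the cart as a parameter so it can also be pointed at a
// deliberately broken cart to confirm that it goes red.
function testCartTotal(cart: ShoppingCart): void {
  cart.add(10, 2);
  cart.add(5, 1);
  // Expected value computed by hand: 10 * 2 + 5 * 1 = 25.
  assertEqual(cart.getTotal(), 25, 'cart total');
}

testCartTotal(new ShoppingCart());
```

If getTotal() is changed to `return 0;`, this check throws, which is exactly the property the original test lacks.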

Shape 2: asserting on a mock you configured in the same test

The test creates a mock, configures it to return a specific value, calls code that uses the mock, and then asserts that the mock returned that value. The assertion is circular: you told the mock what to return and then confirmed it returned that.

// Shape 2: Asserting on a mock you configured in the same test.
//
// We configure a jest.fn() to return a value, call the code under test, then
// assert that the mock returned the value we told it to return. That is
// circular: we are testing jest's mocking, not `getDisplayName`. The function
// under test's own output is never asserted.

interface UserRepository {
  findName(id: number): string;
}

function getDisplayName(repo: UserRepository, id: number): string {
  const name = repo.findName(id);
  return name.trim().toUpperCase();
}

describe('getDisplayName (Shape 2: mock self-assertion)', () => {
  it('useless: only asserts the mock returned what we configured', () => {
    const findName = jest.fn().mockReturnValue('  ada  ');
    const repo: UserRepository = { findName };

    getDisplayName(repo, 1);

    // Circular: we configured the return value above and now "verify" it.
    expect(findName).toHaveBeenCalledWith(1);
    expect(findName.mock.results[0].value).toBe('  ada  ');

    // HOW TO SEE IT FAIL TO FAIL:
    // There is NO assertion on getDisplayName's output. Break the
    // transformation (e.g. `return name.toLowerCase();`) and this stays green
    // because we never assert the function returns 'ADA'.
  });
});

Why it passes code review: mocks are complex. Reviewers see jest.fn(), see an assertion, and assume something is being verified. The test structure looks right. But the actual code under test, the function that calls the mock, is not being verified at all. Its output is never checked. The assertion is entirely about the mock's return value, which you already hardcoded.
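A sketch of the same test with the assertion moved to the caller's output. To keep it runnable without Jest, it swaps `jest.fn()` for a hand-written stub and uses a hypothetical `assertEqual` helper in place of `expect`; with Jest, the key line would be `expect(getDisplayName(repo, 1)).toBe('ADA')`.

```typescript
// Hypothetical stand-in for Jest's expect(actual).toBe(expected).
function assertEqual<T>(actual: T, expected: T, label: string): void {
  if (actual !== expected) {
    throw new Error(`${label}: expected ${String(expected)}, got ${String(actual)}`);
  }
}

interface UserRepository {
  findName(id: number): string;
}

function getDisplayName(repo: UserRepository, id: number): string {
  const name = repo.findName(id);
  return name.trim().toUpperCase();
}

// Hand-written stub in place of jest.fn(); it records the ids it was asked for.
const requestedIds: number[] = [];
const stubRepo: UserRepository = {
  findName(id: number): string {
    requestedIds.push(id);
    return '  ada  ';
  },
};

const displayName = getDisplayName(stubRepo, 1);

// The assertion that matters: what the caller gets back.
assertEqual(displayName, 'ADA', 'display name');
// Secondary check: the repository was queried once, with the right id.
assertEqual(requestedIds.length, 1, 'number of lookups');
assertEqual(requestedIds[0], 1, 'looked-up id');
```

The stub's return value is still hardcoded, but it is now an input to the check, not the thing being checked. Breaking the trim or the uppercase step makes the first assertion throw.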

Shape 3: the snapshot test nobody reviews

The test renders a component, serializes it to a snapshot file, and asserts that the output matches the snapshot. On the next change, the developer runs jest --updateSnapshot and re-blesses the new output. The test has no opinion about whether the new snapshot is correct.

// Shape 3: The rubber-stamp snapshot.
//
// A snapshot test asserts "the output matches what it was last time." When the
// component changes, the convention is to rerun with `--updateSnapshot`, which
// silently re-blesses the new output. The test therefore approves any change
// and can never catch a regression on its own.

import { render } from '@testing-library/react';
import * as React from 'react';

function PriceTag({ amount }: { amount: number }): React.ReactElement {
  // A real bug could live here: wrong currency, wrong rounding, wrong label.
  return <span className="price">${amount.toFixed(2)}</span>;
}

describe('PriceTag (Shape 3: rubber-stamp snapshot)', () => {
  it('useless: blesses whatever the component currently renders', () => {
    const { container } = render(<PriceTag amount={19.5} />);

    // Whatever comes out becomes the source of truth on first run.
    expect(container.firstChild).toMatchSnapshot();

    // HOW TO SEE IT FAIL TO FAIL:
    // Change the render output (e.g. replace `$` with `€`, or drop `.toFixed(2)`),
    // then rerun with:  npx jest src/shape3-snapshot.test.tsx --updateSnapshot
    // The snapshot is rewritten to the broken output and the test is green.
  });
});

Why it passes code review: snapshot testing sounds principled. Reviewers see that a diff was committed and assume someone looked at it. In practice, teams re-bless snapshots as reflexively as they clear lint warnings. The snapshot captures a structure; it does not encode what that structure is supposed to mean. "It asserts something, but not what it should be asserting" is the exact failure mode here.
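A targeted assertion can be sketched without a renderer by pulling the formatting rule into a pure function. The `formatPrice` helper below is hypothetical (the original component formats inline), and `assertEqual` is a stand-in for Jest's `expect`. In a React Testing Library test, the analogous check would assert on the rendered text, for example `expect(container.textContent).toBe('$19.50')`, instead of matching a snapshot.

```typescript
// Hypothetical stand-in for Jest's expect(actual).toBe(expected).
function assertEqual<T>(actual: T, expected: T, label: string): void {
  if (actual !== expected) {
    throw new Error(`${label}: expected ${String(expected)}, got ${String(actual)}`);
  }
}

// Hypothetical helper: the formatting rule PriceTag applies inline.
function formatPrice(amount: number): string {
  return `$${amount.toFixed(2)}`;
}

// Each expectation is a literal that states the contract: a dollar sign
// followed by exactly two decimal places.
assertEqual(formatPrice(19.5), '$19.50', 'pads to two decimals');
assertEqual(formatPrice(3), '$3.00', 'whole amounts keep two decimals');
assertEqual(formatPrice(0), '$0.00', 'zero');
```

Unlike a snapshot, these literals cannot be re-blessed by a flag: switching the currency symbol or dropping `.toFixed(2)` requires someone to edit the expected strings on purpose.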

Shape 4: the test that cannot go red

This shape has several variants. A try/catch block swallows the assertion error and lets the test pass. An assertion is written after an early return so it never executes. An expect is placed inside a callback that the test runner never calls. Or the test asserts something unconditionally true.

// Shape 4: The test that cannot go red.
//
// Three classic ways an assertion never gets a chance to fail:
//   (a) try/catch swallows the assertion error,
//   (b) an assertion placed after an early return,
//   (c) an `expect` inside a callback that is never invoked.
//
// All three pass no matter how broken `divide` is.

function divide(a: number, b: number): number {
  // Deliberately incomplete for the demo: it should throw on divide-by-zero
  // but does not. Break it further however you like; nothing turns red.
  return a / b;
}

describe('divide (Shape 4: unreachable assertions)', () => {
  it('(a) try/catch swallows the assertion error', () => {
    try {
      expect(divide(10, 2)).toBe(999); // wrong on purpose
    } catch {
      // The AssertionError is caught and discarded. Test passes.
    }
  });

  it('(b) assertion after an early return is never reached', () => {
    const result = divide(10, 2);
    if (result) {
      return; // bails out before the assertion
    }
    expect(result).toBe(999); // unreachable; wrong value never checked
  });

  it('(c) expect inside a callback that is never invoked', () => {
    const onError = () => {
      expect(true).toBe(false); // would fail, but...
    };
    // divide() never calls onError, so the assertion never runs.
    divide(10, 2);
  });

  // HOW TO SEE IT FAIL TO FAIL:
  // Break divide() (e.g. `return a * b;`). All three tests still pass.
});

Why it passes code review: the code path looks plausible. The try/catch was added to "handle errors gracefully." The callback with the assertion looks like async coverage. The assertion after the return looks like thoroughness. In none of these cases did the reviewer check whether the assertion can actually fire.
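A sketch of assertions that stay reachable. It assumes `divide` is meant to throw on a zero divisor, as the comment in the original suggests; the `RangeError` type and the `assertEqual` helper are illustrative choices, not part of the original code. In Jest, the throw check would typically be written `expect(() => divide(1, 0)).toThrow()`, and `expect.assertions(n)` can guard against assertions inside callbacks that never run.

```typescript
// Hypothetical stand-in for Jest's expect(actual).toBe(expected).
function assertEqual<T>(actual: T, expected: T, label: string): void {
  if (actual !== expected) {
    throw new Error(`${label}: expected ${String(expected)}, got ${String(actual)}`);
  }
}

function divide(a: number, b: number): number {
  if (b === 0) {
    throw new RangeError('division by zero');
  }
  return a / b;
}

// No try/catch around the assertion: a wrong result propagates as a failure.
function testQuotient(): void {
  assertEqual(divide(10, 2), 5, 'divide(10, 2)');
}

// No early return: the assertion is the last statement and always runs.
function testNegativeQuotient(): void {
  const result = divide(-9, 3);
  assertEqual(result, -3, 'divide(-9, 3)');
}

// When a try/catch is needed to observe a throw, record what happened and
// assert on it outside the catch block, so "did not throw" is a failure too.
function testDivideByZeroThrows(): void {
  let thrown: unknown;
  try {
    divide(1, 0);
  } catch (error) {
    thrown = error;
  }
  assertEqual(thrown instanceof RangeError, true, 'divide(1, 0) throws RangeError');
}

testQuotient();
testNegativeQuotient();
testDivideByZeroThrows();
```

The design rule is that every assertion sits on the straight-line path of the test, and any catch block only records state for a later assertion.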

Shape 5: the tautological test

This is the most seductive shape and the hardest to catch. The test calls the function under test to compute the expected value, then calls it again to get the actual value, and asserts they are equal. Of course they are equal: they are the same computation. If the function is wrong, both sides of the assertion are wrong in the same way, and the test still passes.

// Shape 5: The tautological test in its purest form.
//
// The "expected" value is computed by calling the very function under test.
// `actual === expected` is then trivially true for ALL inputs, because both
// sides run the same (possibly broken) code. The assertion is a tautology.

function calculateDiscount(price: number, percentOff: number): number {
  // Whatever this does (right or wrong), the test below will agree with it.
  return price - (price * percentOff) / 100;
}

describe('calculateDiscount (Shape 5: tautological assertion)', () => {
  it('useless: expected value is produced by the function under test', () => {
    const price = 200;
    const percentOff = 25;

    const expected = calculateDiscount(price, percentOff); // same code path
    const actual = calculateDiscount(price, percentOff);

    expect(actual).toBe(expected); // always true; tests nothing

    // HOW TO SEE IT FAIL TO FAIL:
    // Introduce a bug (e.g. `return price + (price * percentOff) / 100;`).
    // Both `expected` and `actual` become equally wrong, so they still match
    // and the test stays green.
  });
});

Why it passes code review: it looks rigorous. There is a function call, an expected value, an actual value, and an assertion. The reviewer rarely asks "where did expected come from?" The tautological test is the useless unit test in its purest form because it cannot detect any regression: the implementation and the expected value move together. For the full before/after treatment and the mutation-score argument, see our companion piece on AI-generated tests that pass but do not assert.
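A minimal sketch of the non-tautological version. The expected values are worked out by hand from the stated intent (25% off 200 is 150), and the test takes the implementation as a parameter so the same check can be pointed at a deliberately broken variant. `assertEqual`, `DiscountFn`, and `discountTest` are hypothetical names for this sketch.

```typescript
// Hypothetical stand-in for Jest's expect(actual).toBe(expected).
function assertEqual<T>(actual: T, expected: T, label: string): void {
  if (actual !== expected) {
    throw new Error(`${label}: expected ${String(expected)}, got ${String(actual)}`);
  }
}

type DiscountFn = (price: number, percentOff: number) => number;

const calculateDiscount: DiscountFn = (price, percentOff) =>
  price - (price * percentOff) / 100;

function discountTest(impl: DiscountFn): void {
  // Literals derived from the spec, never from calling the implementation.
  assertEqual(impl(200, 25), 150, '25% off 200');
  assertEqual(impl(80, 0), 80, '0% off leaves the price unchanged');
  assertEqual(impl(50, 100), 0, '100% off is free');
}

discountTest(calculateDiscount);
```

With the bug suggested above (adding the discount instead of subtracting it), `impl(200, 25)` yields 250 and the first assertion throws.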

How Autonoma avoids the useless-test trap

The core problem documented above is that unit tests written against the implementation can inherit its blind spots. "Our QA engineers are still finding things" is the sentence we hear from teams that have strong unit-test coverage but have never independently confirmed that their application's observable behavior is correct. Green tests do not mean the product works. They mean the tests passed.

Autonoma is built around a different premise. Our Planner agent reads your actual codebase: routes, components, business logic, and user flows. It plans test scenarios derived from what your application is supposed to do, not from what the existing tests already say. The Automator agent then executes those scenarios against your running application inside a managed preview environment, one that mirrors production infrastructure rather than a localhost stub. The Maintainer agent keeps those tests passing as your code evolves. Existing unit tests are not the oracle for what to verify. That independence is the point: AI verification is only trustworthy when it is independent of the thing being verified. A green suite that shares its source with the implementation shows consistency, not correctness.

Figure: A useless unit test closes the loop on itself: the expected value comes from the same code under test, so it can only ever agree. Independent verification derives the expectation from intended behavior and checks it against the running app, so it can actually go red.

Autonoma is not a unit-test runner, not a linter, not an AI code reviewer, and not a mutation-testing tool. Those are all useful. We are the behavioral E2E layer that tells you whether the flows your users actually run work end-to-end, in a real environment, with no hand-holding from the team that wrote the code. That is a complement to unit testing, not a replacement for it. A tautological test is a unit-level problem; Autonoma's answer is independent behavioral verification at the E2E level.

Why AI generation produces them at scale

The five shapes above predate AI code generation, but AI makes them dramatically worse. When a Cursor or Copilot agent writes tests from context that is close to the implementation, it naturally gravitates toward the tautological pattern: the easiest test to generate for a function is one that calls that function. For the full mechanism, see our articles on AI-generated tests that pass but do not assert and why AI test generation produces green tests that let bugs through. The short version: AI generation does not know what a function is supposed to return; it knows what the function does return, and it writes the assertion accordingly. That context-sharing is the exact gap we designed our agents around: a generation tool inherits the implementation's blind spots because it reads the implementation, while Autonoma treats the application flow as the oracle instead of the unit tests that were supposed to protect that flow.

The one-line fix for each

Each shape has a fix, and none of them require a new tool.

Figure: Each shape maps to a one-line fix: assert observable output, derived from the spec, on a path that can actually fail.

Shape 1 (internal state): delete the assertion on the private field; assert the function's return value or its effect on the public interface instead.

Shape 2 (mock self-assertion): assert the output of the function that calls the mock, not the mock's return value.

Shape 3 (rubber-stamp snapshot): replace the snapshot with a targeted assertion on the specific value that defines the component's contract (a label, a count, a disabled state).

Shape 4 (can't-go-red): delete the try/catch around the assertion, move the assertion before the return, or call the callback explicitly; then confirm the test fails when you break the code.

Shape 5 (tautological): replace the computed expected value with a hardcoded literal; if you cannot state the expected output without calling the implementation, you do not yet have a specification.

For the full ruleset on writing assertions that actually falsify behavior, see our guide on how to write good test assertions. You can also apply these fixes retroactively by checking which tests still pass after you deliberately break the function under test. If a test stays green when its subject is broken, it is useless, and the green-but-broken signal is the fastest way to find them.
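The break-it-and-rerun check can be sketched as a tiny harness. Everything here is illustrative: tests are modeled as functions that throw on failure, and the "mutant" is a hand-written broken copy of calculateDiscount, standing in for what a mutation-testing tool would generate automatically.

```typescript
type DiscountFn = (price: number, percentOff: number) => number;
// A test is modeled as a function that throws when an assertion fails.
type DiscountTest = (impl: DiscountFn) => void;

const original: DiscountFn = (price, percentOff) =>
  price - (price * percentOff) / 100;

// Hand-written mutant: the subtraction flipped to an addition.
const mutant: DiscountFn = (price, percentOff) =>
  price + (price * percentOff) / 100;

// Shape 5: the expected value comes from the implementation itself.
const tautologicalTest: DiscountTest = (impl) => {
  const expected = impl(200, 25);
  if (impl(200, 25) !== expected) throw new Error('mismatch');
};

// Fixed: the expected value is a literal taken from the spec.
const specBasedTest: DiscountTest = (impl) => {
  if (impl(200, 25) !== 150) throw new Error('expected 150');
};

function passes(test: DiscountTest, impl: DiscountFn): boolean {
  try {
    test(impl);
    return true;
  } catch {
    return false;
  }
}

// A test is useless with respect to this mutant if it is green on both the
// correct code and the broken code.
function survivesMutant(test: DiscountTest): boolean {
  return passes(test, original) && passes(test, mutant);
}

const tautologicalSurvives = survivesMutant(tautologicalTest); // true: useless
const specBasedSurvives = survivesMutant(specBasedTest); // false: it goes red
```

One mutant is a weak sample; a test that kills this one may still miss others. The sketch only shows the mechanics of the green-but-broken signal.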

The fixes above repair tests one at a time. The structural gap they all share, an assertion source that is not independent of the implementation, is what we built Autonoma for. If your team generates tests at scale, our recommendation is to add it as the behavioral E2E layer alongside the unit suite (a complement, not a replacement): agents that derive expectations from your codebase's intended flows and verify them against the running application, so a test that cannot fail is no longer the only thing standing between a bug and production.

FAQ

A unit test is useless when it cannot go red. Either it has no assertion, its assertion path can never be reached, or its expected value is derived from the same implementation it is supposed to test. Useless unit tests produce no information: they pass whether the code is correct or broken. 'It doesn't cover the business case' is how teams describe these tests after shipping a bug that the suite never flagged.

A tautological test asserts that a function's output equals its own output by calling the implementation twice: once to compute the expected value and once to get the actual value. Because both sides of the assertion come from the same code, the test will pass even if that code contains a regression. It is the useless unit test in its purest form because it is structurally incapable of detecting any bug in the function it appears to test.

Tests always pass for one of three reasons: the code is genuinely correct; the tests are not testing anything meaningful; or the assertions cannot reach a failure state. The second and third reasons are far more common than most teams realize, especially on codebases where tests were generated by AI tools or written quickly to hit a coverage target. If your tests pass before and after you deliberately break the code under test, the tests are useless.

The most direct method is manual mutation: deliberately change a return value or a conditional in the code under test and check whether any test fails. If none do, the tests for that module are useless. Automated mutation testing tools can do this at scale. A faster heuristic: look for tests that assert on mock return values they configured themselves, tests that end with a try/catch around the expect, and tests where the expected value is a function call rather than a literal.
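The faster heuristic can be roughed out as a line-by-line text scan. This is a crude sketch under loud assumptions: the regular expressions, the `Finding` shape, and `scanTestSource` are all hypothetical, the scan does not detect multi-line try/catch wrappers, and it misses a tautology whose expected value was first stored in a variable. It is a way to shortlist tests for the manual break-it check, not a replacement for it.

```typescript
interface Finding {
  line: number; // 1-based line number in the scanned source
  reason: string;
}

// Hypothetical patterns; none use the `g` flag, so RegExp.test is stateless.
const suspiciousPatterns: Array<{ reason: string; regex: RegExp }> = [
  { reason: 'asserts on a mock return value', regex: /\.mock\.results\b/ },
  {
    reason: 'expected value is a function call, not a literal',
    regex: /\.to(Be|Equal|StrictEqual)\(\s*[A-Za-z_$][\w$.]*\(/,
  },
  {
    reason: 'unconditionally true assertion',
    regex: /expect\(\s*true\s*\)\.toBe\(\s*true\s*\)/,
  },
];

function scanTestSource(source: string): Finding[] {
  const findings: Finding[] = [];
  source.split('\n').forEach((text, index) => {
    for (const { reason, regex } of suspiciousPatterns) {
      if (regex.test(text)) {
        findings.push({ line: index + 1, reason });
      }
    }
  });
  return findings;
}

// Three sample assertion lines: the first two are suspicious, the third is fine.
const sampleSource = [
  "expect(findName.mock.results[0].value).toBe('  ada  ');",
  'expect(actual).toBe(calculateDiscount(price, percentOff));',
  'expect(calculateDiscount(200, 25)).toBe(150);',
].join('\n');

const sampleFindings = scanTestSource(sampleSource);
// sampleFindings flags line 1 (mock return value) and line 2 (function-call
// expectation); line 3, which compares against a literal, is not flagged.
```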

Yes. AI code generation tools naturally produce useless unit tests at scale, because generating a test for a function is easiest when you call that function to derive the expected value, which produces the tautological pattern. They also produce mock-asserting tests (configuring a mock and then asserting it returned what you told it to return) and snapshot tests that get re-blessed on every generation cycle. 'Green but broken' suites have become a defining characteristic of AI-assisted codebases that have not added an independent behavioral verification layer.
