Most API tools: run twice, get different results. Dino: run twice, get the same results. This is not a marketing claim. It is an engineering constraint enforced across every module, every test, every release.

Why Determinism Matters

API quality tools that produce flaky results get disabled. The pattern is always the same:
  1. Tool reports 5 issues on Monday
  2. Same tool reports 3 different issues on Tuesday — nothing changed
  3. Team stops trusting the tool
  4. Team disables the tool in CI
  5. API quality degrades silently
Dino breaks this cycle. If dino scan finds 3 issues today, it finds the same 3 issues tomorrow — unless your API changed. When a new finding appears, it is genuinely new.

How We Enforce It

1. Injectable Seams

No module in Dino calls Date.now(), Math.random(), or setTimeout directly. Every source of non-determinism is injected through explicit interfaces:
  • Clock — all timestamps come from an injectable clock. Tests control time. Production uses the system clock. But the code never reaches for it directly.
  • RandomSource — any randomness flows through an injectable source. Tests use seeded random. Production uses crypto random.
  • Timer — timeouts and delays go through an injectable timer. Tests advance time without waiting.
A bare Date.now() in any module is a CI failure.
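The seam pattern above can be sketched as follows. The interface and class names (`Clock`, `RandomSource`, `FixedClock`, `SeededRandom`, `makeScanId`) are illustrative, not Dino's actual API; the point is that modules receive their sources of non-determinism instead of reaching for the platform.

```typescript
// Illustrative sketch of injectable seams; names are hypothetical, not Dino's API.

interface Clock {
  now(): number; // epoch milliseconds
}

interface RandomSource {
  next(): number; // in [0, 1)
}

// Production wiring touches the platform exactly once, at the edge.
const systemClock: Clock = { now: () => Date.now() };

// Tests control time directly instead of waiting for it.
class FixedClock implements Clock {
  constructor(private t: number) {}
  now() { return this.t; }
  advance(ms: number) { this.t += ms; }
}

// Tests use seeded randomness: same seed, same sequence, every run.
class SeededRandom implements RandomSource {
  constructor(private seed: number) {}
  next() {
    // xorshift32 PRNG: deterministic for a given seed
    this.seed ^= this.seed << 13;
    this.seed ^= this.seed >>> 17;
    this.seed ^= this.seed << 5;
    return (this.seed >>> 0) / 0x100000000;
  }
}

// A module never calls Date.now() or Math.random(); it takes its seams as input.
function makeScanId(clock: Clock, rand: RandomSource): string {
  return `${clock.now()}-${Math.floor(rand.next() * 1e6)}`;
}
```

Under fixed seams, `makeScanId(new FixedClock(t), new SeededRandom(s))` returns the same id on every run, which is what makes the "bare `Date.now()` is a CI failure" rule enforceable.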
2. Stable Fingerprints

Every finding has a fingerprint — a stable hash derived from the finding’s content, not from when or where it was found. Same API state, same fingerprints. This means:
  • Diff between scans shows real changes, not noise
  • CI can gate on “new findings only” without false positives
  • Historical tracking works because identifiers are stable
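A content-derived fingerprint can be sketched like this. The `Finding` shape and field names are hypothetical; the essential property is that volatile fields such as timestamps are excluded from the hash, so identical API state always hashes identically.

```typescript
import { createHash } from "node:crypto";

// Hypothetical finding shape; field names are illustrative, not Dino's schema.
interface Finding {
  rule: string;       // e.g. "nullable-argument-added"
  location: string;   // e.g. "Mutation.createUser.email"
  message: string;
  detectedAt: number; // timestamp: deliberately EXCLUDED from the hash
}

// Fingerprint = hash of the finding's content, never of when it was found.
function fingerprint(f: Finding): string {
  // A fixed field order keeps the hash stable across serializers and runs.
  const canonical = JSON.stringify([f.rule, f.location, f.message]);
  return createHash("sha256").update(canonical).digest("hex").slice(0, 16);
}
```

Because the hash ignores `detectedAt`, a finding rediscovered tomorrow carries yesterday's fingerprint, which is what lets CI gate on "new findings only".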
3. Snapshot Pinning

Schema snapshots are immutable once captured. Diffs, lints, and changelogs compare against pinned baselines — not live re-fetches. This eliminates a race condition: your API changing between scan start and comparison.
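A minimal sketch of the pinning idea, under the assumption that a snapshot is a frozen value captured once. The `Snapshot` type and the line-based `diff` are illustrative; real schema diffs are structural, but the invariant is the same: both sides of a comparison are frozen inputs, never live fetches.

```typescript
// Illustrative sketch; types and the naive line diff are not Dino's internals.
interface Snapshot {
  readonly capturedAt: number;
  readonly schema: string;
}

// Capture once, freeze forever: later code cannot mutate the baseline.
function pin(schema: string, capturedAt: number): Snapshot {
  return Object.freeze({ capturedAt, schema });
}

// Diff compares two pinned snapshots; nothing here re-fetches the API.
function diffAdded(baseline: Snapshot, current: Snapshot): string[] {
  const before = new Set(baseline.schema.split("\n"));
  return current.schema.split("\n").filter((line) => !before.has(line));
}
```

Because both arguments are immutable values, running the diff twice against the same pair is guaranteed to yield the same result.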

AI Reasoning Is Additive

All findings, fingerprints, health scores, and diffs are computed before AI reasoning touches anything. The AI layer receives findings as input — it does not produce them.
AI descriptions are additive metadata. They make findings easier to understand. They don’t change what was found, how it was scored, or whether it passes a gate.
Disable AI reasoning entirely, or let the AI provider go down mid-run — either way, Dino produces the same findings with the same fingerprints. You lose natural-language explanations. You keep everything else.
The @dino/reasoning package is optional. The core pipeline — discovery, agent execution, catalog, report — has zero AI dependencies.
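The additive contract can be sketched as an enrichment step that decorates findings but cannot create, drop, or rescore them. The names (`enrich`, `Describe`, the `Finding` shape) are hypothetical, not the `@dino/reasoning` API; they just show the invariant: with AI disabled or failing, findings pass through untouched.

```typescript
// Illustrative sketch of additive AI enrichment; names are not Dino's API.
interface Finding {
  fingerprint: string;
  rule: string;
  score: number;
}

interface EnrichedFinding extends Finding {
  description?: string; // the only field AI may add
}

type Describe = (f: Finding) => Promise<string>;

async function enrich(
  findings: Finding[],
  describe?: Describe,
): Promise<EnrichedFinding[]> {
  return Promise.all(
    findings.map(async (f) => {
      if (!describe) return f; // AI disabled: findings pass through unchanged
      try {
        return { ...f, description: await describe(f) };
      } catch {
        return f; // provider down: same findings, same fingerprints
      }
    }),
  );
}
```

Note that `enrich` returns exactly one output per input and never touches `fingerprint` or `score`: the deterministic pipeline's results are fixed before this function runs.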

What We Test For

3,000+ tests

Over 3,000 tests across 420+ suites. Unit, contract, integration, invariant, and fault injection.

False output is P0

A wrong-but-clean result — Dino reports “all clear” when issues exist — is our highest-priority bug class. Treated as a production incident.

Test isolation

Every suite runs with randomize: true. Test order is shuffled on every run. Order-dependent tests fail in CI.
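The doc doesn't name the test runner, but as one concrete example of this policy, Jest (29+) exposes a `randomize` config option that shuffles test order per run. The config below is a hedged sketch of what such a setting looks like, not Dino's actual configuration file.

```typescript
// Hypothetical jest.config.ts sketch illustrating the shuffled-order policy.
const config = {
  // Shuffle the order of tests within each file on every run, so
  // order-dependent tests fail instead of passing by accident.
  randomize: true,
};

export default config;
```

Jest prints the seed it used, so an order-dependent failure can be reproduced by re-running with the same seed.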

Boundary contracts

Critical modules have boundary-contract tests verifying behavior at trust boundaries — where input enters, where output leaves.
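A boundary-contract test can be sketched like this. `parseSchemaInput` is a hypothetical trust-boundary function, not part of Dino; the contract being tested is that malformed input is rejected at the boundary rather than passed downstream.

```typescript
// Hypothetical boundary function: raw, untrusted input enters here.
function parseSchemaInput(
  raw: unknown,
): { ok: true; schema: string } | { ok: false } {
  if (typeof raw !== "string" || raw.trim().length === 0) {
    return { ok: false };
  }
  return { ok: true, schema: raw };
}

// The boundary contract: every malformed input is rejected, never forwarded.
const badInputs: unknown[] = [null, undefined, 42, "", "   ", {}];
for (const input of badInputs) {
  if (parseSchemaInput(input).ok) {
    throw new Error(`boundary accepted bad input: ${String(input)}`);
  }
}
```

The same style of test runs at output boundaries too: assert that everything leaving the module matches its declared shape, regardless of internal state.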

The False Output Problem

False output is the #1 defect class in API quality tools:
  • Tool says your API handles nulls correctly. It does not.
  • Tool says rate limiting is enforced. It is not.
  • Tool says no breaking changes. There are three.
The result is worse than having no tool at all — you have false confidence. We treat every false output as a P0 bug: regression test, root cause fix, recurrence verification.
We don’t claim zero false outputs. We claim false output is our most aggressively hunted bug class. When you find one, file an issue — we treat it as urgent.

What This Means in Practice

Run dino scan today. Note the findings and fingerprints. Run it tomorrow without changing your API. You get:
  • Same number of findings
  • Same fingerprints
  • Same health scores
  • Same report structure
Change one field — add a nullable argument, deprecate a mutation, remove a type. Run again:
  • Same findings for everything that didn’t change
  • New findings only for the parts that changed
  • A clean diff showing exactly what’s different
Same input, same output. Different input, precisely different output.