Why v2: −87% LoC, 11 ops, 4 verdicts, 8 ADRs

May 4, 2026 · Leon Ting · 6 min read

Tap v2 deletes roughly 85,000 lines of legacy code, ships 12,730 lines of new engine, and collapses the op vocabulary from 24 to 11. The hero positioning — browser automation that runs in your browser, not someone else's cloud — is unchanged. The schema, runtime, and CLI behind it are not.

Schema breaks are expensive, and we don't take them lightly. This one paid for itself before it shipped. Here is why we did it, what changed, and what we are willing to commit to in writing.

The headline number

The v2 branch lands at −87% lines of code. That is not a refactor. The old engine carried three years of layered patches: a 24-op closure that grew by accretion, a W3C Annotation envelope that wrapped a `body` that wrapped an `ops` array, two parallel verdict enums (one for receipts, one for outcomes), and three field-level discriminators (intent: read|write, legacy: true, allowUnverifiable: true) doing the work of one type-level discriminated union.

v2 is the same product with the accumulated tax removed. The schema lives in one file (core/types.ts, 367 lines). Five storage entities, eleven ops, four verdicts. Everything else — capabilities, MCP tool projection, lint errors, heal patches — is a derived projection over those entities, not a parallel definition.

What we are willing to commit to (P1–P4)

The parent ADR locks four product guarantees. They are not marketing copy; they are static checks in CI. If any of them fails, the build fails.

P1 — The op closure is closed

The runtime constant OP_NAMES_V2 has exactly 11 entries. An architecture test asserts the length. Adding an op is a schema change with an ADR, not a stealth addition.

export const OP_NAMES_V2 = [
  "fetch", "nav", "wait", "input", "extract", "cookies", "tap",
  "if", "foreach", "parallel",
  "eval",
] as const;

Seven substrate ops cross the runtime RPC boundary. Three control flow ops orchestrate them. One typed-eval escape hatch covers the long tail: it is value-only, its output is schema-validated, and lint rejects any side effects.
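To make the "architecture test" idea concrete, here is a minimal sketch of what such a check could look like. The three group constants and the `checkOpClosure` helper are hypothetical names invented for this example; the grouping itself is inferred from the 7 / 3 / 1 split described above and from the line layout of the array, not copied from the Tap source.

```typescript
// Op list as printed in the post.
const OP_NAMES_V2 = [
  "fetch", "nav", "wait", "input", "extract", "cookies", "tap",
  "if", "foreach", "parallel",
  "eval",
] as const;

type OpName = (typeof OP_NAMES_V2)[number];

// Hypothetical grouping (assumption): substrate, control flow, escape hatch.
const SUBSTRATE_OPS: readonly OpName[] = [
  "fetch", "nav", "wait", "input", "extract", "cookies", "tap",
];
const CONTROL_OPS: readonly OpName[] = ["if", "foreach", "parallel"];
const ESCAPE_OPS: readonly OpName[] = ["eval"];

// Hypothetical check: returns a list of violations, empty when the closure is intact.
function checkOpClosure(names: readonly string[]): string[] {
  const problems: string[] = [];
  if (names.length !== 11) {
    problems.push(`expected 11 ops, found ${names.length}`);
  }
  if (new Set(names).size !== names.length) {
    problems.push("duplicate op name");
  }
  const grouped = new Set<string>([...SUBSTRATE_OPS, ...CONTROL_OPS, ...ESCAPE_OPS]);
  for (const name of names) {
    if (!grouped.has(name)) problems.push(`op "${name}" has no group`);
  }
  return problems;
}
```

Under these assumptions, a stealth twelfth op trips two checks at once: the length assertion and the "every op belongs to a group" assertion.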

P2 — Read and write are unrepresentable as the wrong shape

v1 had a string field intent: "read" | "write". A read tap with an act array was a runtime contradiction caught only on execution. v2 makes it a TypeScript discriminated union: the read variant has act?: never and key?: never; the write variant requires both. The type checker rejects invalid combinations at compile time. The lint catches them in plan JSON.
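A rough sketch of the shape this describes, for readers who have not used `never` this way before. Only the `act?: never` / `key?: never` trick and "write requires both" come from the post; the `kind` discriminant, the `Op` type, the field types, and the `lintTap` helper are assumptions made up for illustration.

```typescript
// Hypothetical op shape (assumption).
type Op = { op: string };

// Read variant: `act` and `key` may not be present at all.
type ReadTap = {
  kind: "read"; // hypothetical discriminant name
  act?: never;
  key?: never;
};

// Write variant: both are required.
type WriteTap = {
  kind: "write";
  act: Op[];
  key: string;
};

type Tap = ReadTap | WriteTap;

const writeTap: Tap = { kind: "write", act: [{ op: "input" }], key: "order-123" };
// const badTap: Tap = { kind: "read", act: [] }; // rejected by the type checker

// Hypothetical lint for plan JSON, which arrives untyped at runtime.
function lintTap(plan: unknown): string[] {
  if (typeof plan !== "object" || plan === null) return ["plan must be an object"];
  const p = plan as Record<string, unknown>;
  const errors: string[] = [];
  if (p.kind === "read") {
    if ("act" in p) errors.push("read tap must not declare act");
    if ("key" in p) errors.push("read tap must not declare key");
  } else if (p.kind === "write") {
    if (!Array.isArray(p.act)) errors.push("write tap requires act");
    if (typeof p.key !== "string") errors.push("write tap requires key");
  } else {
    errors.push('kind must be "read" or "write"');
  }
  return errors;
}
```

The type checker covers taps authored in TypeScript; a lint along these lines covers the same rule for JSON that never passes through the compiler.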

P3 — Doctor verdicts are decided by per-tap CEL, not engine guess

v1 doctor compared substrate state with a hard-coded structural diff. False positives on cosmetic site changes were common. v2 ships a per-tap snapshot_equivalent CEL predicate: you declare what counts as "the same answer" for your tap. PoC measurement: 40% false-positive reduction on the first 20 community taps that adopted it.
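The difference is easier to see side by side. The sketch below is plain TypeScript, not CEL, and it is not how doctor is implemented; `Snapshot`, `structuralEqual`, `snapshotEquivalent`, and the sample fields are all hypothetical. The CEL expression in the comment is only a guess at what a per-tap predicate might look like.

```typescript
type Snapshot = Record<string, unknown>;

// Stand-in for the v1 behaviour described above: any difference counts as drift.
function structuralEqual(prev: Snapshot, next: Snapshot): boolean {
  return JSON.stringify(prev) === JSON.stringify(next);
}

// Hypothetical per-tap predicate: only the fields the tap author cares about.
// A CEL version might read: prev.price == next.price && prev.sku == next.sku
function snapshotEquivalent(prev: Snapshot, next: Snapshot): boolean {
  return prev.price === next.price && prev.sku === next.sku;
}

// A cosmetic change: the banner text moved on, the answer did not.
const before: Snapshot = { sku: "A-1", price: 1999, bannerText: "Spring sale" };
const after: Snapshot = { sku: "A-1", price: 1999, bannerText: "Summer sale" };

const structuralVerdict = structuralEqual(before, after); // false: flagged as drift
const perTapVerdict = snapshotEquivalent(before, after); // true: same answer
```

With a structural diff the banner change reads as breakage; with a declared predicate it does not, which is the class of false positive the paragraph above is about.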

P4 — Heal escalation is bounded

Heal walks three paths in fixed order: cache hit (0 LLM tokens) → minimal patch (~1.1K tokens) → full rewrite (~14K tokens). Each path is its own state in the K(Δ) accounting; you can read every heal's class from the trace. No stochastic retry loop, no unbounded token spend.
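One way to picture the bound: each path is tried at most once, in a fixed order, and the trace records which ones ran. This is a sketch under that assumption, not the engine's code. The class names, `HealTrace`, and the `attempt` callback are hypothetical; the token figures are the approximate ones quoted above.

```typescript
// Hypothetical labels for the three paths named in the post.
type HealClass = "cache_hit" | "minimal_patch" | "full_rewrite";

// Approximate per-path token costs, as quoted in the post.
const APPROX_TOKENS: Record<HealClass, number> = {
  cache_hit: 0,
  minimal_patch: 1100,
  full_rewrite: 14000,
};

// Fixed escalation order: cheapest first.
const HEAL_ORDER: readonly HealClass[] = ["cache_hit", "minimal_patch", "full_rewrite"];

type HealTrace = {
  tried: HealClass[];
  healedBy: HealClass | null;
  approxTokens: number;
};

// Hypothetical driver: `attempt` stands in for whatever each path really does
// and returns true when that path produced a working tap.
function heal(attempt: (path: HealClass) => boolean): HealTrace {
  const tried: HealClass[] = [];
  let approxTokens = 0;
  for (const path of HEAL_ORDER) {
    tried.push(path);
    approxTokens += APPROX_TOKENS[path];
    if (attempt(path)) return { tried, healedBy: path, approxTokens };
  }
  // All three paths failed: stop here, no retry loop.
  return { tried, healedBy: null, approxTokens };
}
```

In this sketch the loop has exactly three iterations and no re-entry, so the worst case is every path attempted once, roughly 15.1K tokens using the figures above.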

Why a hard break instead of v1.5

We considered the soft path: ship v2 alongside v1, let users migrate when convenient. We rejected it for the same reason the original ADR called the soft path "strictly worse": the engine would have had to maintain two parallel runtimes (24 ops and 11 ops, two verdict enums, two heal pipelines) for no engineering gain. Users would have stayed on v1 by default; v2 would have looked like an experimental fork.

Hard break, lockstep release, single migration boundary. `tap migrate scan` takes care of the auto-migratable plans (a W3C envelope wrapping a body that already happens to use only v2 ops). The rest get a re-forge or a hand-rewrite. We dogfooded the migration tool against the 65-tap community corpus before this post went live.
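The "auto-migratable" test can be sketched in a few lines. This is an illustration of the rule as stated, not the logic inside `tap migrate scan`. The `V1Plan` shape is inferred from the envelope description earlier in the post, and `isAutoMigratable` and the `click` op are hypothetical.

```typescript
// The eleven v2 op names, as printed in the post.
const V2_OPS = new Set<string>([
  "fetch", "nav", "wait", "input", "extract", "cookies", "tap",
  "if", "foreach", "parallel",
  "eval",
]);

// Hypothetical v1 plan shape (assumption): an envelope wrapping a body
// that wraps an ops array. Real plans carry more fields than this.
type V1Plan = {
  body: { ops: { op: string }[] };
};

// A plan qualifies when every top-level op is already in the v2 vocabulary.
// Ops nested inside control flow are ignored here to keep the sketch short.
function isAutoMigratable(plan: V1Plan): boolean {
  return plan.body.ops.every((step) => V2_OPS.has(step.op));
}

const easyPlan: V1Plan = { body: { ops: [{ op: "nav" }, { op: "extract" }] } };
// "click" is a made-up example of an op outside the v2 closure.
const hardPlan: V1Plan = { body: { ops: [{ op: "nav" }, { op: "click" }] } };
```

Under this reading, `easyPlan` only needs its envelope unwrapped, while `hardPlan` falls into the re-forge or hand-rewrite bucket.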

The 8 follow-up ADRs

The parent v2 schema ADR is one document. Eight more sit beneath it, each gating a downstream decision. They are public for the same reason this post is public: when you ask "why does Tap v2 do X," the answer is one click deep, not a Slack DM.

ADR	What it locks
Ecosystem v2 launch	The lockstep release across 5 surfaces (this site, tap-skills, plugins, npm, brew)
Plan versioning	Sequential integer schema version + semver-major package bump
Distribution model	Per-author namespace, trust tiering, curated subtree CI gate
Plugin runtime model	Plugins as MCP sub-servers, not in-process imports
Forge AI lifecycle	Forge as inspect+draft; deterministic templates first, AI long-tail
Error handling philosophy	Two-arm `RuntimeResult`; no silent capability degradation
Auth + multi-user	Local-first license; no cloud user database
Cross-machine intent coord	Server-free dedup over `key` + intent state machine

Read them at /adrs/. Each is 200–600 lines, follows the same Context → Decision → Application → Risks shape, and was written before the corresponding code shipped — not retconned afterward.

What you do today

If you have v0.x packages pinned, nothing breaks. v0.x stays installable forever; we never npm unpublish. The npm deprecate notice points at the migration guide. Upgrade on your schedule.

If you are starting fresh, install v1.0:

$ npm install @taprun/spec@^1 @taprun/from-playwright@^1
$ npx create-tap-script@latest my-tap

If you have local taps to migrate, the path is one CLI call:

$ tap migrate scan      # inventory the v1 corpus

Step-by-step instructions are in the Migration guide. The from-stagehand adapter is permanently deprecated: Stagehand requires Browserbase to run, which contradicts the local-first stance the rest of v2 is built around. Use from-playwright instead; the underlying Playwright control flow is what Stagehand wraps anyway.

Closing

Eighty-five thousand lines is a lot of code to delete. None of it was wrong when it was written; it was right for the version of the product that existed at the time. The discipline behind v2 is not "delete more"; it is "let the schema dictate the engine, then write down why the schema is the way it is, then make the type system fail loudly when the engine drifts away."

Browser automation that runs in your browser, not someone else's cloud. Same line as last year. The plumbing behind it is now small enough that one person can read every load-bearing decision in an afternoon — and they should be able to, because every one of those decisions is an ADR away.

Tap v2 — install or upgrade

v1.0 packages are live. The migration is one CLI call for auto-migratable plans, and the v0.x lockfile path stays open if you'd rather wait. Read the design records at /adrs/.

$ curl -fsSL https://taprun.dev/install.sh | sh
