Six Years of Hacker News, One Rename

April 24, 2026 · Leon Ting · 5 min read

The last post (Compile Once. Run Forever. Diff the Drift.) quoted an 849× per-drift advantage for the Tap router over naive LLM re-scrape at 100 queries per drift. The number that matters more is the one we didn't measure: how often drift actually happens.

Turns out the real answer — at least for the sites we tested — is much less often than you'd think.

Six years of HN in one table

WebArchive has HTML snapshots of Hacker News from basically forever. We fetched four of them — 2018, 2020, 2022, 2024 — and diffed the CSS classes that the Tap scraper depends on.

Selector	2018	2020	2022	2024
`.athing` (row container)	✓	✓	✓	✓
`.score` (points)	✓	✓	✓	✓
`.hnuser` (author)	✓	✓	✓	✓
`.age`	✓	✓	✓	✓
`.subtext`	✓	✓	✓	✓
`.storylink` (title anchor)	✓	✓	—	—
`.titlelink`	—	—	✓	—
`.titleline`	—	—	—	✓

In six years, five of the six tap-critical selectors didn't move. The sixth, the title anchor, was renamed twice: `.storylink` became `.titlelink`, then `.titleline`. That's it. Everything else still works.
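For anyone who wants to redo the table, here is a minimal sketch of the presence check. It is not the script we ran: `hasClass` and `presenceMatrix` are hypothetical helpers, the two HTML fragments are tiny stand-ins rather than real archived pages, and the regex only handles quoted `class` attributes.

```javascript
// Hypothetical helper: true if any quoted class attribute in the HTML
// contains `className` as a whole token. Unquoted attributes are ignored.
function hasClass(html, className) {
  const classAttr = /class\s*=\s*(?:"([^"]*)"|'([^']*)')/g;
  for (const m of html.matchAll(classAttr)) {
    const tokens = (m[1] ?? m[2] ?? '').split(/\s+/);
    if (tokens.includes(className)) return true;
  }
  return false;
}

// Build a presence matrix: className -> { snapshotLabel: boolean }.
function presenceMatrix(snapshots, classNames) {
  const matrix = {};
  for (const name of classNames) {
    matrix[name] = {};
    for (const [label, html] of Object.entries(snapshots)) {
      matrix[name][label] = hasClass(html, name);
    }
  }
  return matrix;
}

// Stand-in fragments for illustration only, NOT real snapshot HTML.
const snapshots = {
  2018: '<tr class="athing"><td><a class="storylink" href="#">t</a></td></tr>',
  2024: "<tr class='athing'><td><span class='titleline'><a href='#'>t</a></span></td></tr>",
};

const matrix = presenceMatrix(snapshots, ['athing', 'storylink', 'titleline']);
console.log(matrix);
```

A token-level match matters here: a plain substring search for `storylink` would also fire on an unrelated class that merely contains it.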

What that drift actually costs

We took the 2018-era tap (using `.storylink`) and ran it against live HN today. V (the verifier) caught the drift immediately:

shape errors:
  - row 0 has empty required field "title"
  ...
field disagreements:
  - title: 30/30 rows disagree with authoritative
    (observed="" authoritative="GPT-5.5")

One round of heal through Sonnet cost 14,581 tokens, and the scraper was healthy again: 30/30 rows matched against Firebase. That is about 3% above the synthetic mutation numbers in the prior post (~14,100 ± 200 tokens), so the same ballpark. The synthetic benchmark wasn't lying.
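The two kinds of complaint in that output (shape errors and field disagreements) can be sketched roughly as below. This is an illustration, not V's actual implementation: `verifyRows` is a hypothetical name, and it assumes scraped rows and authoritative rows line up by index, which a real verifier may not assume.

```javascript
// Hypothetical sketch of a verifier pass. Assumes rows[i] corresponds to
// authoritative[i]; a real verifier might join on an item id instead.
function verifyRows(rows, authoritative, requiredFields) {
  const shapeErrors = [];
  const disagreements = {}; // field -> number of rows that disagree

  rows.forEach((row, i) => {
    const truth = authoritative[i];
    for (const field of requiredFields) {
      const value = row[field];
      // Shape check: a required field must be present and non-empty.
      if (value === undefined || value === null || value === '') {
        shapeErrors.push(`row ${i} has empty required field "${field}"`);
      }
      // Agreement check: compare against the authoritative value, if any.
      if (truth !== undefined && value !== truth[field]) {
        disagreements[field] = (disagreements[field] || 0) + 1;
      }
    }
  });

  const healthy =
    shapeErrors.length === 0 && Object.keys(disagreements).length === 0;
  return { shapeErrors, disagreements, healthy };
}

// Made-up rows for illustration only.
const report = verifyRows(
  [{ title: '' }, { title: 'Second story' }],
  [{ title: 'First story' }, { title: 'Second story' }],
  ['title']
);
console.log(report);
```

An empty title trips both checks at once, which is why the drifted run reports a shape error and a 30/30 disagreement for the same field.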

What was interesting was how Sonnet patched it. Not a rename. A fallback:

// before
const titleEl = r.querySelector('.storylink');

// after
const titleEl = r.querySelector('.titleline > a')
             || r.querySelector('.storylink');

Sonnet kept the legacy selector as a safety net: if the site is mid-migration or rolls back, the old selector still matches; otherwise the new one covers it. It's a more careful patch than what the same model produces for a pure synthetic rename, as if the model reads "this has happened before, it'll happen again."
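The same fallback idea generalizes to an ordered list of selectors. A minimal sketch, assuming only that `root` exposes a DOM-style `querySelector`; `queryFirst` is a hypothetical helper, not something Sonnet emitted, and `legacyRow` is a stub standing in for a real row element.

```javascript
// Hypothetical helper: try selectors in priority order, return the first hit
// together with the selector that produced it.
function queryFirst(root, selectors) {
  for (const selector of selectors) {
    const el = root.querySelector(selector);
    if (el) return { el, selector };
  }
  return { el: null, selector: null };
}

// Stub standing in for a DOM row that only has the legacy class.
const legacyRow = {
  querySelector: (sel) =>
    sel === '.storylink' ? { textContent: 'Example title' } : null,
};

const hit = queryFirst(legacyRow, ['.titleline > a', '.storylink']);
console.log(hit.selector); // ".storylink"
```

Returning the matched selector alongside the element makes it cheap to log which generation of markup the site is currently serving.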

What this means for amortization

The previous post used 100 queries per drift as the amortization window. That number came out of nowhere — it was a round number that made the table legible. The WebArchive data suggests it's way too conservative for stable sites.

If you run an HN scraper once a day and HN renames a tap-critical class once in six years, then N = 2,190 queries per drift. The Tap-router advantage at that rate:

Scenario	Queries / drift	Arm A cumulative	Arm C cumulative	Advantage
Blog claim	100	962,500	1,134	849×
Daily HN scrape, 6-year drift cycle	2,190	21,078,750	1,134	18,588×
Hourly scrape	52,560	505,890,000	1,134	446,111×
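The arithmetic behind the table is short enough to sketch. Two assumptions, both read off the table rather than re-measured: Arm A costs a flat 9,625 tokens per query (962,500 / 100 from the first row), and Arm C stays at 1,134 tokens per drift cycle regardless of query count. The `amortize` function is a hypothetical name.

```javascript
// Assumptions taken from the table above, not new measurements.
const ARM_A_TOKENS_PER_QUERY = 962500 / 100; // 9,625, implied by the first row
const ARM_C_TOKENS_PER_DRIFT = 1134; // constant in every row

// Hypothetical helper: token totals over one drift cycle.
function amortize(queriesPerDay, yearsPerDrift) {
  const queriesPerDrift = queriesPerDay * 365 * yearsPerDrift;
  const armA = queriesPerDrift * ARM_A_TOKENS_PER_QUERY;
  const armC = ARM_C_TOKENS_PER_DRIFT;
  return { queriesPerDrift, armA, armC, advantage: armA / armC };
}

const daily = amortize(1, 6); // one scrape a day, one drift in six years
const hourly = amortize(24, 6); // one scrape an hour, same drift cycle

console.log(daily.queriesPerDrift, daily.armA, Math.round(daily.advantage));
console.log(hourly.queriesPerDrift, hourly.armA, Math.round(hourly.advantage));
```

The daily ratio works out to 21,078,750 / 1,134 ≈ 18,587.96. Because the Arm C term is held constant, the advantage scales linearly with queries per drift, which is the whole point of the table.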

Hourly on HN for one drift cycle is about half a billion tokens with a naive agent versus a thousand and change with a compiled tap plus one heal. That's not an argument about compilers vs interpreters anymore. That's an argument about physics: one cost is bounded by a constant, the other grows without bound over time.

Objection: HN is unusually stable

Yes it is. HN hasn't been redesigned in two decades. Reddit is drift-ier (we've seen Reddit move from `.Post__title` to `[data-testid=post-title]` within a year). GitHub redesigns a surface every year or so. Modern SPA sites with CSS-in-JS generate classes fresh on every deploy.

But notice the shape of the argument: Tap's economics get worse as drift gets more frequent. At one drift per day per site, the amortization advantage compresses. At ten drifts per day, the two approaches converge.

So the question becomes: what's the actual distribution of drift rates across the web? We have an HN datapoint (∼1 rename / 6 years) and some soft evidence that Reddit/GitHub drift more like once a year. We don't have a systematic number yet. If someone wants to study that empirically, it would settle the debate faster than any amount of blog math.

One meta-observation

When people imagine AI browser agents, the failure mode they picture is "the site changes and the scraper breaks." That's real, but the picture implies drift is a frequent surprise. The data says drift is a rare, noisy surprise: it happens, but roughly once in six years on a stable site like HN, and maybe once a year on faster-moving ones like Reddit or GitHub. Between drift events, the scraper is running the same extraction over and over.

A naive agent pays for every repetition. A compiled tap pays for the rare exceptions. Over any time horizon longer than the drift interval, that's the whole ball game.

The 4-arm reproducibility kit (verifier calibration, mutation generator, per-arm runners) is open under Apache-2.0 in tap-skills/experiments/. The full methodology and amortization math are in the prior tokens-to-recover post.