Your Scraper Is Broken Right Now. You Just Don't Know It Yet.

April 4, 2026 · Leon Ting · 5 min read

Somewhere in your infrastructure, a scraper is returning empty arrays. Your dashboard shows stale numbers. A report your team relies on has been wrong since Tuesday.

You won't find out until someone complains.

The Silent Failure Problem

Most scrapers don't crash loudly. They fail quietly.

"Instead of throwing an error when a page structure changes, they return empty arrays... A scraper that fails silently poisons your data for days or weeks before anyone notices."
— BinaryBits

This happens because scrapers have no health contract: nothing in the code states what a valid result looks like. They extract whatever they find, and when the site changes, "whatever they find" is nothing. No error. No alert. Just empty data flowing downstream.
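To make the failure mode concrete, here is a minimal sketch of a selector-style extractor. The function name, the class names, and the HTML snippets are all made up for illustration, and a regex stands in for a real DOM query. After the redesign the same function still runs without error; it just matches nothing.

```javascript
// Hypothetical extractor: pulls link text out of raw HTML by class name.
// A regex stands in for a real DOM selector to keep the sketch self-contained.
function extractTitles(html) {
  const pattern = /<a class="story-title"[^>]*>([^<]*)<\/a>/g;
  return Array.from(html.matchAll(pattern), (match) => match[1]);
}

// Made-up markup before a redesign.
const beforeRedesign =
  '<a class="story-title" href="/1">First post</a>' +
  '<a class="story-title" href="/2">Second post</a>';

// Same content after a redesign that renamed the class.
const afterRedesign =
  '<a class="headline" href="/1">First post</a>' +
  '<a class="headline" href="/2">Second post</a>';

const titlesBefore = extractTitles(beforeRedesign);
const titlesAfter = extractTitles(afterRedesign);

console.log(titlesBefore.length); // two titles found
console.log(titlesAfter.length);  // zero titles, and no exception anywhere
```

Nothing in this code can tell "the site has no stories" apart from "my selector is stale", which is the gap a health contract is meant to close.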

The Maintenance Tax

When you do notice, you're back to fixing selectors. Again.

"Every time a website redesigns or updates their layout, I'm manually fixing selectors and rewriting parts of the workflow. It's eating up hours every month."
— ByteForge, Latenode Community

"Maintaining tests can take up to 50% of the time for QA test automation engineers."
— davert, Dev.to

The loop is always the same: build automation → site changes → selectors break → spend hours fixing → repeat. And the worst part: you're fixing the presentation layer (DOM/CSS), not the data itself.

The AI Agent Tax

AI browser agents (Browser Use, Stagehand, etc.) promise to solve this by re-interpreting the page every run. But they introduce two new problems:

1. Cost compounds.

"The program cost $1.05 to run. So doing it at any scale quickly becomes a little bit silly."
— rozap, Hacker News

2. Reliability degrades at each step.

"If each step has a .95 chance of completing successfully, after not very many steps you have a pretty small overall probability of success."
— rozap, Hacker News

95% per step sounds great. But if steps fail independently, a 10-step workflow succeeds only about 60% of the time (0.95^10 ≈ 0.60), and a 20-step workflow about 36% (0.95^20 ≈ 0.36). AI agents trade one problem (brittle selectors) for another (probabilistic failure).
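The arithmetic behind both problems fits in a few lines. This is a rough sketch that assumes every step succeeds independently with the same probability; the hourly schedule used for the cost figure is my own assumption, not something from the quoted thread.

```javascript
// Probability that every step of a workflow succeeds,
// assuming each step succeeds independently with the same probability.
function overallSuccess(perStepSuccess, steps) {
  return Math.pow(perStepSuccess, steps);
}

const tenStep = overallSuccess(0.95, 10);
const twentyStep = overallSuccess(0.95, 20);
console.log(`10 steps: ${(tenStep * 100).toFixed(1)}%`);
console.log(`20 steps: ${(twentyStep * 100).toFixed(1)}%`);

// Cost side: $1.05 per run is the figure quoted above.
// Running once an hour for 30 days is an assumed schedule.
const costPerRun = 1.05;
const runsPerMonth = 24 * 30;
const monthlyCost = costPerRun * runsPerMonth;
console.log(`Monthly cost at hourly runs: $${monthlyCost.toFixed(2)}`);
```

On those assumptions the success rates come out near 59.9% and 35.8%, and a single hourly job costs about $756 a month.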

A Different Approach: Health Contracts + Deterministic Programs

What if your automation had a contract that defined what "healthy" looks like?

// Built into every tap program
health: {
  min_rows: 5,          // must return at least 5 results
  non_empty: ["title"]   // "title" field must never be empty
}

Now instead of silently returning empty arrays, the system knows when something is wrong.
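How Tap evaluates a contract internally isn't shown here, so treat the following as a sketch of the idea rather than its implementation. The checkHealth function and the wording of the non_empty message are assumptions; only the min_rows and non_empty keys come from the contract above.

```javascript
// Hypothetical checker: returns a list of human-readable contract violations.
// An empty list means the result set is healthy.
function checkHealth(rows, health) {
  const failures = [];

  // min_rows: the result must contain at least this many rows.
  const minRows = health.min_rows ?? 0;
  if (rows.length < minRows) {
    failures.push(`min_rows: expected ≥${minRows}, got ${rows.length}`);
  }

  // non_empty: each listed field must be present and non-blank in every row.
  for (const field of health.non_empty ?? []) {
    const emptyCount = rows.filter((row) => {
      const value = row[field];
      return value === undefined || value === null || String(value).trim() === "";
    }).length;
    if (emptyCount > 0) {
      failures.push(`non_empty: "${field}" is empty in ${emptyCount} row(s)`);
    }
  }

  return failures;
}

const health = { min_rows: 5, non_empty: ["title"] };

// A scraper whose selectors went stale: zero rows.
const emptyResult = checkHealth([], health);

// Five rows, one of them with a blank title.
const rowsWithBlank = ["A", "B", "", "D", "E"].map((title) => ({ title }));
const blankResult = checkHealth(rowsWithBlank, health);

console.log(emptyResult);
console.log(blankResult);
```

The point of the design is that an empty array stops being a valid answer by default: the program has to meet its own definition of healthy before the data moves downstream.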

$ tap doctor
hackernews/hot    ✔ ok     30 rows  (245ms)
google/trends     ✘ fail   0 rows   min_rows: expected ≥5, got 0
github/trending   ✔ ok     25 rows  (1.2s)
reddit/hot        ✔ ok     25 rows  (890ms)
bbc/news          ✘ fail   3 rows   min_rows: expected ≥5, got 3

Two failures caught. Before your data went bad. Before anyone complained.

Watch: Real-Time Change Detection

Health checks catch breakage. But what about legitimate changes?

"I built Site Spy after missing a visa appointment slot because a government page changed and I didn't notice for two weeks."
— vkuprin, Hacker News

$ tap watch hackernews hot --every 10m
2026-04-04T10:00  +added   "Show HN: Tap"  score=342
2026-04-04T10:10  +added   "Rust 2.0 announced"  score=128
2026-04-04T10:10  -removed "Old post fell off"  score=12

tap watch runs your program on an interval, diffs each run against the previous one, and outputs only what changed. Pipe it to a file, a Slack webhook, or another program. It follows the Unix philosophy: no database, no service, just while + sleep + diff.
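The diff step is the interesting part, and it can be sketched in a few lines. This is an illustration under assumptions, not Tap's actual code: the diffRows function is hypothetical, and it treats the title field as the row identity.

```javascript
// Hypothetical diff between two runs, keyed on one field.
// Rows present only in `current` are "added"; rows present only in `previous` are "removed".
function diffRows(previous, current, key) {
  const previousKeys = new Set(previous.map((row) => row[key]));
  const currentKeys = new Set(current.map((row) => row[key]));
  return {
    added: current.filter((row) => !previousKeys.has(row[key])),
    removed: previous.filter((row) => !currentKeys.has(row[key])),
  };
}

// Two consecutive runs, using the rows from the sample output above.
const previousRun = [
  { title: "Old post fell off", score: 12 },
  { title: "Show HN: Tap", score: 342 },
];
const currentRun = [
  { title: "Show HN: Tap", score: 342 },
  { title: "Rust 2.0 announced", score: 128 },
];

const changes = diffRows(previousRun, currentRun, "title");
for (const row of changes.added) {
  console.log(`+added   "${row.title}"  score=${row.score}`);
}
for (const row of changes.removed) {
  console.log(`-removed "${row.title}"  score=${row.score}`);
}
```

Because only the previous run is needed to compute the next diff, there is no history to store, which is what keeps this workable without a database.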

The Self-Healing Loop

Put it all together:

# 1. Forge once — AI writes a deterministic program
$ tap forge "scrape Hacker News top stories"
✔ Saved: hackernews/hot.tap.js

# 2. Run forever at $0
$ tap hackernews hot
30 rows (245ms) Cost: $0.00

# 3. Watch for changes
$ tap watch hackernews hot --every 1h

# 4. Daily health check
$ tap doctor --schedule "0 6 * * *"

# 5. Auto-heal when something breaks
$ tap doctor --auto

The loop: forge → run → watch → doctor → heal → run. You sleep. Tap doesn't.

Why This Works

The key insight: AI should run at authoring time, not at runtime.

Forge uses AI once to write a deterministic .tap.js program
Run executes the program with zero AI: no model calls, so $0 in AI cost per execution, and 100% deterministic
Doctor detects breakage via health contracts — no AI needed
Heal re-invokes AI only when the site actually changes

99% of runs need zero AI. You only pay for intelligence when the world changes.
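Here is one way to picture that control flow. Everything in this sketch is hypothetical: heal() stands in for re-invoking AI, isHealthy() stands in for a health contract with min_rows: 5, and the programs are stubs that return canned rows instead of scraping anything.

```javascript
// Hypothetical sketch: none of these names come from Tap itself.
let aiCalls = 0;

// Stand-in for an AI-authored repair: returns a new deterministic program.
function heal() {
  aiCalls += 1;
  return () => Array.from({ length: 5 }, (_, i) => ({ title: `Story ${i + 1}` }));
}

// Stand-in for a health contract check (min_rows: 5).
function isHealthy(rows) {
  return rows.length >= 5;
}

// Run the program; call heal() only when the result is unhealthy.
function runWithHealing(program) {
  const rows = program();
  if (isHealthy(rows)) return { program, rows };

  const repaired = heal();
  const repairedRows = repaired();
  if (!isHealthy(repairedRows)) {
    throw new Error("still unhealthy after heal");
  }
  return { program: repaired, rows: repairedRows };
}

// First run: the selectors no longer match, so the program returns nothing.
const brokenProgram = () => [];
let state = runWithHealing(brokenProgram);

// The next 99 runs reuse the repaired program and never touch AI.
for (let i = 0; i < 99; i += 1) {
  state = runWithHealing(state.program);
}

console.log(`runs: 100, AI calls: ${aiCalls}`);
```

In this toy run the AI stand-in is called once across 100 executions; the real ratio depends entirely on how often the target site changes.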

Programs Beat Prompts — why AI should write code, not run it
Your Automation Costs $1 Per Run. Mine Costs $0. — the cost math
Websites Change. Your Automation Shouldn't Stop. — the self-healing loop

Try it now

# Install (macOS / Linux)
curl -fsSL https://taprun.dev/install.sh | sh

# Run your first tap
tap update && tap hackernews hot

# Check health of all your automations
tap doctor

Getting started guide · GitHub · 195+ community taps included