Integration · browser-use

Compile your browser-use AgentHistory into a deterministic replay

After your browser-use agent succeeds once, hand its AgentHistory to taprun.forge(). Tap compiles the navigation skeleton into a .tap.json plan that replays at zero LLM tokens — forever, until the page actually changes.

Why this exists

Every time your agent runs the same flow, it pays the LLM to re-derive the same steps from the same DOM. The model doesn't remember last week's click sequence; it re-decides each step from scratch. For a multi-step authenticated flow that has the same shape on every run, those are wasted tokens.

"I'm the scheduler, the context manager, and the output parser all at once."
— jrswab, on running production browser agents
Public ops cost: $187/mo for one ongoing browser-agent workflow.
— helen_mireille, tracking agent ops in public

The compile-once / run-forever pattern is older than LLM agents. Tap just makes it the default for browser flows.

What v1 compiles deterministically

Be precise about scope so you know exactly what you're paying for and what you're not.

Navigation skeleton — always 0 LLM tokens on replay

YES  go_to_url: open a URL
YES  click_element: click by selector
YES  input_text: fill a field
YES  scroll: scroll viewport
YES  wait: pause for a condition
YES  done: terminal step

Extraction — depends on the destination

TIER 0  Pages with RSS / Atom / JSON-LD / agents.json / OpenAPI: Tap's compiler emits a deterministic extraction. Replay is also 0 tokens.
NOT YET  extract_content from the AgentHistory itself: browser-use's transcript doesn't capture per-step DOM, so semantic extraction can't be replayed deterministically from the trajectory alone. Run tap forge <url> on the destination to compile the extraction half.

Honest framing: navigation is the easy half; it's always free. Extraction is free only at Tier 0 sites. For a page without a Tier 0 source, the extraction step still costs LLM tokens; only the navigation skeleton saves. Most multi-step flows are dominated by navigation cost, so the saving is real even without Tier 0 — but don't believe a "$0/run" headline that doesn't name the split.
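To make the split concrete, here is a minimal sketch that partitions a recorded step list into the navigation skeleton and everything else. The action names come from the lists above; the dict shape and the `split_trajectory` helper are hypothetical stand-ins for illustration, not browser-use's real AgentHistoryList schema and not Tap's compiler.

```python
# Action names from the "Navigation skeleton" list above.
# The step dicts below are a made-up shape, NOT the real AgentHistoryList schema.
NAVIGATION_ACTIONS = {
    "go_to_url", "click_element", "input_text", "scroll", "wait", "done",
}


def split_trajectory(steps):
    """Partition steps into (navigation, other), preserving order."""
    navigation, other = [], []
    for step in steps:
        bucket = navigation if step["action"] in NAVIGATION_ACTIONS else other
        bucket.append(step)
    return navigation, other


steps = [
    {"action": "go_to_url", "url": "https://example.com/login"},
    {"action": "input_text", "selector": "#email", "text": "me@example.com"},
    {"action": "click_element", "selector": "button[type=submit]"},
    {"action": "extract_content", "goal": "list open invoices"},
    {"action": "done"},
]

navigation, other = split_trajectory(steps)
print("replayable without the LLM:", [s["action"] for s in navigation])
print("needs a Tier 0 source or the LLM:", [s["action"] for s in other])
```

In this toy trajectory, four of the five steps fall into the navigation bucket, which is the half that replays for free.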

Token math, concretely

Numbers below are typical for a multi-step browser-use flow against a Tier 0 destination (e.g. a dashboard with a JSON API behind it). Your mileage varies; the directional shape doesn't.

Run           | Approach                  | LLM tokens   | Approx. cost
1st           | browser-use full LLM loop | ~14,000      | $0.042
1st (compile) | forge(trajectory=...)     | ~1,100       | $0.003
2nd → ∞       | browser-use re-runs       | ~14,000 each | $0.042 each
2nd → ∞       | tap run site/name         | 0            | $0.000

100 reruns: browser-use path ≈ $4.20; Tap path ≈ $0.003. The crossover happens at run 2. Token figures assume Claude Sonnet pricing as of 2026-04; substitute your provider's rate if different.
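The arithmetic behind those figures can be checked with a short sketch. The per-run costs are copied from the table above; the function names are only illustrative, and you would substitute your own provider's rates.

```python
# Per-run figures copied from the table above (USD).
FULL_LOOP = 0.042   # one browser-use LLM loop, ~14,000 tokens
COMPILE = 0.003     # one forge(trajectory=...) call, ~1,100 tokens
REPLAY = 0.0        # one tap run, 0 LLM tokens


def agent_path(total_runs: int) -> float:
    """Cumulative cost when every run goes through the full LLM loop."""
    return total_runs * FULL_LOOP


def tap_path(total_runs: int) -> float:
    """Cumulative cost with one LLM run, one compile, then replays."""
    if total_runs == 0:
        return 0.0
    return FULL_LOOP + COMPILE + (total_runs - 1) * REPLAY


def crossover(max_runs: int = 1000):
    """First run count at which the Tap path is strictly cheaper, or None."""
    for n in range(1, max_runs + 1):
        if tap_path(n) < agent_path(n):
            return n
    return None


reruns = 100
print(f"crossover at run {crossover()}")
print(f"{reruns} reruns, agent path: ${reruns * FULL_LOOP:.3f}")
print(f"{reruns} reruns, Tap path:   ${COMPILE + reruns * REPLAY:.3f}")
```

On run 1 the Tap path is slightly more expensive because it pays for the compile on top of the first agent run; from run 2 onward it is cheaper.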

Quickstart — Python

pip install taprun

Run your browser-use agent end-to-end at least once to get a working trajectory. Then:

import asyncio

from browser_use import Agent
from taprun import forge, run, doctor


async def main():
    # 1. Standard browser-use run: pays full LLM cost once.
    agent = Agent(task="apply to Y Combinator", llm=...)  # pass your LLM client here
    result = await agent.run()

    # 2. Compile the navigation skeleton.
    forge(
        trajectory=result.model_dump(),   # AgentHistoryList dict, JSON string, or path
        site="ycombinator",
        name="apply",
    )
    # writes ~/.tap/taps/ycombinator/apply.tap.json

    # 3. Replay forever at 0 LLM tokens for navigation.
    rows = run("ycombinator/apply")

    # 4. Optional: drift detection without re-running the agent.
    verdict = doctor("ycombinator/apply")  # 'ok' | 'broken' | 'stale'
    return rows, verdict


asyncio.run(main())

Quickstart — TypeScript

npm install taprun

import { forge, run, doctor } from "taprun";

// after your browser-use TS agent finishes:
await forge({
  trajectory: agent.history,   // AgentHistoryList object, JSON string, or path
  site: "ycombinator",
  name: "apply",
});

const rows = await run("ycombinator/apply");
const verdict = await doctor("ycombinator/apply");

The SDK is a thin subprocess wrapper around the tap CLI binary. Each call spawns a process (~30–100 ms warm). For tight loops, run tap mcp start and call via MCP — the runtime stays resident and there's no per-call spawn cost.
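To get a feel for what per-call spawn overhead means, the sketch below wraps a command in one subprocess per call and times it. It is only an illustration of the pattern: the stand-in command is a Python one-liner, not the tap binary, and `call_cli` is a hypothetical helper rather than the SDK's actual code.

```python
import subprocess
import sys
import time


def call_cli(argv):
    """Spawn one process per call and return its trimmed stdout."""
    completed = subprocess.run(argv, capture_output=True, text=True, check=True)
    return completed.stdout.strip()


# Stand-in for a real CLI binary: a Python one-liner that prints an empty JSON list.
STAND_IN = [sys.executable, "-c", "print('[]')"]

calls = 5
start = time.perf_counter()
for _ in range(calls):
    output = call_cli(STAND_IN)
elapsed_ms = (time.perf_counter() - start) * 1000 / calls

print(f"output={output!r}, mean time per call: {elapsed_ms:.1f} ms")
```

Every call pays process startup before any work happens, which is why a resident runtime is the better fit for tight loops.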

Compiling the extraction half (Tier 0)

If your trajectory ends on a page with a stable data shape exposed via a Layer 1 source, the Tier 0 case described above (Atom feed, JSON API, JSON-LD product page, agents.json manifest, or OpenAPI doc), you can compile that half too:

# In a separate step, after the trajectory is forged:
tap forge https://example.com/dashboard
# inspects the destination, finds a Layer 1 source, emits a deterministic
# extraction at 0 LLM tokens. Saves to ~/.tap/taps/example/dashboard-data.tap.json

Compose the two with tap pipe or chain them in your code. For pages without a Layer 1 source, extraction falls back to the AI loop and stays priced; only navigation saves.

Drift detection — the part runtime LLMs can't do

A self-replaying agent doesn't know if the page silently changed underneath it. doctor independently fetches an authoritative source for the page (the same Tier 0 source the extraction uses, or a fingerprint baseline) and diffs against the plan's expected output. It returns one of three verdicts: ok, broken, or stale.

This is the layer that distinguishes Tap from a one-shot replay cache: the diff is computed against an authoritative source, not against the plan's own past output.
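As a rough illustration of diffing against an authoritative source rather than against past output, the sketch below compares replayed rows with rows fetched independently. The row shape, the `id` key, and the `diff_rows` helper are assumptions made for the example; how doctor actually maps differences to its verdicts is not shown here.

```python
def diff_rows(replayed, authoritative, key="id"):
    """Compare replayed rows with independently fetched rows, matched by `key`."""
    replayed_by_key = {row[key]: row for row in replayed}
    source_by_key = {row[key]: row for row in authoritative}

    shared = replayed_by_key.keys() & source_by_key.keys()
    return {
        # Present in the source but absent from the replay.
        "missing": sorted(source_by_key.keys() - replayed_by_key.keys()),
        # Present in the replay but absent from the source.
        "unexpected": sorted(replayed_by_key.keys() - source_by_key.keys()),
        # Present in both, with different contents.
        "changed": sorted(k for k in shared if replayed_by_key[k] != source_by_key[k]),
    }


replayed = [
    {"id": 1, "title": "A"},
    {"id": 2, "title": "B"},
]
authoritative = [
    {"id": 1, "title": "A"},
    {"id": 2, "title": "B (edited)"},
    {"id": 3, "title": "C"},
]

report = diff_rows(replayed, authoritative)
print(report)
```

A cache that only compared the replay with its own previous output would see nothing wrong here, because the replay is internally consistent; the differences only show up against the independent source.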

FAQ

Does this work for write actions (publishing, comments, transfers)?

Yes — forge respects browser-use's intent declarations. Write taps run under tap.run like reads, but they're skipped under tap.doctor (verifying a write would have side effects). Drift in write taps is caught by health-check failures on the read side instead.

What about Stagehand and Anthropic computer-use trajectories?

The trajectory schema in v1 is the browser-use AgentHistoryList shape. Stagehand and raw Anthropic tool-use trajectories are on the roadmap; the navigation primitives overlap, so adding them is mostly a schema mapping. Track issues for progress, or open one to vote.

Is the SDK open-source?

The taprun SDK packages on npm and PyPI are MIT-licensed thin clients. The tap CLI binary they invoke is proprietary. The Chrome extension runtime (github.com/LeonTing1010/tap) and the 65+ community taps in tap-skills are MIT.

What does it cost?

Free tier covers the deterministic compile path (Layer 1 sources), run, doctor, and 65+ community taps. Hacker $9/mo unlocks the full forge pipeline including the AI fallback for Layer 4 (DOM-only) pages, BYOK. Pro $29/mo adds heal (cached patches replay at 0 tokens), refresh, scheduling, and team server. 100% local mode is the default at every tier. Full pricing.

What if my page has no Tier 0 source?

Run tap forge <url> anyway. The forge pipeline falls back through Layer 2 (SSR) → Layer 3 (XHR/fetch) → Layer 4 (DOM). The deeper the layer, the more LLM tokens the compile costs one time, but replay is still deterministic. Layer 4 compile costs real tokens; Layers 1–3 are mostly free.
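The fallback order can be pictured as a chain of probes tried in sequence, stopping at the first one that finds something. The sketch below is a generic illustration of that pattern with hypothetical probe functions; it does not reproduce how the forge pipeline inspects a page.

```python
from typing import Callable, List, Optional, Tuple

# A probe takes a URL and returns a source description, or None if that
# layer has nothing to offer. All probes below are made-up stand-ins.
Probe = Callable[[str], Optional[str]]


def first_usable_layer(url: str, probes: List[Tuple[str, Probe]]) -> Tuple[str, str]:
    """Try each layer in order and return (label, source) for the first hit."""
    for label, probe in probes:
        source = probe(url)
        if source is not None:
            return label, source
    raise LookupError(f"no layer produced a source for {url}")


def no_feed(url: str) -> Optional[str]:
    return None  # pretend the page exposes no feed, JSON-LD, or OpenAPI doc


def no_ssr_data(url: str) -> Optional[str]:
    return None  # pretend the server-rendered HTML carries no embedded data


def xhr_endpoint(url: str) -> Optional[str]:
    return url.rstrip("/") + "/api/items"  # pretend an XHR endpoint was found


def dom_fallback(url: str) -> Optional[str]:
    return "dom"  # last resort: read the rendered DOM


probes = [
    ("Layer 1", no_feed),
    ("Layer 2 (SSR)", no_ssr_data),
    ("Layer 3 (XHR/fetch)", xhr_endpoint),
    ("Layer 4 (DOM)", dom_fallback),
]

layer, source = first_usable_layer("https://example.com/dashboard", probes)
print(layer, source)
```

Ordering the probes from cheapest to most expensive means the costly DOM path is only reached when nothing shallower works.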

How do I know my replay didn't silently break?

Schedule tap doctor on a daily cron, or call it inline before consuming run() output in production. Doctor cross-validates against an authoritative source, so it catches drift that neither you nor your downstream consumer would otherwise notice.
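One way to wire the inline check is a small guard that refuses to return replay output unless the verdict is ok. This sketch assumes doctor returns the string verdicts shown in the quickstart comment; `run_if_healthy` is a hypothetical helper, and the stubs stand in for taprun's run and doctor so the example runs on its own.

```python
def run_if_healthy(tap_id, run, doctor):
    """Call doctor first; only replay when the verdict is 'ok'."""
    verdict = doctor(tap_id)
    if verdict != "ok":
        raise RuntimeError(
            f"{tap_id}: doctor returned {verdict!r}, refusing to use replay output"
        )
    return run(tap_id)


# Stubs standing in for taprun.run and taprun.doctor (illustration only).
def fake_doctor(tap_id):
    return "stale" if tap_id.endswith("/old") else "ok"


def fake_run(tap_id):
    return [{"tap": tap_id}]


rows = run_if_healthy("ycombinator/apply", fake_run, fake_doctor)
print(rows)

try:
    run_if_healthy("example/old", fake_run, fake_doctor)
except RuntimeError as exc:
    blocked = str(exc)
    print(blocked)
```

Passing run and doctor in as arguments keeps the guard easy to test; in real code you would pass the functions imported from taprun.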

Last updated 2026-04-26 against taprun v0.14 (Python) / v0.1.x (TypeScript). Trajectory compile (forge(trajectory=...)) shipped in commit e46c9df; the API is alpha and stabilizes at 0.2 once integration feedback lands.