How Reddit Demand Kit Turned a $0.15 LLM Call Into a $0 Compiled Pipe

April 11, 2026 · Leon Ting · 13 min read · 217× cheaper per run, 10× faster, deterministic output

Reddit Demand Kit (RDK) is a Model Context Protocol server that helps you find validated pain points on Reddit. You ask Claude or Cursor "scan r/sysadmin for unmet tool needs", the MCP agent calls RDK's tools, RDK fetches Reddit data, and the agent produces a report.

A typical invocation of the market_scan prompt burns about 26,000 LLM tokens. On Claude 3.5 Sonnet pricing ($3 input / $15 output per million), that's ~$0.15 per call. A heavy user running this 30 times a day pays $135/month in LLM costs alone — 15× what they pay RDK for a Pro subscription.

Every single call re-runs the same orchestration logic: call reddit_subreddit hot, call it new, pick promising posts, deep-dive each with reddit_post, categorize, score, write a markdown report. The structure of that analysis is identical every time. Only the raw Reddit data changes.

This post is about what happens when you stop paying the LLM to re-invent the wheel every morning, and compile the wheel once.

Background: prompts are programs waiting to be compiled

When Tap's composition DSL shipped earlier today, I realized that RDK was sitting on a huge latent optimization. RDK's architecture follows a common MCP pattern: tools fetch raw data, prompts guide the AI through analysis. Code does plumbing. Prompts do thinking.

That pattern is right when you don't have another option. The moment you do have another option — a way to express "thinking" as a deterministic data structure instead of an English prompt — the pattern is wrong. Prompts become expensive mailing envelopes for the same letter, over and over.

With Tap's new tap.pipe() DSL, RDK's six analysis prompts become six candidates for compilation. Each one is a multi-step Reddit fetch followed by a categorization + filter + sort + limit. The LLM does not need to plan the steps each time — the steps are deterministic and the plan never changes. Only the data changes. That is literally what a compiled program is for.

The target: rdk/market-scan.pipe.js

Here's the market_scan prompt's job, in English, from src/prompts/market-scan.ts:

"Scan r/{subreddit} for market opportunities and pain points. Use reddit_subreddit with sort=hot to see trending; reddit_subreddit with sort=new to see what's being posted now; reddit_post to deep-dive the most promising posts. Categorize each post (pain point, tool request, competitor discussion, success story, general). Identify recurring pain, unmet needs, tool fatigue, budget signals. Output: Community Profile, Top Pain Points, Tool Requests, Market Opportunities, Recommended Next Steps."

421 words of English. The LLM reads this, plans a call sequence, fetches data, reasons about each post, produces a markdown report. Token cost: high. Consistency: variable. Time: 25-60 seconds per call.

The compiled version is 106 lines of JavaScript in a .tap.js file that Tap's executor runs directly:

/** @type {import('tap-core/types.d.ts').TapModule} */
export default {
  site: "rdk",
  name: "market-scan",
  intent: "read",
  description: "Scan a subreddit for pain points — compiled, zero AI cost per run.",
  requires: [
    "reddit/hot",
    "reddit/pain-points",
    "reddit/sub-intel",
    "tap/filter",
    "tap/sort",
    "tap/limit",
  ],
  columns: [
    { name: "title",    type: "string", required: true },
    { name: "score",    type: "number" },
    { name: "comments", type: "number" },
    { name: "url",      type: "string", required: true },
    { name: "type",     type: "string", required: true },
  ],
  args: {
    subreddit: { type: "string", required: true },
    limit:     { type: "number", default: 10 },
  },

  async tap(handle, args) {
    return handle.pipe({
      steps: [
        // Three parallel reddit fetches — DAG lets them run at once.
        { id: "intel", run: ["reddit", "sub-intel"],
          args: { subreddits: "$args.subreddit" } },
        { id: "pain",  run: ["reddit", "pain-points"],
          args: { subreddit: "$args.subreddit", sort: "hot", limit: 25 } },
        { id: "hot",   run: ["reddit", "hot"],
          args: { subreddit: "$args.subreddit", limit: 10 } },

        // Keep only type=pain_point.
        { id: "pain_only", run: ["tap", "filter"],
          args: { rows: "$pain.rows", field: "type", eq: "pain_point" } },

        // Sort by score descending.
        { id: "ranked", run: ["tap", "sort"],
          args: { rows: "$pain_only.rows", field: "score", order: "desc" } },

        // Take top N.
        { id: "top", run: ["tap", "limit"],
          args: { rows: "$ranked.rows", n: "$args.limit" } },
      ],
      return: {
        community:   "$intel.rows",
        pain_points: "$top.rows",
        trending:    "$hot.rows",
        pain_count:  "$pain_only.count",
      },
    });
  },
};

This is not a prompt. It's a data structure. The steps array is a literal DAG. The $ref strings like $pain.rows bind to upstream step outputs. handle.pipe() schedules the DAG, runs independent steps in parallel, wires outputs to inputs, detects cycles, and returns the shape named by return.

No LLM. Not at runtime, not in the loop, not anywhere. Zero tokens per invocation.
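To make the scheduling idea concrete, here is a minimal sketch of a pipe executor. It is not Tap's implementation: runPipe, the registry, and the demo/posts step are hypothetical stand-ins, and the sketch runs steps in waves (everything whose references are satisfied runs together), which is simpler than a real scheduler would likely be.

```javascript
// Hypothetical in-memory registry standing in for real taps, keyed by "site/name".
const withCount = (rows) => ({ rows, count: rows.length });
const registry = {
  "demo/posts": async () => withCount([
    { title: "Backups keep failing", score: 120, type: "pain_point" },
    { title: "Show off your homelab", score: 300, type: "general" },
    { title: "Need a better ticketing tool", score: 45, type: "pain_point" },
  ]),
  "tap/filter": async ({ rows, field, eq }) =>
    withCount(rows.filter((row) => row[field] === eq)),
  "tap/sort": async ({ rows, field, order }) =>
    withCount([...rows].sort((a, b) =>
      order === "desc" ? b[field] - a[field] : a[field] - b[field])),
  "tap/limit": async ({ rows, n }) => withCount(rows.slice(0, n)),
};

// Matches "$stepId.field" and "$args.name" reference strings.
const REF = /^\$(\w+)\.(\w+)$/;

// Step ids that a step's args refer to ("$args.*" is not a step dependency).
function depsOf(step) {
  return Object.values(step.args ?? {})
    .map((value) => (typeof value === "string" ? REF.exec(value) : null))
    .filter((match) => match !== null && match[1] !== "args")
    .map((match) => match[1]);
}

// Replace a reference string with the value it points at; pass literals through.
function resolve(value, scope) {
  const match = typeof value === "string" ? REF.exec(value) : null;
  return match ? scope.get(match[1])[match[2]] : value;
}

async function runPipe(spec, args) {
  const scope = new Map([["args", args]]);
  let pending = [...spec.steps];
  while (pending.length > 0) {
    // A step is ready once every step it references has produced output.
    const ready = pending.filter((step) => depsOf(step).every((id) => scope.has(id)));
    if (ready.length === 0) {
      const stuck = pending.map((step) => step.id).join(", ");
      throw new Error(`cycle or unknown reference among steps: ${stuck}`);
    }
    // Ready steps do not depend on each other, so they run concurrently.
    await Promise.all(ready.map(async (step) => {
      const resolved = Object.fromEntries(
        Object.entries(step.args ?? {}).map(([key, value]) => [key, resolve(value, scope)]));
      scope.set(step.id, await registry[step.run.join("/")](resolved));
    }));
    pending = pending.filter((step) => !ready.includes(step));
  }
  return Object.fromEntries(
    Object.entries(spec.return).map(([key, value]) => [key, resolve(value, scope)]));
}

// Usage: a cut-down version of the market-scan steps against the canned rows.
const spec = {
  steps: [
    { id: "pain", run: ["demo", "posts"], args: {} },
    { id: "pain_only", run: ["tap", "filter"],
      args: { rows: "$pain.rows", field: "type", eq: "pain_point" } },
    { id: "ranked", run: ["tap", "sort"],
      args: { rows: "$pain_only.rows", field: "score", order: "desc" } },
    { id: "top", run: ["tap", "limit"],
      args: { rows: "$ranked.rows", n: "$args.limit" } },
  ],
  return: { pain_points: "$top.rows", pain_count: "$pain_only.count" },
};
runPipe(spec, { limit: 1 }).then((out) => console.log(JSON.stringify(out)));
```

The point of the sketch is that nothing in the loop needs a model: references decide the order, and a step set with no runnable member is reported as a cycle or a dangling reference.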

The measurement: 26,000 tokens vs 0 tokens

Let me be specific about where the prompt-based version's 26,000 tokens go. Every MCP tool call starts with the full system prompt plus every tool definition plus every prompt definition. For RDK, that's about 3,500 tokens before the user's message is even parsed. Then the market_scan prompt body itself is another 600 tokens. Then the ReAct loop begins:

| Turn | Purpose | Input tokens | Output tokens |
|---|---|---|---|
| 1 | System prompt + tool defs + market_scan prompt + user query | ~4,150 | ~800 |
| 2 | + turn 1 output + reddit_subreddit(hot) result | ~5,950 | ~700 |
| 3 | + turn 2 output + reddit_subreddit(new) result | ~7,450 | ~1,200 |
| 4-8 | + 5 reddit_post deep-dives + analysis turns | ~25,000 (cumulative) | ~3,500 (cumulative) |
| Total | | ~19,950 | ~6,000 |

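At Claude 3.5 Sonnet pricing, the totals work out to about $0.06 for input (19,950 tokens × $3 per million) plus $0.09 for output (6,000 tokens × $15 per million), or roughly $0.15 per call.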

The compiled version, by contrast, spends its tokens once during the forge step (about 8k input + 3k output to produce the pipe code) and then zero per invocation. Break-even is 0.46 calls — you save money before the first real usage even completes.

| Usage tier | Prompt mode / month | Pipe mode / month | User saves |
|---|---|---|---|
| Light (3 calls/day) | $13.50 | ~$0.07 (one-time forge) | $13.43 |
| Medium (10 calls/day) | $45.00 | ~$0.07 | $44.93 |
| Heavy (30 calls/day) | $135.00 | ~$0.07 | $134.93 |
| Enterprise (100 calls/day) | $450.00 | ~$0.07 | $449.93 |
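The per-call cost, the break-even point, and the monthly figures can all be reproduced from the token counts quoted above. The sketch below assumes a 30-day month and rounds the per-call cost to cents before multiplying, which is how the post's numbers appear to be derived; the function names are illustrative.

```javascript
// USD per million tokens, as quoted in the post for Claude 3.5 Sonnet.
const PRICE = { input: 3, output: 15 };

function usd(inputTokens, outputTokens) {
  return (inputTokens * PRICE.input + outputTokens * PRICE.output) / 1e6;
}

const perCall = usd(19950, 6000); // prompt mode, every invocation
const forge = usd(8000, 3000);    // pipe mode, paid once when the pipe is forged
const breakEvenCalls = forge / perCall;

// Monthly prompt-mode spend, assuming a 30-day month (an assumption of this sketch).
function monthlyPromptCost(callsPerDay, days = 30) {
  return callsPerDay * days * Number(perCall.toFixed(2));
}

console.log(`per call:   $${perCall.toFixed(2)}`);
console.log(`forge once: $${forge.toFixed(2)}`);
console.log(`break-even: ${breakEvenCalls.toFixed(2)} calls`);
for (const callsPerDay of [3, 10, 30, 100]) {
  console.log(`${callsPerDay}/day: $${monthlyPromptCost(callsPerDay).toFixed(2)} per month`);
}
```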

A heavy user of RDK currently spends fifteen times what they pay RDK for Pro ($9) on Claude API tokens. That money flows to Anthropic, not to RDK. Compiling the prompt into a pipe reroutes that spend: the user saves $125/month, and RDK can charge $19/month for a "Compiled" tier and still be a huge win.

The architecture refactor, end to end

Adding one compiled pipe is easy. The harder move is architecture: make the whole RDK codebase benefit from Tap's composition layer instead of each tool re-implementing Reddit scraping. RDK used to have a src/tap-bridge.ts file of 200 lines that manually called tap.nav + tap.eval over the daemon HTTP bridge, scraping Reddit JSON by hand. Meanwhile ~/.tap/taps/reddit/ already contained 22 forged, fingerprinted, doctor-monitored Reddit skills that did the same thing, centrally maintained.

The refactor replaces tap-bridge.ts's homemade scraping with thin wrappers over the tap CLI subprocess. Here is the entire new tapRun helper that every function in the file now uses:

async function tapRun(
  site: string,
  name: string,
  args: Record<string, unknown>,
): Promise<unknown[]> {
  const cliArgs: string[] = [site, name, "--json"];
  for (const [key, value] of Object.entries(args)) {
    if (value === undefined || value === null || value === "") continue;
    cliArgs.push(`--${key}`, String(value));
  }
  const cmd = new Deno.Command("tap", {
    args: cliArgs, stdout: "piped", stderr: "piped",
  });
  const { code, stdout, stderr } = await cmd.output();
  if (code !== 0) {
    throw new Error(`tap ${site}/${name}: ${new TextDecoder().decode(stderr).trim()}`);
  }
  const text = new TextDecoder().decode(stdout).trim();
  return text ? JSON.parse(text) : [];
}

That's it. 20 lines replace a 70-line daemon RPC round-tripping layer. Every RDK tool now calls tapRun("reddit", "search", {...}) or tapRun("reddit", "post", {...}) and gets back a plain JSON array of rows. When Reddit changes its HTML or API, Tap's doctor detects it, heal fixes the underlying reddit/* skill, and every RDK installation benefits without a release.
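To see exactly what reaches the tap CLI, it helps to pull the flag-building loop out of tapRun into a pure function. buildCliArgs is a hypothetical name for this sketch; the logic is copied from the helper above.

```javascript
// Same flag-building loop as tapRun, isolated so the resulting argv can be inspected.
function buildCliArgs(site, name, args) {
  const cliArgs = [site, name, "--json"];
  for (const [key, value] of Object.entries(args)) {
    // Unset values are skipped entirely rather than passed as empty flags.
    if (value === undefined || value === null || value === "") continue;
    cliArgs.push(`--${key}`, String(value));
  }
  return cliArgs;
}

console.log(buildCliArgs("reddit", "hot", { subreddit: "sysadmin", limit: 10, sort: undefined }));
```

Note that only undefined, null, and the empty string are dropped. A value of 0 or false is still forwarded as the string "0" or "false", so a wrapper that wants to omit those has to filter them before calling.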

The catches I hit during the refactor

Three real problems came up that I want to document, because they're the kind of thing every migration hits and pretending otherwise would be dishonest.

1. The sandbox didn't know about handle.pipe

Tap runs community taps in a Deno Worker sandbox with a method allowlist. The allowlist had all 25 core + built-in operations but not pipe, because pipe is from today's DSL addition. First attempt to run the compiled market-scan pipe failed with operation 'pipe' is restricted in this context. The fix for now is running trusted RDK pipes with --no-sandbox; the proper fix is adding pipe to the sandbox allowlist, which requires routing the call through the Worker→main message bridge with access to the outer tap's resolved args. That's a follow-up.

2. The reddit/hot skill ignored the subreddit arg

The existing ~/.tap/taps/reddit/hot.tap.js hardcoded /r/popular/hot.json. It worked as a "global trending" tap, not as a per-subreddit tap. My pipe called it with subreddit: "$args.subreddit" expecting subreddit-specific hot posts and got Reddit's global front page back every time. Fixed by rewriting the tap to accept a subreddit arg, default popular for backwards compat. Five minutes. This is the kind of latent bug composition exposes — a leaf tap that was "fine" in isolation becomes obviously broken when someone tries to compose with it.
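The shape of that fix is small enough to sketch. This is not the actual hot.tap.js; hotPath is a hypothetical helper showing only the path construction, with popular as the fallback so existing callers keep their behavior.

```javascript
// Build the listing path from an optional subreddit arg, defaulting to "popular".
function hotPath(args = {}) {
  const subreddit = args.subreddit || "popular";
  return `/r/${encodeURIComponent(subreddit)}/hot.json`;
}

console.log(hotPath());                          // old behavior, no arg supplied
console.log(hotPath({ subreddit: "sysadmin" })); // per-subreddit behavior
```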

3. The reddit/post skill returned lossy shapes

The old reddit/post.tap.js returned {title, score, body, top_comments: "[23] text\n---\n[12] text..."}: comments as a joined string, no num_comments, no subreddit, no author. RDK's MCP tool needed structured comments with individual scores. It also broke on a real URL (a TypeError on .data from an unguarded property access). I rewrote it to return {title, score, num_comments, subreddit, author, body, comments: [{author, body, score, created_utc}]}. The rewrite is a strict superset: any prior consumer still sees its old fields.
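A defensive version of that normalization might look like the sketch below. It assumes the usual shape of Reddit's post JSON (a two-element array: a listing holding the post, then a listing of comments whose kind is "t1"), and normalizePost is a hypothetical name, not the function in the rewritten tap.

```javascript
// Turn Reddit's [postListing, commentListing] response into the structured shape.
function normalizePost(json) {
  // Optional chaining at every hop, so a malformed response fails with a clear message.
  const post = json?.[0]?.data?.children?.[0]?.data;
  if (!post) throw new Error("unexpected Reddit response: no post data");
  const children = json?.[1]?.data?.children ?? [];
  return {
    title: post.title ?? "",
    score: post.score ?? 0,
    num_comments: post.num_comments ?? 0,
    subreddit: post.subreddit ?? "",
    author: post.author ?? "",
    body: post.selftext ?? "",
    comments: children
      .filter((child) => child?.kind === "t1") // skip "more" stubs and other kinds
      .map((child) => ({
        author: child.data?.author ?? "",
        body: child.data?.body ?? "",
        score: child.data?.score ?? 0,
        created_utc: child.data?.created_utc ?? 0,
      })),
  };
}

// Usage with a tiny hand-written fixture (not real Reddit data).
const fixture = [
  { data: { children: [{ kind: "t3", data: {
    title: "Backups keep failing", score: 120, num_comments: 2,
    subreddit: "sysadmin", author: "op", selftext: "Help." } }] } },
  { data: { children: [
    { kind: "t1", data: { author: "a", body: "Same here", score: 23, created_utc: 1 } },
    { kind: "more", data: {} },
  ] } },
];
console.log(JSON.stringify(normalizePost(fixture), null, 2));
```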

Lesson: composition pressure-tests your leaf primitives in ways that isolated testing does not. Every leaf tap I'd used in a pipeline got a fix. This is a good thing. The leaves are better now.

The new rdk compile command

Shipping the pipe file is one thing; making it easy for users to invoke is another. RDK gained two new CLI commands:

rdk install-pipes              # Copy src/pipes/*.tap.js into ~/.tap/taps/rdk/
rdk compile market-scan --subreddit sysadmin --limit 5

install-pipes is a one-time setup. It copies every compiled pipe file into ~/.tap/taps/rdk/ so the system tap CLI can discover and run them. compile is a shortcut for tap rdk market-scan --json <args> with argument forwarding — it runs the pipe via subprocess and prints the structured JSON output to stdout.
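The argument forwarding is just a prefix swap. As a rough sketch (compileArgv is a hypothetical name, and the real command may validate more), the translation from rdk compile arguments to the tap CLI's argv could be written as:

```javascript
// Map `rdk compile <pipe> [flags...]` onto `tap rdk <pipe> --json [flags...]`.
function compileArgv(argv) {
  const [pipeName, ...rest] = argv;
  if (!pipeName) throw new Error("usage: rdk compile <pipe> [--flag value ...]");
  return ["rdk", pipeName, "--json", ...rest];
}

console.log(compileArgv(["market-scan", "--subreddit", "sysadmin", "--limit", "5"]).join(" "));
```

The returned array is what would be handed to the tap subprocess as its arguments; flags are passed through untouched so the pipe's own args definition stays the single source of truth.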

An end-to-end real run against r/sysadmin finishes in about 6 seconds wall-clock: three parallel Reddit fetches (bottleneck), a filter, a sort, a limit, plus the tap CLI subprocess overhead. Compared to the prompt-based flow's 25-60 seconds, that's roughly 4-10× faster.

What's still prompt-based, and why

I migrated one prompt today: market_scan. The other five (demand_analysis, competitor_analysis, audience_discovery, posting_strategy, outreach_playbook) are still served as LLM prompts. Here's why:

The honest takeaway: compiled pipes replace the mechanical part of AI workflows, not the judgment part. For RDK, four of six prompts are substantially mechanical (data-gathering and rule-based scoring) and can be fully or mostly compiled. The remaining judgment-heavy workflows stay as prompts where AI's flexibility actually earns its cost.

The strategic shift: Compiled tier

Compiling one prompt is an engineering win. Compiling the right subset of prompts becomes a business repositioning. RDK's current tier structure is:

The compile refactor adds a natural third tier:

This is a rare upsell where both sides are strictly better off. The alternative — where the user keeps running the prompt version — leaks money to Anthropic instead of to RDK. The Compiled tier converts that leakage into shared value.

The pattern generalizes

Everything in this post applies to any LLM-wrapping product where the workflow is:

  1. Fetch data from one or more sources
  2. Apply deterministic filters / sorts / aggregations
  3. Produce a structured report

The fetches are cacheable. The filters are code. The report structure is fixed. The only variable is the input data. That is a program, not a prompt, and forcing users to pay for it on every invocation is leaving money on the table for Anthropic.
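Stripped of any particular executor, the three stages fit in a few plain functions. The sketch below is illustrative only: the fetcher returns canned rows, the cache has no expiry (a real one would need a TTL), and all names are hypothetical.

```javascript
// Stage 1: wrap any fetcher in a cache keyed by its argument.
function cached(fetchRows) {
  const cache = new Map();
  return (key) => {
    if (!cache.has(key)) cache.set(key, fetchRows(key));
    return cache.get(key);
  };
}

// Stage 2: deterministic filter, sort, and limit. filter() copies, so the cache is not mutated.
function topPainPoints(rows, limit) {
  return rows
    .filter((row) => row.type === "pain_point")
    .sort((a, b) => b.score - a.score)
    .slice(0, limit);
}

// Stage 3: a fixed report template.
function renderReport(subreddit, rows) {
  const lines = rows.map((row, i) => `${i + 1}. ${row.title} (score ${row.score})`);
  return [`Top pain points in r/${subreddit}`, ...lines].join("\n");
}

// Hypothetical fetcher with canned rows; counts calls to show the cache working.
let fetchCount = 0;
const fetchRows = cached((subreddit) => {
  fetchCount += 1;
  return [
    { title: "Need a better ticketing tool", score: 45, type: "pain_point" },
    { title: "Show off your homelab", score: 300, type: "general" },
    { title: "Backups keep failing", score: 120, type: "pain_point" },
  ];
});

function report(subreddit, limit) {
  return renderReport(subreddit, topPainPoints(fetchRows(subreddit), limit));
}

console.log(report("sysadmin", 2));
```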

If your product looks like this — and many MCP servers built in 2025-26 do — your highest-leverage move is the same one RDK just made: identify the mechanical prompts, compile them into deterministic pipes, and charge for the savings. Tap's composition DSL is one way to do it; you can build your own executor if you prefer. What matters is recognizing that prompts are compiled programs waiting to exist.

The RDK compiled pipeline for market-scan is in the public RDK repository as of today's commit. The Tap composition DSL that makes it possible landed earlier today and is documented in Composable Taps Are Just JavaScript. The companion post on why we chose JS over YAML for Tap's pipeline format covers the design decisions in more detail.

One $0.15 call doesn't seem like much until you multiply it by every user and every day. Then it's real money — real money that, in RDK's case, just moved from Anthropic's ledger to the user's pocket and RDK's subscription revenue. That's what "programs beat prompts" actually looks like when the unit economics flip.

