Almost every pipeline-as-code system in modern DevOps is YAML. GitHub Actions, CircleCI, GitLab CI, Kubernetes, Helm, Argo, Tekton, Azure Pipelines: the list is long, and the exceptions (Jenkins' Groovy-based declarative pipelines, Airflow's Python DAGs, n8n's JSON workflows) are few. When we sat down to design Tap's composition layer (how one tap calls another tap and chains the results together), the industry answer was obvious: a new format, call it .pipe.yaml, with steps:, run:, and args: keys.
I spent a day writing up that proposal. And then a user asked me the question I should have asked myself first: why not just use JavaScript?
This is the post I should have written before proposing YAML. It's the case against a YAML pipeline DSL for Tap, the case for JavaScript object literals, and the migration of a real Tap composite (bounty/match) that proves the approach works in production.
YAML advocates cite five benefits. Each one turns out to be false — or worse, achievable via a strictly better route — once you look at it.
The strongest argument for YAML is that it's not a programming language. A YAML file can't do arbitrary computation, so it can't contain bugs beyond the structural ones the schema validator catches. This is genuinely valuable when your pipeline is authored by marketers and PMs who don't code, like in Zapier or Make.
But look at what every mature YAML pipeline system has actually done over time:
v1  simple steps with names and run commands
v2  add `if:` conditionals
v3  add `${{ }}` expression interpolation
v4  add `matrix:` for fan-out
v5  add `parallel:` + `needs:` for DAG scheduling
v6  add built-in functions like `fromJson()`, `contains()`, `hashFiles()`
v7  add `continue-on-error:`, `retry:`, `timeout-minutes:`
v8  add `outputs:` for cross-job value passing
v9  add reusable workflows + composite actions + anchors
Congratulations, you invented JavaScript — with worse syntax, no debugger, no type checker, and no LSP. GitHub Actions' current expression language has this in its docs as real syntax:
if: ${{ ... }}
This is a programming language. A bad one. The YAML escape hatch doesn't limit blast radius — it just forces every new feature to be expressed in a worse notation.
The second claim, that YAML diffs more cleanly in code review, is just false. A JavaScript object literal diffs exactly as well as YAML:
- { id: "filter", run: ["tap", "filter"], args: { rows: "$hot.rows", gt: 10 } },
+ { id: "filter", run: ["tap", "filter"], args: { rows: "$hot.rows", gt: 50 } },
This is a perfectly readable, reviewable git diff. The YAML equivalent is one line of value: 10 → value: 50. Neither is meaningfully better than the other. And when the refactor is bigger — renaming a field across six steps, moving a step from position 2 to position 5, reorganizing the DAG — the YAML diff gets worse because whitespace-significant indentation doubles the noise.
The third claim, that LLMs generate YAML more reliably than code, is the most backwards. Consider the training data:
A .pipe.yaml DSL, which doesn't exist yet: zero tokens.
Claude and GPT-5 generate JavaScript object literals with the same fluency they generate English. They generate custom YAML DSLs by cargo-culting the closest thing they've seen, which is usually wrong. If your goal is AI-authored pipelines, the format to pick is the one the LLM already knows in its bones, which for code is always going to be JavaScript.
JSON Schema can validate YAML at parse time, catching typos in field names and type mismatches before the pipeline runs. This is real — but TypeScript does strictly more.
A TypeScript interface catches the same typos, produces better error messages (with source location, type hover, and autocomplete), gets refactored across the codebase for free, and lets you compose types in ways JSON Schema can't (generics, unions, mapped types, conditional types). The tradeoff isn't "validation vs. no validation" — it's "schema validation vs. a much stronger type system that also validates."
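To make the comparison concrete, here is a minimal sketch of that kind of check done with JSDoc types in plain JavaScript. The PipeStep and Pipe shapes are assumptions for illustration, not Tap's actual types.d.ts. With `// @ts-check` enabled, an editor running the TypeScript language service flags a misspelled key inside the object literal.

```javascript
// @ts-check

/**
 * Hypothetical shapes for illustration; Tap's real types.d.ts may differ.
 * @typedef {{ id: string, run: [string, string], args?: Record<string, unknown> }} PipeStep
 * @typedef {{ steps: PipeStep[], return: string }} Pipe
 */

/** @type {Pipe} */
const typedPipeline = {
  steps: [
    // Writing `arg:` instead of `args:` on this step would be reported
    // by the type checker as an unknown property on PipeStep.
    { id: "hot", run: ["reddit", "search"], args: { sort: "hot" } },
  ],
  return: "$hot.rows",
};

console.log(typedPipeline.steps.length);
```

The file stays ordinary JavaScript: the annotations are comments, so nothing changes at runtime and no build step is needed.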
A JavaScript object literal is also declarative. Watch:
const pipeline = {
  steps: [
    { id: "hot", run: ["reddit", "search"], args: { subreddit: "webscraping", sort: "hot" } },
    { id: "top", run: ["tap", "filter"], args: { rows: "$hot.rows", field: "score", op: "gt", value: 100 } },
    { id: "sum", run: ["reddit", "insights"], args: { posts: "$top.rows" } },
  ],
  return: "$sum.rows",
};
This is data, not code. It has no control flow. It has no side effects. It can be serialized with JSON.stringify, diffed, versioned, persisted, analyzed, and re-loaded. You can write a visualizer that renders it as a DAG without running it. An AI agent can generate, mutate, and heal it the same way it would a YAML file.
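As a rough sketch of what analyzing a pipe without running it can look like, the helpers below walk the object and list its dependency edges. The names collectRefs and pipeEdges are hypothetical, and the rule that a string starting with `$name` refers to the step with id `name` is inferred from the example above rather than taken from Tap's source.

```javascript
// Hypothetical helper: collect the roots of every "$root.path" string in a value.
function collectRefs(value, found = new Set()) {
  if (typeof value === "string") {
    const match = /^\$([A-Za-z_][\w-]*)/.exec(value);
    if (match) found.add(match[1]);
  } else if (value !== null && typeof value === "object") {
    // Object.values covers both plain objects and arrays.
    for (const item of Object.values(value)) collectRefs(item, found);
  }
  return found;
}

// Build [from, to] edges between steps by reading the data, never executing it.
function pipeEdges(pipe) {
  const stepIds = new Set(pipe.steps.map((step) => step.id));
  const edges = [];
  for (const step of pipe.steps) {
    for (const ref of collectRefs(step.args)) {
      // Roots that are not step ids (such as "$args") are not edges.
      if (stepIds.has(ref)) edges.push([ref, step.id]);
    }
  }
  return edges;
}

const examplePipe = {
  steps: [
    { id: "hot", run: ["reddit", "search"], args: { subreddit: "webscraping", sort: "hot" } },
    { id: "top", run: ["tap", "filter"], args: { rows: "$hot.rows", field: "score", op: "gt", value: 100 } },
    { id: "sum", run: ["reddit", "insights"], args: { posts: "$top.rows" } },
  ],
  return: "$sum.rows",
};

console.log(pipeEdges(examplePipe)); // [ [ 'hot', 'top' ], [ 'top', 'sum' ] ]
```

The same edge list is enough to feed a graph renderer or to check for cycles before anything touches the network.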
"Declarative" is a property of how you use the format, not of the format itself. JavaScript used as a DSL is declarative. YAML used as a programming language (see argument 1) is not. The format is the wrong axis to judge on.
Here's a real composite tap from Tap's skill library, before and after the refactor. This is bounty/match — a pipeline that fetches bounties from Algora, IssueHunt, and HackerOne, scores them by opportunity quality (competition, freshness, maintainer activity), and returns the top-N.
export default {
  site: "bounty",
  name: "match",
  intent: "read",
  description: "Composable bounty pipeline",
  args: {
    min: { type: "number", default: 100 },
    platform: { type: "string", default: "" },
    max_comments: { type: "number", default: 15 },
    max_days: { type: "number", default: 60 },
    limit: { type: "number", default: 10 },
  },
  async tap(tap, args) {
    return await tap.run("bounty", "score", {
      source_site: "bounty",
      source_name: "all",
      source_args: JSON.stringify({
        min: args.min,
        platform: args.platform,
      }),
      max_comments: args.max_comments,
      max_days: args.max_days,
      limit: args.limit,
    });
  },
};
This works, but notice what it's doing. bounty/match is passing the name of another tap (source_site: "bounty", source_name: "all") as a string argument to bounty/score, along with JSON-serialized inner args (source_args: JSON.stringify({...})). bounty/score then receives these strings, parses them, and calls tap.run(source_site, source_name, parsed_args) internally.
This is composition by convention — the entire pipeline DAG is hidden inside a JSON-string parameter. You cannot see from the file which sub-taps actually run. You cannot statically check that source_site: "bounty" and source_name: "all" correspond to an existing sub-tap. You cannot introspect or visualize the flow. If bounty/all gets renamed or removed, this breaks at runtime with a "tap not found" error that looks unrelated to bounty/match.
/// <reference path="../../../Documents/tap-core/types.d.ts" />
/** @type {import('../../../Documents/tap-core/types.d.ts').TapModule} */
export default {
  site: "bounty",
  name: "match",
  intent: "read",
  description: "Pipeline: bounty/all → bounty/score → top-N by opportunity score",
  requires: ["bounty/all", "bounty/score"],
  columns: [
    { name: "amount", type: "string", required: true },
    { name: "title", type: "string", required: true },
    { name: "platform", type: "string" },
    { name: "link", type: "string", required: true },
    { name: "score", type: "string", required: true },
    { name: "days_old", type: "string" },
    { name: "comments", type: "string" },
  ],
  health: { min_rows: 1, non_empty: ["amount", "title", "link", "score"] },
  args: {
    min: { type: "number", default: 100 },
    platform: { type: "string", default: "" },
    max_comments: { type: "number", default: 15 },
    max_days: { type: "number", default: 60 },
    limit: { type: "number", default: 10 },
  },
  async tap(handle, args) {
    return handle.pipe({
      steps: [
        {
          id: "all",
          run: ["bounty", "all"],
          args: { min: "$args.min", platform: "$args.platform" },
        },
        {
          id: "scored",
          run: ["bounty", "score"],
          args: {
            rows: "$all.rows",
            max_comments: "$args.max_comments",
            max_days: "$args.max_days",
            limit: "$args.limit",
          },
        },
      ],
      return: "$scored.rows",
    });
  },
};
Eleven lines became forty (if you count the structured column metadata and the JSDoc reference path). The file got bigger, not smaller. And yet every structural property improved:
bounty/all runs first, then bounty/score receives its rows. That's in the file, readable top-to-bottom, no reverse engineering needed.
requires[] is now declared. When someone runs capture (save) on a future version of this tap, Tap will verify that bounty/all and bounty/score both exist before writing the file. A typo would fail save, not silently pass until runtime.
columns: ["amount", "title", ...], a bare string array with no type info. The new form declares each column as { name, type, required }, so downstream pipes can validate their input contract at load time.
@type {import(...)} (VS Code, Cursor, WebStorm, neovim with LSP) gets autocomplete on handle.pipe(...), on every field of the pipe object, on every sub-tap argument. Zero configuration, works out of the box.
$args. The pipe doesn't need a closure over the outer args variable. References like $args.min get resolved by the executor at step launch time, which means the whole pipe is a plain data structure that survives JSON.stringify and can be analyzed without executing anything.
$all.rows, the executor sees that both step 2 and step 3 have no dependency on each other and runs them in parallel. You don't have to think about it.
github/trending with the same args, the run-scoped cache returns the same result to both, hitting GitHub exactly once.

One thing the refactor required was updating bounty/score to accept its input as a direct rows[] array, instead of the old indirect source_site / source_name / source_args pattern. The new version accepts both, with rows preferred and the legacy path kept alive for backwards compatibility:
async tap(tap, args) {
  let bounties;
  if (Array.isArray(args.rows)) {
    // New path: pre-fetched rows from tap.pipe composition
    bounties = args.rows;
  } else if (args.source_site && args.source_name) {
    // Legacy path: fetch from an indirect source tap
    const sourceArgs = JSON.parse(args.source_args || "{}");
    bounties = await tap.run(args.source_site, args.source_name, sourceArgs);
  } else {
    throw new Error("bounty/score: pass rows[] or source_site + source_name");
  }
  // ... scoring logic unchanged ...
}
This is the right pattern for any composite-style sub-tap: accept rows[] as the preferred input path, keep the old indirect path working so existing callers don't break, and migrate callers one at a time. After every caller has moved, the legacy path can be removed in a future major version.
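One way to gain confidence while callers migrate is to exercise both input paths against a stub handle. The sketch below reduces the branching above to its row-loading part. loadRows and makeStubTap are hypothetical names, and the stub assumes tap.run resolves to an array of rows, as the legacy branch above implies.

```javascript
// Same dual-input branching as bounty/score above, reduced to row loading.
async function loadRows(tap, args) {
  if (Array.isArray(args.rows)) return args.rows; // new path: rows passed directly
  if (args.source_site && args.source_name) {
    // Legacy path: fetch from the indirect source tap.
    const sourceArgs = JSON.parse(args.source_args || "{}");
    return tap.run(args.source_site, args.source_name, sourceArgs);
  }
  throw new Error("pass rows[] or source_site + source_name");
}

// Hypothetical stub handle: records run() calls instead of hitting a live API.
function makeStubTap(rows) {
  const calls = [];
  return {
    calls,
    async run(site, name, args) {
      calls.push({ site, name, args });
      return rows;
    },
  };
}

async function demo() {
  const rows = [{ title: "fix flaky test", amount: "250" }];
  const stub = makeStubTap(rows);
  const viaRows = await loadRows(stub, { rows });
  const viaLegacy = await loadRows(stub, {
    source_site: "bounty",
    source_name: "all",
    source_args: JSON.stringify({ min: 100 }),
  });
  // Only the legacy call should have reached tap.run.
  return { viaRows, viaLegacy, calls: stub.calls };
}

demo().then((result) => console.log(result.calls.length)); // logs 1
```

Once the recorded call count for a given caller drops to zero in production-like tests, that caller no longer depends on the legacy path.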
Everything above landed in a six-step implementation cycle that I'd recommend for anyone building their own composition layer:
columns: string[] with columns: ColumnSchema[]. Backwards compatible. This is the type foundation every subsequent step builds on.
tap.pipe() executor (2 days). DAG analysis, topological sort, parallel scheduling, $ref resolution, clear error messages with step IDs. 180 lines of Deno TypeScript.
requires: string[] on the tap module plus capture (save)-time validation that every listed sub-tap exists.
types.d.ts (half day). A single declaration file covering Pipe, PipeStep, TapHandle, TapModule, ColumnSchema, and the rest of the DSL surface. Drift-detected against src/ by a test that compares field names via regex extraction.
bounty/match as proof-of-concept. Test with mocked bounty/all and bounty/score to verify the pipeline runs end-to-end without hitting live APIs.

Total: about five days of focused implementation, plus the test suite grew from 580 passing tests to 621: 41 new tests across 5 new test files, all red-green-refactored. Zero regression. Every existing composite tap continues to work without modification.
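For readers building their own layer, the executor step above (dependency analysis, parallel scheduling, $ref resolution, errors that name the step) might be sketched roughly as follows. This is not Tap's executor. Every name here is hypothetical, runTap(site, name, args) is assumed to resolve to an array of rows, and each step's result is assumed to be exposed as { rows }.

```javascript
// Resolve "$root.path" strings against a scope such as { args, all: { rows } }.
// In this sketch, any string starting with "$" is treated as a reference.
function resolveRefs(value, scope) {
  if (typeof value === "string" && value.startsWith("$")) {
    const [root, ...path] = value.slice(1).split(".");
    if (!(root in scope)) throw new Error(`unknown reference "${value}"`);
    return path.reduce((current, key) => current?.[key], scope[root]);
  }
  if (Array.isArray(value)) return value.map((item) => resolveRefs(item, scope));
  if (value !== null && typeof value === "object") {
    return Object.fromEntries(
      Object.entries(value).map(([key, item]) => [key, resolveRefs(item, scope)]),
    );
  }
  return value;
}

// Collect the ids of other steps that a step's args refer to.
function stepDeps(step, stepIds) {
  const deps = new Set();
  const visit = (value) => {
    if (typeof value === "string" && value.startsWith("$")) {
      const root = value.slice(1).split(".")[0];
      if (stepIds.has(root)) deps.add(root);
    } else if (value !== null && typeof value === "object") {
      Object.values(value).forEach(visit);
    }
  };
  visit(step.args ?? {});
  return [...deps];
}

async function runPipe(pipe, args, runTap) {
  const stepsById = new Map(pipe.steps.map((step) => [step.id, step]));
  const stepIds = new Set(stepsById.keys());
  const running = new Map(); // step id -> Promise<{ rows }>

  function start(id, trail) {
    if (trail.includes(id)) throw new Error(`cycle: ${[...trail, id].join(" -> ")}`);
    if (running.has(id)) return running.get(id);
    const step = stepsById.get(id);
    const deps = stepDeps(step, stepIds);
    // Wait only for this step's own dependencies, so independent steps overlap.
    const promise = Promise.all(deps.map((dep) => start(dep, [...trail, id]))).then(
      async (results) => {
        const scope = { args };
        deps.forEach((dep, index) => { scope[dep] = results[index]; });
        try {
          const [site, name] = step.run;
          return { rows: await runTap(site, name, resolveRefs(step.args ?? {}, scope)) };
        } catch (error) {
          throw new Error(`step "${id}" failed: ${error.message}`);
        }
      },
    );
    running.set(id, promise);
    return promise;
  }

  const results = await Promise.all(pipe.steps.map((step) => start(step.id, [])));
  const scope = { args };
  pipe.steps.forEach((step, index) => { scope[step.id] = results[index]; });
  return resolveRefs(pipe.return, scope);
}

// Stub sub-taps so the sketch runs offline.
async function stubRunTap(site, name, args) {
  if (name === "all") return [{ title: "a", amount: 50 }, { title: "b", amount: 500 }];
  if (name === "score") return args.rows.filter((row) => row.amount >= args.min);
  throw new Error(`tap not found: ${site}/${name}`);
}

const matchPipe = {
  steps: [
    { id: "all", run: ["bounty", "all"], args: { min: "$args.min" } },
    { id: "scored", run: ["bounty", "score"], args: { rows: "$all.rows", min: "$args.min" } },
  ],
  return: "$scored.rows",
};

runPipe(matchPipe, { min: 100 }, stubRunTap).then((rows) => console.log(rows));
```

There is no explicit topological sort in this version: each step simply awaits the promises of the steps it references, which yields the same ordering and lets unrelated steps start together. A production executor would also want timeouts and a clearer story for strings that legitimately begin with `$`.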
I want to mention the things that are on most pipeline systems' feature lists but that I deliberately didn't build:
tap.parallel([...]) helper. tap.pipe already auto-parallelizes independent steps. A one-off parallel helper is a Promise.all one-liner any JavaScript author already knows how to write. Adding it as a named helper would just be API noise.
tap.pipe in a retry loop is three lines of JavaScript and keeps the pipeline DSL minimal.
if: exists, you need switch:, then while:, then expressions. JavaScript already has all of these. If you need conditional composition, write a JavaScript function that returns different pipe shapes based on its input.
outputs: or needs: syntax.
Pipe structure: doesn't need any special support from the executor.

The guiding principle: redefine before extending. Tap's core has stayed at ~5,500 lines by asking "what should this thing be?" before adding features. The pipeline DSL grew by about 300 lines of executor plus a types file. Every feature I left out would have been a concession to YAML-pipeline muscle memory, not a real need.
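The omitted features above each reduce to a few lines of ordinary JavaScript. A sketch follows, with hypothetical helper names (withRetry, buildBountyPipe, runBoth); the only Tap calls it assumes are handle.run and handle.pipe in the shapes used earlier in this post.

```javascript
// Retry: re-run any async function up to `attempts` times, rethrowing the last error.
async function withRetry(fn, attempts = 3) {
  let lastError;
  for (let attempt = 1; attempt <= attempts; attempt++) {
    try {
      return await fn();
    } catch (error) {
      lastError = error;
    }
  }
  throw lastError;
}

// Conditional composition: a plain function returns a different pipe shape per input.
function buildBountyPipe({ scored }) {
  const steps = [
    { id: "all", run: ["bounty", "all"], args: { min: "$args.min", platform: "$args.platform" } },
  ];
  if (scored) {
    steps.push({ id: "scored", run: ["bounty", "score"], args: { rows: "$all.rows" } });
  }
  return { steps, return: scored ? "$scored.rows" : "$all.rows" };
}

// One-off fan-out: Promise.all over two handle.run calls, no dedicated helper needed.
function runBoth(handle, first, second) {
  return Promise.all([handle.run(...first), handle.run(...second)]);
}

// Possible usage inside a tap body:
//   return withRetry(() => handle.pipe(buildBountyPipe({ scored: true })));
```

Because buildBountyPipe returns plain data, its output can still be serialized, diffed, and inspected like any hand-written pipe.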
The reason nearly every pipeline system in 2026 is YAML isn't that YAML is the correct choice for pipelines. It's that when your target user is a non-developer (a sysadmin in 2006, a marketer in 2016), YAML is the only format that maintains plausible deniability that the user "isn't coding." Once you accept that your target user is actually a developer running Claude Code or Cursor, the optimization flips. JavaScript isn't just acceptable; it's strictly better, because every affordance YAML advertises is something JavaScript already does better with ten more years of tooling.
If you're designing a composition layer for a developer-focused tool in 2026, default to JavaScript (or TypeScript). Default to plain object literals, no runtime builder pattern, no class hierarchy. Declare types via a single .d.ts file that ships with the tool so JSDoc @type imports light up every editor. Validate invariants with a type system that already exists. When a user asks why you didn't pick YAML, show them what they'd be giving up.
And if you already wrote a YAML pipeline DSL — consider whether the right move is to freeze it in maintenance mode and ship a JavaScript escape hatch that your power users can graduate to. That's the path where the feature pressure comes off the YAML layer and the genuinely complex workflows live in the one format that can handle them.