
Notes from playwright-mcp #1530: named browser sessions need typed manifests

April 28, 2026 · Leon Ting · 6 min read

Why the auto-detection manifest is the load-bearing piece, not the session API

playwright-mcp issue #1530 proposes adding named, persistent BrowserContext sessions to the MCP server: session_create, session_save, session_clone, plus a manifest that auto-detects which services each saved session is logged into. The author shipped a working fork with ~3,000 lines across three core files. The maintainer pushed back, suggesting browser_storage_state primitives plus running multiple MCP servers for role isolation.

The pushback is technically fair. The interesting question — and the part the thread mostly skipped — is what happens to the manifest as it accretes consumers. Here are three engineering questions and one architectural observation, lifted from a comment I left on the issue and expanded for builders working in this space.

1. Service detection by cookie name/domain has a churn problem

The fork's service-detector.ts enumerates 50+ services. Cookie names and domains drift constantly.

When a manifest reports "logged into Google as user@x.com" and the detector silently misses an auth refresh, an agent proceeds against a session that's now anonymous. The bug is invisible at the Playwright layer; everything works, the page just isn't the page the agent thought it was.

A probe-based confirm step at session_load (one HEAD or GET to a known authenticated endpoint per service) catches this for $0.0001 per check and is the kind of thing you can't skip in production. It also gives you a natural way to retire matchers as services drift, instead of accumulating a 200-entry detector lookup that nobody trusts.
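Here is a minimal sketch of that confirm step, assuming a hypothetical manifest shape in which each detected service carries a probe URL. ServiceClaim, ProbeFn, and confirmClaims are illustrative names, not anything from the fork or from playwright-mcp. The probe is injected so the logic doesn't depend on how the request is made; in practice it would go out with the loaded session's cookies.

```typescript
// Hypothetical shapes: none of these names come from the fork.
interface ServiceClaim {
  service: string;   // e.g. "google"
  account?: string;  // identity the detector recorded at save time
  probeUrl: string;  // endpoint that only succeeds when authenticated
}

// Resolves to the HTTP status of a request sent with the session's cookies.
type ProbeFn = (url: string) => Promise<number>;

interface ConfirmResult {
  confirmed: ServiceClaim[];
  stale: ServiceClaim[];
}

// Re-check every claim in the manifest before handing the session to an agent.
async function confirmClaims(
  claims: ServiceClaim[],
  probe: ProbeFn,
): Promise<ConfirmResult> {
  const result: ConfirmResult = { confirmed: [], stale: [] };
  for (const claim of claims) {
    let ok = false;
    try {
      const status = await probe(claim.probeUrl);
      ok = status >= 200 && status < 300;
    } catch {
      ok = false; // a failed request counts as unconfirmed, not as logged in
    }
    (ok ? result.confirmed : result.stale).push(claim);
  }
  return result;
}
```

One design note: many services answer an anonymous request with a redirect to a login page, so the probe should not follow redirects. Otherwise the 200 from the login page reads as success. Claims that keep landing in stale are also the natural candidates for retiring a matcher.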

2. PID-based stale-lock detection breaks across containers

The fork's state-store.ts uses PID checks to detect stale locks: if the recorded PID no longer exists, the lock is claimable. This is correct on a single host. In a typical Spring AI or containerized deployment, ~/.playwright-sessions/ is often a host volume mounted into multiple containers, and each container has its own PID namespace.

Two containers can each see "PID 47 doesn't exist locally — stale, claim it" while the real owner is alive in the other container. A host-id (random UUID generated and persisted to the manifest at first claim) plus PID disambiguates this with very little extra code. Either fix it or be explicit in the README that the lock is host-scoped — otherwise teams hit corrupt-state bugs that take half a day to track down.
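A rough sketch of the host-id check, assuming the lock record stores the host-id next to the PID. LockRecord and classifyLock are hypothetical names, and the fork's actual on-disk format may differ. The liveness check is injected; on Node it could be built on process.kill(pid, 0), which throws when no such process exists.

```typescript
// Hypothetical lock record: the fork's actual on-disk format may differ.
interface LockRecord {
  hostId: string; // random UUID persisted once per host or container
  pid: number;    // PID of the owner inside that host's PID namespace
}

type LockVerdict = "stale" | "held" | "unknown";

// Decide whether a lock may be claimed. A PID is only meaningful
// inside the PID namespace that wrote it.
function classifyLock(
  lock: LockRecord,
  localHostId: string,
  pidAlive: (pid: number) => boolean,
): LockVerdict {
  if (lock.hostId !== localHostId) {
    // Written by another host or container: a local PID lookup proves nothing.
    return "unknown";
  }
  return pidAlive(lock.pid) ? "held" : "stale";
}
```

The important branch is "unknown". A caller should treat it as held unless some other signal says otherwise. A heartbeat timestamp in the lock file would be one option, though that is my assumption rather than something the fork does.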

3. Auto-detection isn't an "agent issue" — it's a shared-cache problem

The maintainer's framing was: agents should know which services they're authenticated with, and that's the agent's responsibility, not the MCP server's. Technically correct: an agent could re-derive this on each session load.

Operationally, the framing is backwards. The moment you have N agents (Claude Code plus Cursor plus a CI runner) sharing one ~/.playwright-sessions/ directory, manifest-side detection is a shared cache that prevents N redundant probes against live services and N divergent answers about the same bytes on disk. The information is a property of the saved session, not of any particular agent's runtime. The MCP layer is the obvious home for that cache.
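To make the shared-cache point concrete, here is a small sketch in which detection results are keyed by a digest of the saved bytes. The names (DetectionEntry, detectShared) are hypothetical, and the in-memory Map stands in for the manifest on disk. How the digest is computed is left to the caller.

```typescript
// Hypothetical cache entry; field names are illustrative.
interface DetectionEntry {
  digest: string;     // hash of the saved storage-state bytes
  services: string[]; // services detected for exactly those bytes
}

// Return the cached answer while the bytes are unchanged;
// otherwise run detection once and store the result for everyone.
function detectShared(
  cache: Map<string, DetectionEntry>,
  sessionName: string,
  digest: string,
  detect: () => string[],
): string[] {
  const hit = cache.get(sessionName);
  if (hit !== undefined && hit.digest === digest) {
    return hit.services;
  }
  const services = detect();
  cache.set(sessionName, { digest, services });
  return services;
}
```

Every agent that loads the same session name with the same digest gets the same answer, and detection runs again only when the bytes on disk change.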

Storage-state primitives alone don't address the "which identity is this?" question, which is the actual reason teams reach for sessions in the first place. The maintainer's "use two MCP servers" workaround scales to two, not to N.

The deeper gap: sessions need typed, addressable manifests

The deepest thing #1530 exposes is that "session" today means "cookies + localStorage." Once teams have named persistent contexts, the next thing they want is typed, addressable manifests, so that two agents handing a session off can verify they got the same auth scopes.

You don't have to sign anything on day one. But pinning a schema URL into manifest.json from the start saves a migration later — and gives the auto-detection table a stable contract for downstream tools that want to ingest your saved sessions without re-running detection. W3C Web Annotations are one off-the-shelf shape for this; JSON-LD contexts are another. Either is better than ad-hoc JSON, and the cost of picking one early is roughly zero compared to the cost of versioning an untyped manifest after three downstream consumers depend on it.
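As a sketch of what pinning a schema could look like, assuming a plain JSON manifest with a $schema field: the URL, the field names, and parseManifest below are all hypothetical, chosen only to illustrate the shape. The fork's manifest.json may look nothing like this.

```typescript
// Hypothetical schema URL and fields, for illustration only.
const MANIFEST_SCHEMA = "https://example.com/schemas/session-manifest/v1.json";

interface SessionManifest {
  $schema: string;
  name: string;
  services: { service: string; account?: string; scopes: string[] }[];
}

// Narrow untrusted JSON to SessionManifest, rejecting manifests that
// are unpinned or pinned to a schema this reader does not understand.
function parseManifest(raw: string): SessionManifest {
  const data: unknown = JSON.parse(raw);
  if (typeof data !== "object" || data === null) {
    throw new Error("manifest must be a JSON object");
  }
  const m = data as Record<string, unknown>;
  if (m["$schema"] !== MANIFEST_SCHEMA) {
    throw new Error(`unsupported manifest schema: ${String(m["$schema"])}`);
  }
  if (typeof m["name"] !== "string" || !Array.isArray(m["services"])) {
    throw new Error("manifest is missing name or services");
  }
  return data as SessionManifest;
}
```

A downstream tool that reads manifests this way fails loudly on a version it doesn't know, instead of silently misreading fields. That is most of what the pinned URL buys you on day one.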

Why this matters beyond playwright-mcp

Every project doing browser automation for AI agents will hit the same wall as soon as it crosses the line from "ephemeral context per call" to "persistent named session." The session API is the easy part: three weeks of work, mostly mutex bookkeeping. The manifest is the load-bearing part, and it's the one you get exactly one chance to design before you have to ship migrations.

If you're working in this space and want to compare notes, my email is in the footer. I read the fork because I've been working on the same shape from a different angle, and the cookie-detector churn problem in particular is one I'd like to see solved well in at least one canonical implementation.

