How we shipped MCP account switching in one session

Yesterday I shipped a small Octopad feature called Connected MCP clients. It's a panel in user settings that lets you see and revoke the OAuth tokens your AI assistants (Claude Code, Cursor, OpenClaw, ChatGPT) use to talk to Octopad. Click "Sign out" on a row, and the next time that assistant connects you can re-authenticate as a different Octopad account. Useful for users with multiple identities (personal + work) who want to switch without re-installing the MCP connector entirely.

The whole arc (from "I'm annoyed I can't switch users" to "shipped to production with a written spec, an 8-task implementation plan, three rounds of adversarial review, nine committed code changes, and five follow-up tasks filed for next time") happened in one session of a few hours. This post is about how that session worked, because the shape of it is more interesting than the feature itself.

One useful thing also happened at the end: when I live-tested the shipped feature against my actual MCP client (OpenClaw on my VM), it surfaced a behavior in Claude CLI that the MCP spec doesn't describe. I come back to it in the live-test section below.

What Octopad is, briefly

Octopad is a shared workspace where AI assistants and humans collaborate on the same projects: tasks with structured metadata, knowledge captured as durable facts and decisions, pages for long-form reference docs. Every AI session starts by loading the workspace's methodology and current state, then writes back what it accomplished at the end. The AI assistant doesn't lose context between sessions because the work itself lives in the workspace, not in chat history. That's the substrate this story plays out on.

Research first, design second

I started the session by stating the pain: "I can't switch which Octopad account my AI assistant is logged in as without deleting the MCP connector entirely." From there I worked through a structured intake with Claude before committing to any solution: state the problem, look at how others have solved similar ones, propose two or three approaches, pick one, write a spec.

Octopad's role at this point wasn't to orchestrate the brainstorm. It was to ground it. When the session started, Octopad had already loaded the relevant context via build_context: the product overview, the MCP-tool specs, and the related pages the assistant would need to reason about this problem. The brainstorming ran in Claude. The grounding it ran on was Octopad's.

Claude ran parallel web searches first. Within a few minutes I had concrete data:

Notion exposes user-facing controls under each agent's Tools & Access settings to remove and re-authenticate MCP connections, plus a workspace-level Disconnect All Users button for admins.
A community-maintained Linear MCP project (lmcp, by GitHub user bleugreen) ships CLI subcommands like auth switch to swap between authenticated workspaces. Linear's own MCP server delegates auth to the connecting client and doesn't include an account-switching surface.
OpenAI Codex has an open environment-isolation issue (#14330) that covers MCP account-switching as one part of a broader gap (separate auth, config, and MCP-server state per project/organization).
The MCP spec itself is silent on token revocation. RFC 7009 (OAuth Token Revocation) isn't in its list of required OAuth specs, and the spec defines no client behavior for revoking tokens, switching identities, or surfacing re-auth to users. Whatever your MCP server does about revoke, it's doing on its own.
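
Since the MCP spec leaves revocation to the server, one option is to borrow the response semantics from RFC 7009: the revocation endpoint answers 200 whether or not the token was known, and a revoked token gets a 401 on its next use. A minimal sketch under that assumption, using an in-memory store; TokenStore, StoredToken, and the method names are hypothetical and are not Octopad's implementation.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Dict, Optional


@dataclass
class StoredToken:
    user_id: str
    client_name: str
    revoked_at: Optional[datetime] = None


@dataclass
class TokenStore:
    """Hypothetical in-memory token table, keyed by the token string."""
    tokens: Dict[str, StoredToken] = field(default_factory=dict)

    def revoke(self, token: str) -> int:
        """Handle a revocation request and return the HTTP status.

        Following RFC 7009, unknown or already-revoked tokens still get 200,
        so the endpoint does not reveal which tokens exist.
        """
        record = self.tokens.get(token)
        if record is not None and record.revoked_at is None:
            record.revoked_at = datetime.now(timezone.utc)
        return 200

    def authenticate(self, token: str) -> int:
        """Return 200 for a live token, 401 for an unknown or revoked one."""
        record = self.tokens.get(token)
        if record is None or record.revoked_at is not None:
            return 401
        return 200
```

The 401 on the next request is the only signal the client receives, which is why the client's recovery behavior matters so much later in this story.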

That research was the input to the design, not a side activity.

We picked Notion's web-settings pattern as the primary surface because it works out-of-band from the chat session. You can revoke your token from your browser at any time, even if your MCP client is busy, stuck, or restarting. We also filed an in-chat MCP logout tool as a follow-up task: a complementary surface for users who want to drive the revoke from inside the chat instead of the web UI. It shipped a few sessions later. Two surfaces, not one: web for the primary path, in-chat for users already mid-session who'd rather not switch tabs.

The reasoning behind that choice (the comparison to other patterns, the alternatives we filed as follow-ups, the sources) didn't stay in the chat conversation. Claude wrote it back into Octopad as durable workspace artifacts: the comparison was written into the spec document; the deferred alternative was filed as a follow-up task with its own Why and What; the parent task's description recorded the rationale with linked sources. Three weeks from now, when someone asks "wait, why didn't we ship the in-chat tool first?", they don't need to track me down or scroll a chat transcript. They ask their AI assistant (Claude, Cursor, whichever one they're using), and it pulls the spec page and the linked tasks straight out of Octopad and gives the answer with sources. The chat itself is ephemeral; the artifacts the assistant writes back into Octopad during the session are the durable record.

A spec, written before any code

The brainstorm ended in a spec: a markdown document covering architecture, data model, server actions, UI, error handling, edge cases, security, and testing. Not a sketch. A finished doc, around 400 lines, saved as an Octopad page in the Engineering folder. Future tasks against this feature link straight to the spec, so anyone picking up related work later (a follow-up, a bug fix, an extension) finds the design rationale first and doesn't have to grep the codebase to reconstruct it.

Implementation, with adversarial review on the way in and out

The spec became an eight-task plan: migration, RLS tests, server-side last_used_at debouncing, a forensics log breadcrumb, server actions, UI, integration verification, and close-out. Each task carries its own Why, What, acceptance criterion, and dependency wiring. The plan was the task tree; I didn't write it down separately.
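
The last_used_at debouncing is the least obvious item on that list, so here is a minimal sketch of the idea: only write the timestamp when the stored value is older than some window, so a chatty client doesn't cause a database write on every request. The five-minute window, the dict-shaped row, and the function name touch_last_used are assumptions for illustration; the real interval and schema aren't stated here.

```python
from datetime import datetime, timedelta, timezone

# Assumed window for illustration; the actual interval is not given in this post.
DEBOUNCE_WINDOW = timedelta(minutes=5)


def touch_last_used(row: dict, now: datetime,
                    window: timedelta = DEBOUNCE_WINDOW) -> bool:
    """Update row["last_used_at"] only if the stored value is stale.

    Returns True when a write happened, False when it was skipped.
    """
    previous = row.get("last_used_at")
    if previous is not None and now - previous < window:
        return False  # recent enough: skip the write
    row["last_used_at"] = now
    return True
```

The trade-off is precision: the panel can show a timestamp that is up to one window old, which is fine for a "last used" column.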

The migration (the column grant and policies underpinning the whole feature) got the full subagent treatment: a fresh AI agent in its own context, plus a two-stage reviewer pair behind it (spec compliance, then code quality). The other seven tasks I did directly. Either way, the work opened with the spec, the related tasks, and the parent's rationale already loaded by Octopad. No "here's what we're doing, the spec is over there, let me catch you up." Just the work, with the context already in place.

Pre-merge, three parallel reviewers read the committed code from three different angles. The most useful catch: the revoke-all-clients action had no explicit user_id filter. It relied entirely on RLS for scoping. If RLS ever regressed, the query would mass-revoke every user's tokens in the system. A one-line fix; an embarrassing miss. The reviewer who caught it was reading the action with the spec page and the prior auth-table migration both in front of it. A reviewer working out of an unrelated PR window probably wouldn't have made the lateral connection. Each of the three catches became its own filed follow-up task, not a "we should also…" in a code comment.
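
The shape of that one-line fix can be sketched as follows. This uses SQLite, which has no RLS at all, so it stands in for the "RLS regressed" case: the explicit user_id filter is the only thing scoping the update. The oauth_tokens table, its columns, and both function names are hypothetical, not Octopad's actual schema or action.

```python
import sqlite3


def make_demo_db() -> sqlite3.Connection:
    """Build an in-memory table with tokens for two users (demo data only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE oauth_tokens ("
        "id INTEGER PRIMARY KEY, user_id TEXT NOT NULL, "
        "client_name TEXT NOT NULL, revoked_at TEXT)"
    )
    conn.executemany(
        "INSERT INTO oauth_tokens (user_id, client_name) VALUES (?, ?)",
        [("alice", "Claude Code"), ("alice", "Cursor"), ("bob", "ChatGPT")],
    )
    conn.commit()
    return conn


def revoke_all_clients(conn: sqlite3.Connection, user_id: str) -> int:
    """Revoke every active token for one user; returns the number revoked.

    The explicit user_id predicate is the defense in depth: even with no
    row-level policy in place, other users' tokens are untouched.
    """
    cur = conn.execute(
        "UPDATE oauth_tokens SET revoked_at = CURRENT_TIMESTAMP "
        "WHERE user_id = ? AND revoked_at IS NULL",
        (user_id,),
    )
    conn.commit()
    return cur.rowcount
```

Dropping the user_id predicate from that WHERE clause is all it takes to turn "sign out of my clients" into "sign out everyone".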

The live test

I deployed to production, then live-tested with my actual OpenClaw setup. The web UI worked perfectly. The token revoke worked. The next message in OpenClaw should have triggered fresh OAuth.

It didn't.

The MCP spec says clients should recover from a 401 by re-running the OAuth flow. Claude CLI (the MCP client inside OpenClaw) was clearing its cached access token on the 401 and stopping. No re-auth prompt, no fresh login URL. The user was stuck.

The workaround was mechanical once I knew where to look: delete the cached credentials entry and restart the container. The next message then triggers a fresh OAuth flow properly.
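
For contrast, the recovery I expected from the client looks roughly like this. It is a minimal sketch, assuming a client that caches one access token and retries a request once after re-authorizing; McpClient, call_server, and authorize are hypothetical names for illustration and say nothing about Claude CLI's or OpenClaw's internals.

```python
from typing import Callable, Optional


class McpClient:
    """Hypothetical client that re-runs authorization after a 401."""

    def __init__(self,
                 call_server: Callable[[str, str], int],
                 authorize: Callable[[], str]) -> None:
        self.call_server = call_server  # (token, message) -> HTTP status
        self.authorize = authorize      # runs the OAuth flow, returns a token
        self.cached_token: Optional[str] = None

    def send(self, message: str) -> int:
        if self.cached_token is None:
            self.cached_token = self.authorize()
        status = self.call_server(self.cached_token, message)
        if status == 401:
            # The token was revoked or expired: drop it, authorize again,
            # and retry once. Stopping after the drop is the stuck state.
            self.cached_token = None
            self.cached_token = self.authorize()
            status = self.call_server(self.cached_token, message)
        return status
```

The single retry is deliberate: if the fresh token is also rejected, the status goes back to the caller and the client doesn't loop through the login flow.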

What worked, what I'd do again, what I'd change

Worked. Doing the research before the design meant concrete data about Notion, Linear, and Codex shaped the design instead of being bolted onto a pre-made one. Pre-merge adversarial review caught structural problems before they shipped. Per-task scoping (subagent-driven for the load-bearing migration, direct execution for the other seven tasks) kept the working context clean across all eight tasks. Filing follow-ups as referenced tasks (not as prose buried in comments) kept the task graph honest.

Would do again. Three parallel adversarial reviewers, not one. Each one read the code from a different angle. The marginal cost is small, and the angle diversity is substantial.

Would change. The Claude CLI bug should have been caught before deploying, not after. The fix is structural: file the live-test step as a tracked subtask in Octopad, with the parent feature unable to close until that subtask is done. Then forgetting the test isn't possible. As it was, I kept it as a mental to-do and shipped before I got to it.
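
The gating rule is simple enough to sketch. This is a minimal illustration, assuming a task that refuses to close while any subtask is open; the Task class and its fields are hypothetical and are not Octopad's data model.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Task:
    title: str
    done: bool = False
    subtasks: List["Task"] = field(default_factory=list)

    def close(self) -> None:
        """Mark the task done, refusing while any subtask is still open."""
        open_titles = [t.title for t in self.subtasks if not t.done]
        if open_titles:
            raise ValueError(
                f"cannot close {self.title!r}: open subtasks {open_titles}"
            )
        self.done = True
```

With the live test filed as a subtask, closing the parent feature early raises an error instead of relying on memory.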

What stays

A few hours of active time produced: a working production feature, a 400-line spec, an 8-task implementation plan, nine committed code changes, a manual integration verification, five filed follow-up tasks, two documentation pages, and this blog post. All of those artifacts (not the chat transcript that produced them, but the structured artifacts the assistant wrote back during the session) live in the workspace where the work happened. They are searchable, with the full chain of reasoning intact, ready for the next session or the next teammate to pick up exactly where this one left off.

A few hours, and the next person to touch this feature doesn't need me in the room. The feature itself is mostly the worked example.