Why Personal Agent OS is Inevitable
On March 13, I wrote a design doc for what I called a “Personal Agent OS” — an always-on agent that serves you when needed and works for itself when idle. Eighteen days later, Anthropic accidentally leaked Claude Code’s entire source code via an npm source map, revealing an unreleased feature called KAIROS that does almost exactly the same thing.
I didn’t know about KAIROS. Anthropic didn’t know about my design doc. We converged independently — and the places where we diverged are more interesting than where we matched.
What Leaked
On March 31, 2026, security researcher Chaofan Shou discovered that Anthropic’s @anthropic-ai/claude-code npm package (v2.1.88) shipped with a source map pointing to the full unminified TypeScript source. Anthropic pulled it within hours. By then the code was everywhere — a clean-room Rust rewrite called claw-code appeared almost immediately.
I’m not going to rehash what’s inside — ccleaks.com, Alex Kim’s analysis, and VentureBeat’s reporting cover the details. What I want to talk about is the architectural convergence.
The Convergence Map
Here’s what I designed on March 13, next to what Anthropic built (as revealed March 31):
| Concept | My Design (March 13) | KAIROS (March 31 reveal) |
|---|---|---|
| Core idea | Always-on agent, two queues: serve user + earn from network | Always-on daemon, two modes: normal + proactive |
| Idle behavior | Pick up tasks from agent network, earn micro-payments | autoDream: consolidate memory while user is idle |
| Proactive triggers | Morning brief, goal-driven scheduler, study drills | Periodic `<tick>` prompts, 15-second blocking budget |
| Memory | MEMORY.md as lightweight index, always in context | MEMORY.md as lightweight index (~150 char/line), always loaded |
| Cross-session persistence | File-based memory, session handoffs | Append-only daily logs, autoDream consolidation |
| Single inbox | All information arrives in one place | UDS Inbox: Unix socket IPC, peer discovery via ~/.claude/sessions/ |
| Cross-device access | Mosh + Tailscale + cc-hub | Bridge: claude remote-control, polling→WebSocket |
| Multi-agent | Agent roster with roles, fractal delegation | Coordinator Mode: XML protocol, scratch directory isolation |
| Goal system | Goal schema with skills, deadlines, gap tracking | Not present |
| Agent economy | A2A/AP2 protocols, capability config, idle-time earning | Not present |
Eight out of ten concepts match. The two that don’t — goal-driven learning and agent economy — are the ones I added because I’m building for a single user, not a product company.
What They Have That I Don’t
autoDream
KAIROS includes a memory consolidation pipeline that runs when the user is idle:
- Orient — survey what’s in memory
- Gather — collect scattered observations across sessions
- Consolidate — merge, remove contradictions, convert vague insights into concrete facts
- Prune — remove stale entries
I hadn’t designed anything like this. My memory system relies on manual curation — I write memory files, I update the index, I notice when things go stale. Honestly, I’ve already hit the pain point: I have 130 lines in MEMORY.md and some entries are months old, describing projects I’ve dropped or decisions I’ve reversed. I kept telling myself “I’ll clean it up this weekend.” I never did. autoDream solves this the right way — make the machine do it.
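The four steps map naturally onto a pipeline, so here is the shape I plan to build, as a deliberately crude Python sketch. To be clear about what's mine and what's theirs: the step names come from the leak's description, but the logic below is my guess at an implementation. Recency-based merging stands in for whatever LLM-driven consolidation Anthropic actually does, and the 60-day staleness cutoff is my own choice, not from the leak.

```python
from collections import defaultdict
from datetime import datetime, timedelta

STALE_DAYS = 60  # assumption: my own cutoff, not from the leaked code

def orient(entries):
    """Step 1: survey memory -- how much is there and how old it is."""
    ages = [e["updated"] for e in entries]
    return {"count": len(entries), "oldest": min(ages, default=None)}

def gather(entries):
    """Step 2: collect scattered observations, grouped by topic."""
    by_topic = defaultdict(list)
    for e in entries:
        by_topic[e["topic"]].append(e)
    return by_topic

def consolidate(by_topic):
    """Step 3: per topic, keep only the newest entry. Recency is a crude
    stand-in for real contradiction-resolution (newer beats older)."""
    return [max(group, key=lambda e: e["updated"]) for group in by_topic.values()]

def prune(entries, now):
    """Step 4: drop entries untouched for longer than STALE_DAYS."""
    cutoff = now - timedelta(days=STALE_DAYS)
    return [e for e in entries if e["updated"] >= cutoff]

def auto_dream(entries, now):
    orient(entries)  # in a real run, the survey would steer what to gather
    return prune(consolidate(gather(entries)), now)
```

Run against my actual MEMORY.md pain point, this would collapse the "I reversed that decision in March" duplicates and silently drop the projects I abandoned last year.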
Context Engineering
The leaked source has multiple systems fighting context degradation: intelligent history summarization (CONTEXT_COLLAPSE), compression of older exchanges (HISTORY_SNIP), and performance maintenance across long sessions (CACHED_MICROCOMPACT). Most impressive is their prompt cache break detection — according to Alex Kim’s analysis, it tracks 14 vectors that invalidate the cache, with “sticky latches” to prevent mode toggles from wasting tokens.
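The sticky-latch idea is worth spelling out, because it's subtle. Here is a minimal sketch of what I take the mechanism to be; the class, the vector names, and the rebuild semantics are my reconstruction from Alex Kim's description, not the leaked implementation:

```python
class CacheBreakDetector:
    """Tracks prompt-prefix inputs that invalidate the provider's cache.
    'Sticky latch': once a vector changes, it stays flagged until the cache
    is explicitly rebuilt -- toggling a mode back does NOT unflag it, so a
    flip-flop can't trick you into thinking the cache is still warm."""

    def __init__(self, vectors):
        self.baseline = dict(vectors)  # snapshot at the last cache build
        self.latched = set()

    def observe(self, vectors):
        """Compare current inputs against the baseline; returns True if the
        next request will miss the cache."""
        for name, value in vectors.items():
            if value != self.baseline.get(name):
                self.latched.add(name)
        return bool(self.latched)

    def rebuild(self, vectors):
        """Call after paying the cache-write cost; clears all latches."""
        self.baseline = dict(vectors)
        self.latched.clear()
```

The design choice the latch encodes: a cache break is a one-way door per build cycle. Reverting a toggle doesn't restore the old prefix bytes on the provider side, so pretending it does would just waste tokens twice.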
I’ve written about the theory and built a few heuristics, but their engineering is far ahead of mine. Prompt-cache economics is the kind of thing you only learn by running millions of sessions.
Security Depth
According to ccleaks.com, the leaked bashSecurity.ts has 23 numbered security checks — defense against Zsh equals expansion, Unicode zero-width space injection, IFS null-byte injection, and a HackerOne-discovered bypass. My permission model is “trust myself.” Different problem, but the engineering is impressive.
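For flavor, here is what two of those checks might look like in spirit. This is a toy Python sketch of my own, not a port: the real checks are TypeScript, there are 23 of them, and I haven't reproduced their actual logic.

```python
# Assumption: my own toy versions of checks in the spirit of what
# ccleaks.com describes (zero-width injection, null-byte smuggling).
ZERO_WIDTH = {"\u200b", "\u200c", "\u200d", "\ufeff"}

def check_command(cmd: str) -> list[str]:
    """Return a list of flagged issues for a shell command string."""
    issues = []
    if any(ch in ZERO_WIDTH for ch in cmd):
        # invisible characters can smuggle a second command past a reviewer
        issues.append("zero-width character injection")
    if "\x00" in cmd:
        issues.append("embedded null byte")
    if "IFS=" in cmd:
        # reassigning the field separator can split 'safe' strings into args
        issues.append("IFS reassignment")
    return issues
```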
What I Have That They Don’t
Agent Economy
KAIROS is a daemon that serves one user. When idle, it dreams — useful, but it doesn’t earn. My design has two queues:
```yaml
sell_to_network:
  enabled: true
  capabilities:
    - essay_scoring: { languages: [en, zh], price: $0.01 }
    - code_review: { languages: [python, ts], price: $0.03 }
    - summarization: { price: $0.005 }
  idle_hours: [22:00-07:00, 09:00-12:00]
```

The agent covers its own API costs. The infrastructure is arriving: Google’s A2A protocol handles agent-to-agent discovery and delegation, and payment protocols like AP2 are being built by Coinbase and PayPal. These aren’t hypothetical — agent marketplaces with live settlement already exist.
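A minimal dispatcher for that config is easy to sketch. Everything below is my own scaffolding, not A2A or AP2 code; note the one fiddly bit, which is that `22:00-07:00` wraps past midnight:

```python
from datetime import time

def parse_window(s):
    """Parse '22:00-07:00' into a (start, end) pair of time objects."""
    a, b = s.split("-")
    h1, m1 = map(int, a.split(":"))
    h2, m2 = map(int, b.split(":"))
    return time(h1, m1), time(h2, m2)

def in_idle_window(now, windows):
    for w in windows:
        start, end = parse_window(w)
        if start <= end:
            if start <= now < end:
                return True
        elif now >= start or now < end:  # window wraps past midnight
            return True
    return False

def pick_task(tasks, capabilities, now, windows):
    """During an idle window, take the best-paying task we can actually do."""
    if not in_idle_window(now, windows):
        return None
    doable = [t for t in tasks if t["kind"] in capabilities]
    return max(doable, key=lambda t: t["price"], default=None)
```

The policy choice baked in here (greedy on price) is deliberately dumb; a real router would weigh reputation, deadline, and expected token cost against the fee.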
Anthropic won’t build this because they’re a model provider — they want you to burn tokens. A personal agent that earns its own keep has fundamentally different incentives.
Goal-Driven Learning
KAIROS is reactive — it responds to `<tick>` prompts and consolidates what happened. My system is goal-driven:

```
GOAL → meta-research (HOW to learn this)
     → decompose into skill tree
     → gap tracker (current vs target per skill)
     → scheduler (what's due today)
     → router (broadcast task to best agent/service)
```
The agent doesn’t just remember — it plans. “You’ve carded 15 items about distributed systems this week. Make it a goal?” That’s recommendation-system logic applied to personal development, not just memory management.
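The gap tracker and scheduler stages of that pipeline can be sketched concretely. The schema is hypothetical (field names and the priority rule are mine, illustrated with IELTS-flavored skills):

```python
from dataclasses import dataclass

@dataclass
class Skill:
    name: str
    current: float           # current assessed level
    target: float            # goal level
    last_drilled_days_ago: int

def gaps(skills):
    """Gap tracker: distance from target for every unfinished skill."""
    return {s.name: s.target - s.current for s in skills if s.current < s.target}

def due_today(skills, max_items=2):
    """Scheduler: prioritise the widest gaps, break ties by staleness."""
    open_skills = [s for s in skills if s.current < s.target]
    open_skills.sort(
        key=lambda s: (s.target - s.current, s.last_drilled_days_ago),
        reverse=True,
    )
    return [s.name for s in open_skills[:max_items]]
```

The point of keeping this as data rather than prose in a memory file is exactly the recommendation-system move: once gaps are numbers, "what should I drill today" is a sort, not a judgment call.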
Evidence-Based Meta-Research
When you set a goal like “IELTS writing 7→8.5,” my agent researches how to improve, not just what to practice. It discovers that Task 2 weighting matters more than Task 1, that examiner calibration practice beats essay volume. When progress plateaus, it re-researches methodology. KAIROS doesn’t do this — it’s infrastructure, not pedagogy.
Why This Keeps Happening
I didn’t set out to copy Anthropic. I was annoyed that my agent sat idle 90% of the day, burning zero tokens while I was in class or sleeping. So I designed a daemon with two queues. Turns out Anthropic’s engineers had the same annoyance — KAIROS is their answer.
The convergence happened because the constraints force your hand. LLMs are stateless, so you need persistent memory. Context windows degrade, so you need compression. Users leave, so background mode is obvious. And once you try orchestrating agents with code, you realize natural language instructions work better — their coordinator tells workers “Do not rubber-stamp weak work,” which is a system prompt, not a function call. I arrived at the same pattern with my agent roster.
These aren’t design choices. They’re the physics of the medium.
What I’m Doing Next
The open question isn’t what to build — the architecture is converging. It’s who captures the economic layer. Anthropic builds for millions of users. I’m building for one user whose agent earns its own keep.
Concretely: I’m implementing autoDream for my memory system this month — Orient/Gather/Consolidate/Prune, triggered when session count crosses a threshold. And I’m studying the leaked coordinator protocol to improve my agent delegation. The source code is the best textbook on production agent systems anyone has published, even if it was published by accident.
April 2026