CellState Roadmap
Last updated: March 12, 2026 — v0.5.5
CellState is a Rust runtime and API server that gives AI agents persistent, hierarchical memory with forensic event tracking. PostgreSQL 18 is the brain. Every state change flows through a deterministic mutation pipeline and produces a tamper-evident cryptographic receipt.
This roadmap describes where the project is, where it’s going, and what each stage means. For the versioning philosophy behind these stages, see VERSIONING.md.
Where We Are: v0.5.5
v0.5.5 shipped March 12, 2026 — CLI full-fidelity (25 entity commands), core type extraction, Convex SDK hardening, embedding dimension fixes.
What’s live right now:
- 7-crate Rust workspace, compiling to a single binary
- 4-stage mutation pipeline (Assemble → Gates → Commit → Receipt) with 11 gate checks
- Immutable Event DAG with Blake3 hash chains and UUIDv7 causal ordering
- Full entity hierarchy: Tenant → Agent → Trajectory → Scope → Turn, plus Artifacts and Notes
- 57 PostgreSQL tables, 85+ REST endpoints
- Basic MCP server
- WebSocket event broadcast and SSE streaming
- LMDB cache layer for sub-millisecond reads
- 18 background jobs
- 2,001 passing tests
- 48 Prometheus metrics, OTLP observability
- Live in production on bare metal (Linode Newark + Cloudflare Tunnel)
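To make the "tamper-evident" property of a hash-chained event log concrete, here is a minimal sketch of the idea. It is not CellState's implementation: the `Event` struct, `digest`, `append`, and `verify` are hypothetical names, the chain is linear rather than a DAG, and the standard library's `DefaultHasher` stands in for Blake3 only so the example has no dependencies. `DefaultHasher` is not a cryptographic hash and would not be suitable for real receipts.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// One link in the chain. Field names are illustrative, not CellState's schema.
#[derive(Debug, Clone)]
struct Event {
    payload: String,
    parent_hash: u64, // hash of the previous event (0 for the first event)
    hash: u64,        // digest over (parent_hash, payload)
}

/// Stand-in digest. A real chain would use a cryptographic hash such as Blake3;
/// DefaultHasher is NOT collision-resistant and is used only to stay std-only.
fn digest(parent_hash: u64, payload: &str) -> u64 {
    let mut hasher = DefaultHasher::new();
    parent_hash.hash(&mut hasher);
    payload.hash(&mut hasher);
    hasher.finish()
}

/// Append an event whose hash commits to the previous event's hash.
fn append(chain: &mut Vec<Event>, payload: &str) {
    let parent_hash = chain.last().map_or(0, |e| e.hash);
    let hash = digest(parent_hash, payload);
    chain.push(Event {
        payload: payload.to_string(),
        parent_hash,
        hash,
    });
}

/// Recompute every link; any edited payload or broken parent pointer fails.
fn verify(chain: &[Event]) -> bool {
    let mut expected_parent = 0u64;
    for event in chain {
        let recomputed = digest(event.parent_hash, &event.payload);
        if event.parent_hash != expected_parent || event.hash != recomputed {
            return false;
        }
        expected_parent = event.hash;
    }
    true
}
```

Because each hash covers its parent's hash, changing any earlier payload invalidates that event and everything downstream of it, which is what makes after-the-fact edits detectable.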
v0.5.5 is a buffer release for ad-hoc optimization, sanity checks, and foundation work before v0.6. No fixed scope — it ships when the foundations feel solid.
What was built in v0.5.3–v0.5.5 (in progress)
18 PRs wiring up every typed-but-unimplemented behavior in the codebase:
- MCP: Prompts capability, logging capability, completion/autocomplete, vector-backed note search
- A2A: Full task state machine (submit → working → completed/failed/canceled), real SSE streaming with single-fetch polling
- Conflict resolution: ContradictionGate dispatches 4 strategies (LastWriteWins, HighestConfidence, Escalate, None)
- Artifact promotion: Child trajectory artifacts auto-promote to parent scope on outcome report; delegation completion triggers promotion
- Summarization chains: Auto-triggering of L0→L1→L2 policies with threshold evaluation and pipeline routing
- Protocol surfaces: AG-UI SSE endpoint, A2UI mutations + subscriptions with tenant isolation and bootstrap snapshots
- Module isolation: Enforced rule that the protocol layer does not import from routes
- HNSW indexes: Uncommented and wired with V57 migration
- OpenAPI: 143 of 196 endpoints annotated and registered
- Wire contract tests: 8 JSON fixtures, TypeScript + Python SDK test expansion
- Benchmarks: Criterion context assembly benchmarks, k6 load test skeleton
- Lock contention tests: Concurrent acquire, TTL expiry, 10-way contention
- MCP integration test: End-to-end initialize → tools/list → tools/call
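The A2A task lifecycle listed above (submit → working → completed/failed/canceled) can be sketched as a small state machine. This is an illustrative sketch, not the server's actual types: `TaskState`, `transition`, and `is_terminal` are hypothetical names, and the rule that only the listed arrows are legal (for example, no direct submitted → canceled) is an assumption.

```rust
/// Hypothetical task states mirroring the lifecycle named in the roadmap.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum TaskState {
    Submitted,
    Working,
    Completed,
    Failed,
    Canceled,
}

impl TaskState {
    /// Terminal states accept no further transitions.
    fn is_terminal(self) -> bool {
        matches!(
            self,
            TaskState::Completed | TaskState::Failed | TaskState::Canceled
        )
    }

    /// Return the next state if the move is legal, or an error describing why not.
    /// Assumption: only the arrows listed in the roadmap are allowed.
    fn transition(self, next: TaskState) -> Result<TaskState, String> {
        use TaskState::*;
        match (self, next) {
            (Submitted, Working)
            | (Working, Completed)
            | (Working, Failed)
            | (Working, Canceled) => Ok(next),
            _ => Err(format!("illegal transition: {:?} -> {:?}", self, next)),
        }
    }
}
```

Encoding the legal moves in a single `match` keeps illegal transitions out of every call site, and a rejected transition carries both states in its error message.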
v0.6.0 — “It Works”
Theme: Every typed behavior is wired. The API does what it claims.
This is the shipment of the v0.5.3–v0.5.5 work as a tagged, verified release. Philosophy: “use what we have before we bolt shit on.”
When this ships:
- Full MCP protocol coverage (prompts, logging, completion, vector search)
- A2A task lifecycle functional end-to-end
- All 4 conflict resolution strategies operational
- AG-UI and A2UI protocol surfaces functional server-side
- Summarization chains auto-trigger across abstraction levels
- Artifact promotion works across trajectory hierarchy
- OpenAPI spec covers core endpoints
- TypeScript and Python SDK wire contracts validated
What v0.6 does NOT ship: test coverage for the full DB layer, production error quality, documentation, API freeze.
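As an illustration of what "all 4 conflict resolution strategies operational" could mean in code, here is a hedged sketch of strategy dispatch. Only the four strategy names come from this roadmap. The `Claim` and `Resolution` types, the `resolve` function, the tie-breaking rules, and the meaning given to `None` (leave the conflict unresolved) are all assumptions made for the example.

```rust
/// Hypothetical conflicting value with the metadata the strategies need.
#[derive(Debug, Clone, PartialEq)]
struct Claim {
    value: String,
    confidence: f64, // assumed range 0.0..=1.0
    written_at: u64, // assumed monotonic timestamp
}

/// The four strategy names from the roadmap.
#[derive(Debug, Clone, Copy)]
enum Strategy {
    LastWriteWins,
    HighestConfidence,
    Escalate,
    None,
}

/// Hypothetical outcome type for the sketch.
#[derive(Debug, PartialEq)]
enum Resolution {
    Keep(Claim),
    Escalated,  // handed to a human or supervising agent (assumption)
    Unresolved, // both claims left in place (assumption)
}

fn resolve(strategy: Strategy, existing: Claim, incoming: Claim) -> Resolution {
    match strategy {
        // Assumption: ties go to the incoming write.
        Strategy::LastWriteWins => {
            if incoming.written_at >= existing.written_at {
                Resolution::Keep(incoming)
            } else {
                Resolution::Keep(existing)
            }
        }
        // Assumption: ties keep the existing claim.
        Strategy::HighestConfidence => {
            if incoming.confidence > existing.confidence {
                Resolution::Keep(incoming)
            } else {
                Resolution::Keep(existing)
            }
        }
        Strategy::Escalate => Resolution::Escalated,
        Strategy::None => Resolution::Unresolved,
    }
}
```

A `match` over the strategy enum means the compiler flags any call site that forgets to handle a newly added strategy.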
v0.7.0 — “Prove It” (Test Coverage)
Theme: Build the regression net. You can’t safely refactor 641 error-handling sites without tests catching regressions.
A codebase audit revealed: 78% of DB modules have zero unit tests, 65% of route handlers have zero or trivial tests, and 13 of 22 background jobs have minimal coverage. The happy paths work; the error paths are unproven.
What ships:
- Auth + tenant DB tests — api_key authenticate/rotate/revoke, tenant member lifecycle (security-critical)
- A2A + coordination lifecycle tests — full state machine with real DB, lock contention, delegation lifecycle
- Agent + BDI persistence tests — belief/goal/plan storage, checkpoint save/load
- Infrastructure DB tests — working set, pack config, summarization, deployment, tool execution (75 functions)
- Route handler tests — search, config, models, event DAG, summarization CRUD (15 zero-coverage endpoints)
- Job error path tests — OAuth refresh failure, tenant lifecycle, MCP error scenarios
- Load test baselines — k6 authenticated CRUD + context assembly, 50 VUs 5 min, committed baselines, CI gate (p95 regression > 20% = fail)
When this ships, every critical code path has at least one test, and performance regressions are automatically caught.
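The CI gate described above (fail when p95 regresses by more than 20% against a committed baseline) reduces to a small comparison. The sketch below is one way to express it, not the project's actual tooling: `percentile` and `p95_gate` are hypothetical helpers, and the nearest-rank percentile method is an assumption (k6 computes its own percentiles).

```rust
/// Nearest-rank percentile over latency samples in milliseconds.
/// Returns None for an empty sample set.
fn percentile(samples: &[f64], pct: f64) -> Option<f64> {
    if samples.is_empty() {
        return None;
    }
    let mut sorted = samples.to_vec();
    sorted.sort_by(|a, b| a.total_cmp(b));
    // Nearest-rank: smallest value with at least pct% of samples at or below it.
    let rank = (pct * sorted.len() as f64 / 100.0).ceil() as usize;
    Some(sorted[rank.clamp(1, sorted.len()) - 1])
}

/// Returns true when the run passes the gate, i.e. the current p95 is within
/// `max_regression` (0.20 for 20%) of the committed baseline.
fn p95_gate(baseline_p95: f64, current_p95: f64, max_regression: f64) -> bool {
    current_p95 <= baseline_p95 * (1.0 + max_regression)
}
```

With a 100 ms baseline and a 0.20 threshold, a 119 ms run passes and a 121 ms run fails.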
v0.8.0 — “Harden It” (Error Quality)
Theme: Stop lying to operators. Replace every generic error message with context that helps debug production. Safe to do because v0.7 tests catch regressions.
A codebase audit revealed: 371 ApiError::internal_error calls, 86% with generic messages like “Entity deserialization failed” (repeated 68 times in one file). 100+ production unwrap() calls that will panic. 170 silently swallowed errors in the storage layer with zero logging.
The error architecture is solid (40-type ErrorCode enum, proper response shape, production redaction). The content is garbage.
What ships:
- generic.rs contextualization — 70 identical error messages replaced with entity type + field context
- driver.rs error quality — 43 generic errors + 41 silent let _ = patterns replaced with structured messages and debug logging
- Storage layer visibility — 19 silent delete/evict failures now logged (cache corruption signals, eviction contention)
- Production unwrap elimination — zero unwrap() calls in non-test server code
- CHANGELOG.md — backfilled from v0.5.0 through v0.8.0
When this ships, a production error log entry tells you what went wrong, not just that something did.
v0.9.0 — “Document It” (API Surface)
Theme: The API is stable, tested, and well-errored. Now make it discoverable.
A codebase audit revealed: 2,960 public items across all crates, 8.5% with doc comments. 53 HTTP endpoints missing from the OpenAPI spec. The Rust SDK README is 49 lines.
What ships:
- OpenAPI 100% coverage — all 196 endpoints annotated and registered
- Core types documentation — every pub struct, pub enum, pub trait in cellstate-core has /// comments
- Server public API documentation — route handlers, pipeline stages, middleware, request/response types (target: 40%+ from 8.5%)
- SDK documentation — Rust SDK comprehensive README, Python/TypeScript API reference sections
- A2A SSE push via LISTEN/NOTIFY — replace 1-second polling with DB-native push (the one optimization in this release, because documenting a polling API as the contract would set a bad precedent)
When this ships, someone reading the docs can use CellState without reading the source.
v1.0.0 — “The Contract”
1.0 is not a marketing event. It’s a promise: this API will not break under you without a major version bump.
Requirements:
- Public API surface frozen — every path, type, error code in a versioned OpenAPI spec committed to the repo. CI check: any spec drift = fail.
- Schema migration CLI — cellstate migrate status, cellstate migrate up, cellstate migrate validate. Operators can upgrade without hoping.
- External validation — at least one team outside core running CellState for 30+ days
- SDK reference docs — TypeScript, Python, Rust all published with getting-started examples
What we decided NOT to do before 1.0:
- Crate decomposition (module isolation is enforced; separate crates are premature until boundaries prove stable)
- MCP JSON-RPC transport (HTTP POST works; JSON-RPC is a v1.x addition)
- MCP SSE transport (same reasoning)
- Hosted service (post-1.0)
After 1.0:
- Releases are milestone-based, not calendar-based
- Patches ship as needed (same-day for security)
- Minor versions ship when a coherent capability set lands
- Major versions only when breaking changes are genuinely necessary
- Deprecated items survive at least two minor versions before removal
Beyond 1.0: Directions, Not Promises
These are areas of interest, not commitments. They’ll become concrete milestones when the foundation is stable enough to build on.
- CellState Hosted — managed service where the runtime lives on owned infrastructure and agent shells deploy to edge. The Webflow-for-agents model.
- Crate decomposition — if v0.9 module boundaries held, split cellstate-server into protocol-aligned crates (cellstate-mcp, cellstate-a2a, cellstate-agui). If not, document why and defer.
- Pack Editor as CellState agent — the configuration tool itself runs on CellState, using its own working sets and event DAGs. Full dogfooding.
- cellstate-rs crate — published Rust crate for agent developers who want typed state machines, event DAGs, and mutation pipelines without running the full server.
- Ecosystem integrations — first-class connectors for major agent frameworks, LLM providers, and orchestration platforms.
How to Follow Along
- Changelog: See CHANGELOG.md for the structured record of every change
- Releases: GitHub Releases include human-written notes explaining what each version means
- Versioning philosophy: See VERSIONING.md for the full framework behind these stages