Release Audit v0.5.5
0.5.5 Release Audit Ledger
This pass focused on concrete public-release blockers found while auditing the recent 0.5.5 release train and the current release path.
Findings
Blocker, Fixed: release preflight treated Fly templates as release-critical input
- Evidence: `bash ./scripts/release/preflight.sh v0.5.5 --skip-lint --skip-openapi --skip-package-dry-runs` originally failed on missing root `fly.api.toml`.
- Evidence: the same script used a stricter Python `__version__` regex than the release workflow and failed against the current `packages/python/cellstate/__init__.py`.
- Remediation: `scripts/release/preflight.sh` no longer treats Fly configs as release-critical, uses the same whitespace-tolerant Python version regex as release, and also validates `packages/convex/package.json`.
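To illustrate what a whitespace-tolerant `__version__` match can look like, here is a minimal Python sketch. The pattern and the helper name `read_python_version` are assumptions for illustration; the actual regex shared by preflight and the release workflow is not quoted in this ledger.

```python
import re
from typing import Optional

# Hypothetical pattern: allows any spacing around "=" and either quote style.
VERSION_RE = re.compile(r"""^__version__\s*=\s*["']([^"']+)["']""", re.MULTILINE)


def read_python_version(source: str) -> Optional[str]:
    """Return the first __version__ value found in module source, or None."""
    match = VERSION_RE.search(source)
    return match.group(1) if match else None
```

A stricter pattern that insists on exactly one space around `=` would reject `__version__="0.5.5"`, which is the kind of mismatch between two checkers that the finding describes.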
Blocker, Fixed: installer and release artifacts were on different contracts
- Evidence: `scripts/install.sh` pointed at `heyoub/cellstate`, looked for `cellstate-${version}-${target}` assets, claimed universal platform support, and claimed SHA256 verification without any checksum publishing or verification.
- Evidence: `.github/workflows/release.yml` only published `cellstate-server-linux-amd64-${tag}.tar.gz`.
- Remediation: the installer now normalizes `X.Y.Z` and `vX.Y.Z`, points at the actual GitHub repo, targets the published Linux `amd64` artifact, downloads and verifies a published `.sha256`, and fails clearly for unsupported platforms instead of pretending support exists.
- Remediation: the release workflow now generates and uploads a SHA256 checksum beside the Linux binary and the release body now leads with the checksum-verified extraction flow instead of an unchecked `curl | tar`.
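The installer contract above (accept either version spelling, target the one published asset, verify before extracting) can be sketched in Python as follows. This is not the contents of `scripts/install.sh`. The function names are hypothetical, and two details are assumptions: that release tags take the `vX.Y.Z` form, and that the `.sha256` file uses the `sha256sum` layout of a hex digest followed by a filename.

```python
import hashlib
import re

SEMVER_RE = re.compile(r"^v?(\d+\.\d+\.\d+)$")


def normalize_tag(version: str) -> str:
    """Accept 'X.Y.Z' or 'vX.Y.Z' and return the 'vX.Y.Z' form (assumed tag style)."""
    match = SEMVER_RE.match(version.strip())
    if match is None:
        raise ValueError(f"unsupported version string: {version!r}")
    return f"v{match.group(1)}"


def asset_name(tag: str) -> str:
    """Name of the only binary asset the release workflow publishes."""
    return f"cellstate-server-linux-amd64-{tag}.tar.gz"


def verify_sha256(payload: bytes, checksum_file_text: str) -> None:
    """Raise ValueError unless payload matches the digest in the checksum file."""
    fields = checksum_file_text.split()
    if not fields:
        raise ValueError("checksum file is empty")
    expected = fields[0].lower()  # assumed layout: "<hex digest>  <filename>"
    actual = hashlib.sha256(payload).hexdigest()
    if actual != expected:
        raise ValueError(f"checksum mismatch: expected {expected}, got {actual}")
```

The point of the ordering is that extraction happens only after `verify_sha256` returns, so a truncated or tampered download fails before anything is unpacked.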
High, Fixed: release workflow could report success while public SDK delivery failed
- Evidence: enabled `sdk-typescript`, `sdk-python`, `smoke-typescript`, and `smoke-python` jobs were marked `continue-on-error: true`.
- Risk: a green release could still ship with broken npm/PyPI publication or broken consumer import paths.
- Remediation: removed `continue-on-error` from the enabled TypeScript/Python publish and smoke jobs so the release fails closed.
High, Fixed: published cellstate-pg image was never actually smoke-tested
- Evidence: release built and published `ghcr.io/.../cellstate-pg`, but `smoke-docker` booted `pgvector/pgvector:pg18` instead of the published PG image.
- Risk: release could advertise a `cellstate-pg` artifact that had never been exercised in the release pipeline.
- Remediation: `smoke-docker` now depends on `build-pg`, pulls the published `cellstate-pg` tag, and uses it for the container smoke environment.
High, Fixed: CI did not compile or typecheck the Convex package
- Evidence: `make ci-ts` covered root Bun checks, the TypeScript SDK build, and contract tests, but not `packages/convex`.
- Evidence: once `packages/convex` was typechecked directly, `packages/convex/src/component/lib.ts` failed on unchecked `unknown[]` filtering and stringly `_id` usage.
- Remediation: `Makefile` now adds `cd packages/convex && bun run typecheck:all && bun run build` to `ci-ts`.
- Remediation: `packages/convex/src/component/lib.ts` now uses explicit document-narrowing helpers and typed stored-document IDs so the existing logic typechecks without changing behavior.
Medium, Fixed: deployment docs blurred production reality with optional templates
- Evidence: deployment docs and checklists gave Fly example configs first-class treatment even though production is on bare-metal Linode.
- Remediation: docs and release language now state Linode bare metal is the current production path and treat Fly/Railway/Helm as examples unless explicitly used.
Medium, Fixed: Helm example publishing blocked the core release path
- Evidence: the release workflow required the Helm job before creating the GitHub release, even though Helm is an example deployment surface rather than the primary production path.
- Risk: an example chart failure could block a valid bare-metal release.
- Remediation: Helm remains publishable as an example artifact, but the core `release` job no longer waits on it.
Medium, Fixed: final release job was redundantly re-uploading assets that earlier jobs already published
- Evidence: `build-binary`, `openapi`, and `docs-bundle` each uploaded their own release assets, and the final `release` job then downloaded those artifacts and uploaded them again.
- Risk: extra release coupling, wasted CI time, and more chances for asset/update races while adding no real verification.
- Remediation: the final `release` job now only creates/updates release notes. Asset-producing jobs remain responsible for uploading their own artifacts.
Medium, Fixed: TypeScript CI did not wake up for SDK-pipeline changes outside package directories
- Evidence: the `docs-guard` TypeScript change filter skipped `scripts/generate-sdk.sh` and workflow changes, so a PR could alter the TS SDK pipeline while skipping `make ci-ts`.
- Risk: package and generator regressions could survive PR CI and only show up at tag time.
- Remediation: the TypeScript change filter now includes the SDK generation script and CI/release workflow files so `make ci-ts` runs when the TS delivery path changes.
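The effect of widening the change filter can be sketched in Python. The pattern list below is hypothetical and only mirrors the surfaces named in this finding (the TypeScript package, the SDK generation script, and workflow files); the real `docs-guard` filter syntax and paths are not reproduced here.

```python
from fnmatch import fnmatchcase
from typing import Iterable, Sequence

# Hypothetical filter. Note that fnmatch's "*" also matches "/", so
# "packages/typescript/*" covers nested files as well.
TS_FILTER: Sequence[str] = (
    "packages/typescript/*",
    "scripts/generate-sdk.sh",
    ".github/workflows/*",
)


def should_run_ci_ts(changed_paths: Iterable[str],
                     patterns: Sequence[str] = TS_FILTER) -> bool:
    """Return True when any changed path touches the TS delivery path."""
    return any(
        fnmatchcase(path, pattern)
        for path in changed_paths
        for pattern in patterns
    )
```

With a filter limited to `packages/typescript/*`, a PR that only edits `scripts/generate-sdk.sh` would return `False` and skip `make ci-ts`, which is the gap this finding closes.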
Low, Fixed: extension SQL CI was waking up on unrelated script churn
- Evidence: the `docs-guard` PostgreSQL filter marked any `scripts/**` change as a PG-extension change, which triggered the heavyweight `extension-sql` job even for unrelated helper-script edits.
- Risk: wasted CI time and noisier signals without improving extension confidence.
- Remediation: the PG-extension filter now keys off the actual extension/build surfaces instead of all scripts.
High, Fixed: the real Linode deploy script could skip brand-new migrations on first deploy
- Evidence: `examples/deploy/linode/deploy.sh` originally copied repo migrations into `/opt/cellstate/migrations` only after it had already computed `MAX(version)` and iterated the files to apply.
- Risk: a release with a new SQL migration could deploy the new binary without applying the new migration until a second deploy or manual rerun.
- Remediation: the deploy script now syncs migrations from the current repo checkout before deciding what to apply, preserving forward-only/idempotent behavior on retry.
- Remediation: the same deploy script now verifies the published release tarball checksum before installation instead of piping an unchecked download straight into extraction.
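The ordering problem in the migration finding is easier to see in a small model. The Python sketch below is not the deploy script. It represents each migrations directory as a dict from version number to filename (the names are made up) and shows why syncing has to happen before the pending set is computed.

```python
from typing import Dict, List


def pending_migrations(available: Dict[int, str], applied_max: int) -> List[str]:
    """Files with a version above the highest applied one, in ascending order."""
    return [available[v] for v in sorted(available) if v > applied_max]


def plan_deploy(repo: Dict[int, str], deployed: Dict[int, str],
                applied_max: int) -> List[str]:
    """Sync repo migrations into the deployed directory, then decide what to apply."""
    deployed.update(repo)  # step 1: sync first, so new files are visible
    return pending_migrations(deployed, applied_max)  # step 2: compute pending
```

If the pending list were computed from `deployed` before the sync, a migration that exists only in the repo checkout would be missed on the first deploy. Running `plan_deploy` again after the migration is applied returns an empty list, which matches the forward-only, idempotent retry behavior described above.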
High, Fixed: @cellstate/convex was public in npm terms but missing from the release contract
- Evidence: the repo has a versioned public package at `packages/convex`, CI builds it, preflight checks its version and packability, and you intend to ship it publicly on npm.
- Evidence: `.github/workflows/release.yml` previously published/smoked `@cellstate/sdk` and `cellstate` (Python), but not `@cellstate/convex`.
- Risk: tag releases could appear green while the public Convex package lagged the tagged version, failed publication, or had broken install/import paths.
- Remediation: the release workflow now publishes `@cellstate/convex` to npm, waits for a consumer install/import smoke test, and lists it in the release notes SDK section.
- Remediation: `packages/convex/package.json` now points npm repository metadata at `packages/convex` instead of the nonexistent top-level `convex` directory.
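One way to catch a package that lags the tagged version is to compare the `version` field of its `package.json` with the release tag. The Python sketch below assumes tags of the form `vX.Y.Z`, and the helper name is hypothetical. It shows the idea only and is not the check that preflight or the release workflow actually runs.

```python
import json


def package_matches_tag(package_json_text: str, tag: str) -> bool:
    """Return True when package.json's version equals the tag without its 'v'."""
    expected = tag[1:] if tag.startswith("v") else tag
    version = json.loads(package_json_text).get("version")
    return version == expected
```

Running a check like this for every public package before publishing makes a version mismatch fail the release rather than surface after the tag is cut.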
Medium, Open: binary installer support is intentionally narrowed to Linux x86_64
- Evidence: release still publishes only a Linux `amd64` binary tarball.
- Remediation in this pass: installer now reflects reality and works for the published artifact instead of advertising unsupported targets.
- Follow-up if desired: add macOS/arm64/Windows release builds before widening installer claims again.
Medium, Validation Gap: Python package build could not be fully re-run in this sandbox
- Evidence: `python3 -m build` is installed locally, but package build failed because isolated build env setup needed to fetch `hatchling` and network access is blocked here.
- Evidence: `python3 -m build --no-isolation` also failed because `hatchling` is not installed in the local interpreter.
- Impact: Python packaging is not marked broken, but it is not fully re-verified from this environment.
Local Validation Performed
- `bash -n scripts/install.sh scripts/release/preflight.sh`
- `bash ./scripts/release/preflight.sh v0.5.5 --allow-dirty --skip-lint --skip-openapi --skip-package-dry-runs`
- `cd packages/convex && bun run typecheck:all`
- `cd packages/convex && bun run build`
- `bun run build:sdk`
- `bun test ./tests/contracts/`
- `npm pack --dry-run` in `packages/typescript`
- `npm pack --dry-run` in `packages/convex`
Required Follow-up Before Tagging
- Run the Rust/DB-backed/security/live-API CI jobs for the final release commit.
- Re-run Python package build and `twine check` in a networked CI/release environment.
- Decide explicitly which non-primary artifacts you want to keep publishing every release: Docker API image, `cellstate-pg` image, Helm chart example, docs bundle.
- Decide whether the docs bundle should remain a blocking CI artifact or stay optional as it is now in the release notes path.