How spec-driven development functions as a control surface for governed engineering - and the tooling that keeps it honest.
Spec discipline is a control surface. Not paperwork, not process theater - a mechanism that keeps the distance between declared intent and actual system state measurably small. In regulated work, that distance is not an abstraction. It is the difference between an audit that passes on its own evidence and one that requires reconstruction.
This post describes how Rethunk.Tech uses Spec-Driven Development (SDD) as an engineering discipline, the state machine that governs it, and the tooling built to keep the system honest when agents are doing the writing.
Every engineering team has some notion of a spec. The question is not whether specs exist - it is whether they stay current, whether they are the single authoritative source for a piece of work, and whether the transition from "spec approved" to "work done" is tracked in a way that does not require human memory to reconstruct.
In healthcare, financial services, defense, and critical infrastructure, that question has direct regulatory weight. When an examiner asks what was planned, what was built, and who made the decision to proceed, the answer has to come from records - not from whoever was in the room that day. Specs that drift from reality, or that exist as one of several competing copies, cannot answer that question.
SDD as Rethunk.Tech practices it treats the spec as the governing artifact of a unit of work. One spec. One authoritative status. One place where the decision to proceed is recorded. The audit trail follows from that structure.
The lifecycle of a spec follows a defined state machine:
DRAFT - the spec exists and is being written. Work has not started.
APPROVED - the spec has been reviewed and ratified. Work may begin. This transition is gated; an agent cannot claim work against a spec that has not been approved.
IN_PROGRESS - work is actively underway. The spec is owned by a named agent or team member. Tasks are being checked off against a per-spec tasks file.
BLOCKED or PARKED - work has stopped. BLOCKED means an external dependency is preventing progress. PARKED means the spec is valid but deprioritized. Both states are reversible. Neither is a silent drop.
DONE - the spec is closed. The directory moves from specs/active/ to specs/done/. The index is updated. No further edits are expected.
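The states above can be sketched as a small transition table. This is an illustrative model, not citadel-sdd's implementation: the `Status` enum, the `ALLOWED` map, and the `transition` function are hypothetical names, and the sketch assumes that BLOCKED and PARKED both return to IN_PROGRESS when reversed. Reopening a DONE spec is left out.

```python
from enum import Enum


class Status(Enum):
    DRAFT = "DRAFT"
    APPROVED = "APPROVED"
    IN_PROGRESS = "IN_PROGRESS"
    BLOCKED = "BLOCKED"
    PARKED = "PARKED"
    DONE = "DONE"


# Hypothetical transition table derived from the lifecycle described above.
# Assumption: BLOCKED and PARKED resume into IN_PROGRESS.
ALLOWED = {
    Status.DRAFT: {Status.APPROVED},
    Status.APPROVED: {Status.IN_PROGRESS},
    Status.IN_PROGRESS: {Status.BLOCKED, Status.PARKED, Status.DONE},
    Status.BLOCKED: {Status.IN_PROGRESS},
    Status.PARKED: {Status.IN_PROGRESS},
    Status.DONE: set(),
}


def transition(current: Status, target: Status) -> Status:
    """Return the new status, or raise if the move is not in the table."""
    if target not in ALLOWED[current]:
        raise ValueError(f"illegal transition: {current.value} -> {target.value}")
    return target
```

Keeping the table as data rather than scattered conditionals means a skipped state, such as DRAFT straight to IN_PROGRESS, fails loudly instead of passing silently.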
The canon-in-one-place rule governs the whole machine: there is exactly one spec file per unit of work, and its status header is the single authoritative record of where that work stands. No parallel tracking in a project management tool, no status row in a wiki that may or may not match the spec header. If the spec says IN_PROGRESS, the work is in progress. If the spec says DONE, the work is done - and the file is in specs/done/ to prove it.
Per-spec atomicity applies at two critical transitions. First, ratify before claim: a spec must reach APPROVED before any agent may claim it and start work. This prevents agents from beginning work against an unreviewed intent, which is the SDD equivalent of executing without authority. Second, close before merge: a spec must reach DONE before the corresponding branch is merged. The spec and the code artifact are coupled. One cannot outrun the other.
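The two gates can be expressed as checks against the spec's status header. The sketch below assumes the header is a line of the form `Status: <STATE>`; that format, and the function names `read_status`, `may_claim`, and `may_merge`, are assumptions made for illustration rather than details taken from citadel-sdd.

```python
import re

# Assumed header format: a line such as "Status: APPROVED" somewhere in spec.md.
STATUS_RE = re.compile(r"^Status:[ \t]*([A-Z_]+)[ \t]*$", re.MULTILINE)


def read_status(spec_text: str) -> str:
    """Extract the status value from the spec text, or raise if it is absent."""
    match = STATUS_RE.search(spec_text)
    if match is None:
        raise ValueError("spec has no status header")
    return match.group(1)


def may_claim(spec_text: str) -> bool:
    """Ratify before claim: only an APPROVED spec may be claimed."""
    return read_status(spec_text) == "APPROVED"


def may_merge(spec_text: str) -> bool:
    """Close before merge: only a DONE spec allows its branch to merge."""
    return read_status(spec_text) == "DONE"
```

A check like `may_merge` could run as a pre-merge step, so that a branch whose spec is still IN_PROGRESS is rejected before it lands.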
Without tooling that enforces atomicity, closing a spec manually is a sequence of roughly fifteen discrete edits:
spec.md - status header changed from IN_PROGRESS to DONE
tasks.md - final task checkboxes
specs/README.md or specs/index.md - row moved from the active table to the done table with a shipped date
git mv specs/active/<slug>/ specs/done/<slug>/ - file relocation
Each step is correct when taken in isolation. The problem is the sequence. Over a working day, under normal interruptions, with multiple specs in flight, those fifteen steps are performed across multiple terminal sessions, multiple files, and sometimes multiple commits. Some get missed. The status header says DONE but the index still shows the spec in the active table. The directory moved but the index row was not updated. The tasks file has unchecked boxes that were actually completed.
The result is drift: the spec directory and its associated metadata represent different truths about the same unit of work. For a team of one or two, drift is recoverable - a human can scan the active directory and notice that the index is out of date. For an agent-assisted team where agents read the index to determine what to work on next, drift is a correctness problem. An agent may claim work against a spec that is already done, or skip a spec that the index lists as done but that is in fact still open. Because agents execute faster than humans can review, correction becomes reactive rather than proactive, and the drift compounds before it is caught.
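Drift of this kind can be detected by comparing the three places a spec's state is recorded. The following is a minimal sketch, assuming every spec that is not DONE lives under specs/active/ and is listed in the active table; `SpecView` and `find_drift` are hypothetical names, and the caller is assumed to have already read the header, directory, and index row.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpecView:
    slug: str
    header_status: str  # value of the status header in spec.md
    directory: str      # "active" or "done": where the spec directory sits
    index_table: str    # "active" or "done": which index table lists it


def find_drift(view: SpecView) -> list[str]:
    """Return one message per record that disagrees with the status header."""
    # Assumption: the header is canonical; DONE implies done/, anything else active/.
    expected = "done" if view.header_status == "DONE" else "active"
    problems = []
    if view.directory != expected:
        problems.append(
            f"{view.slug}: header is {view.header_status} "
            f"but directory is specs/{view.directory}/"
        )
    if view.index_table != expected:
        problems.append(
            f"{view.slug}: header is {view.header_status} "
            f"but index lists it in the {view.index_table} table"
        )
    return problems
```

For the case described above, where the header says DONE and the directory has moved but the index row has not, `find_drift` returns a single message naming the index.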
The fifteen-edit problem compounds as agent throughput scales. An agent that can close three specs in an hour can produce three sets of drift artifacts in an hour if the tooling does not enforce atomicity.
citadel-sdd is an MIT-licensed Model Context Protocol (MCP) server that implements the SDD lifecycle as a set of atomic tool calls. One spec_close call replaces all fifteen hand-edits. One spec_claim call handles the APPROVED to IN_PROGRESS transition, optional ratification, and commit - as a single operation that either fully succeeds or rolls back to the prior state.
The tool roster spans twenty tools: nineteen MCP tools plus one diagnostic. The tools cover every lifecycle event from DRAFT through DONE. Read tools handle listing, reading, status queries, and lint. Write composite tools handle the multi-step transitions (spec_claim, spec_close, spec_park, spec_block, spec_reopen, spec_unblock, spec_unpark). Write atomic tools handle narrower operations: ratifying a spec's Q-table, toggling individual task checkboxes, appending tasks, or reassigning ownership without a state transition. Infrastructure tools handle index rebuilds and fresh-repo bootstrapping.
The server ships with three profiles that follow an inheritance chain: default, bastion, and citadel. Each profile extends the one before it with additional tooling appropriate to its context. A team working in a plain engineering repo uses the default profile. Bastion and Citadel operators use the profiles matched to their environment. Profile selection is configuration, not a fork.
Drift-proof invariants are enforced on every write: the spec's status header, its physical location in the directory tree, and its row in the index must be consistent after every tool call. If any write fails mid-operation, the tool restores the prior state. The index is never partially updated.
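The restore-on-failure behavior can be illustrated with a snapshot-and-rollback wrapper around a set of file writes. This is a simplified sketch and not how citadel-sdd is known to implement it: `apply_atomically` is a hypothetical helper, it handles only file contents (not directory moves or commits), and it does not protect against a process crash in the middle of the operation.

```python
from pathlib import Path


def apply_atomically(edits: dict[Path, str]) -> None:
    """Write every edit, or restore all touched files if any write fails."""
    # Snapshot the prior contents; None marks a file that did not exist.
    snapshots = {
        path: (path.read_text() if path.exists() else None) for path in edits
    }
    attempted = []
    try:
        for path, new_text in edits.items():
            attempted.append(path)
            path.write_text(new_text)
    except Exception:
        # Roll back in reverse order so the tree returns to its prior state.
        for path in reversed(attempted):
            old_text = snapshots[path]
            if old_text is None:
                path.unlink(missing_ok=True)
            else:
                path.write_text(old_text)
        raise
```

The point of the sketch is the contract rather than the mechanism: a caller sees either every record updated or none, so the status header and the index cannot end up describing different states.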
Citadel is one consumer of SDD discipline. The forge runs citadel-sdd against its own spec tree using the citadel profile, which carries the full tool set including additional invariants specific to Citadel's agent-identity and namespace-graph structure.
The discipline itself is Citadel-independent. citadel-sdd is a standalone MCP server that works in any repository with a specs/ directory tree following the expected layout. Nothing in the server assumes Citadel infrastructure, Citadel agent identity, or any Citadel-specific data model. The sdd_doctor tool inspects an existing repo, infers the best-match profile, and flags drift - it can onboard any repo that already has specs in flight, not just repos initialized from scratch.
Citadel core is proprietary software. citadel-sdd is MIT-licensed and publicly maintained at github.com/Rethunk-AI/citadel-sdd. The two are separable: adopting citadel-sdd does not require Citadel, and Citadel's governance properties do not depend exclusively on citadel-sdd. The SDD discipline is a layer that can be applied independently.
The reason Citadel uses it internally is the same reason it would apply in any agent-assisted repo: as agent throughput increases, the probability that any manual multi-step process will be executed correctly on every invocation approaches zero. Atomic tooling is not a convenience - it is a precondition for consistent behavior at agent scale.
Teams doing regulated work that have moved to agent-assisted development face a specific problem: the agents are fast, the specs are slow, and the gap between them is where correctness breaks down. SDD with atomic tooling closes that gap by making the spec lifecycle a first-class artifact that agents can update reliably, not an organizational convention that agents ignore because updating it manually is error-prone.
If your team is building in a context where the audit trail of what was planned and what was built is not optional - healthcare, financial services, federal contracting, critical infrastructure - SDD provides the structure. citadel-sdd provides the tooling to keep that structure honest under agent load. The discipline is not specific to any product or platform. It is a process invariant: ratify before claiming work, close before merging, and keep exactly one copy of the truth about where a unit of work stands.
The citadel-sdd repository includes installation documentation in HUMANS.md and docs/install.md. Agent integration documentation is in AGENTS.md. The server runs locally with no telemetry and no remote API calls.
Rethunk.Tech's open-source work, including citadel-sdd, is listed at /open-source. The Citadel product is described at /citadel.