AI agents are increasingly capable — and increasingly unsupervised. ChooChoo's governance layer gives you visibility into what your agents are doing, control over what they're allowed to do, and accountability for what they've done.
The governance model rests on three pillars:
- Visibility — Every agent action is traced, timestamped, and appended to an immutable audit log. Nothing happens silently.
- Control — Agents declare their boundaries in AGENTS.md. The Engine enforces them at validation time — before changes reach production.
- Accountability — Every change is attributed to a specific agent, conversation, and model. Risk-scored. Approved or blocked. Auditable.
Preview: Governance features are not yet available in the public CLI. They will ship as part of The Station (evaluation dashboard). Check the Roadmap for current status.
Today: You can enforce agent boundaries (AGENTS.md + choochoo validate), record every AI-assisted change as a trace, and use risk-scored output in CI. Full governance features (approval workflows, audit trail, The Station) ship with the Phase 2 release.
choochoo.toml — The Policy Source of Truth
All governance configuration lives in choochoo.toml. This is what makes ChooChoo Governance-as-Code: policies are versioned in your repository, reviewed like source code, and enforced automatically — no manual configuration drift, no out-of-band rules.
[governance]
# Which agent actions require human review
require_review = ["contracts/**", "auth/**", "pii-tagged-schemas/**"]
[governance.risk]
auto_approve_threshold = 3.0
require_approval_above = 7.0
[governance.approval]
notify_channel = "slack:#ai-governance"
blocking = true
timeout_minutes = 60
The [governance] block drives risk scoring, approval workflows, and audit configuration. Agent role definitions — what specific agents are allowed to do — are declared here and flow downstream to AGENTS.md context generation.
Audit Trail
Every agent action — file edits, schema validations, context compilations — is recorded in an immutable audit log. The audit trail links each action to the agent that performed it, the conversation that motivated it, and the trace that captured it.
Trace records are content-addressed and append-only: each record is hashed, and new records reference the hash of the previous one, making the log tamper-evident without an external notary.
A trace record in the audit log looks like this:
{
"trace_id": "tr_01HX4K2NQ7BVMZ3FP8CDYE6R9W",
"agent_id": "code-review-agent",
"action": "file.edit",
"artifact": "src/auth/session.py",
"conversation_id": "conv_01HX4K2NQ7BVMZ3FP8CDYE6R9V",
"model": "claude-sonnet-4-6",
"timestamp": "2025-06-14T09:12:33Z",
"prev_hash": "sha256:3b4c1d…",
"hash": "sha256:9f2a7e…"
}
All trace data flows from Agent Trace.
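The tamper-evidence property described above can be sketched in a few lines of Python. This is an illustrative model, not ChooChoo's actual implementation; the field names follow the example record, and the toy records are invented.

```python
import hashlib
import json

def record_hash(record: dict) -> str:
    """Hash the record contents, excluding the 'hash' field itself."""
    body = {k: v for k, v in record.items() if k != "hash"}
    digest = hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()
    return f"sha256:{digest}"

def verify_chain(records: list[dict]) -> bool:
    """Walk the log: each record must hash correctly and link to its predecessor."""
    prev = None
    for rec in records:
        if rec["hash"] != record_hash(rec):
            return False  # record contents were altered after the fact
        if prev is not None and rec["prev_hash"] != prev["hash"]:
            return False  # chain link broken
        prev = rec
    return True

# Two-record toy log
r1 = {"trace_id": "tr_1", "action": "file.edit", "prev_hash": None}
r1["hash"] = record_hash(r1)
r2 = {"trace_id": "tr_2", "action": "file.edit", "prev_hash": r1["hash"]}
r2["hash"] = record_hash(r2)

assert verify_chain([r1, r2])
r1["action"] = "file.delete"        # tamper with history
assert not verify_chain([r1, r2])   # the altered record no longer matches its hash
```

Because each record embeds the previous record's hash, rewriting any entry invalidates every entry after it — that is what makes the log tamper-evident without an external notary.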
Risk Scoring
ChooChoo calculates a risk score (0.0–10.0) for every proposed change. The scoring formula is:
risk = (1 - confidence) × scope_factor × history_penalty
Where:
- confidence — the metadata.confidence field from the trace (0.0–1.0)
- scope_factor — a multiplier based on how many files and schema types are affected (1.0–3.0)
- history_penalty — increases if the agent has a poor track record on similar change types (1.0–2.0)
A change with confidence 0.6, touching 4 files across 2 schema types (scope 1.8), from an agent with a clean history (penalty 1.0) scores: (1 - 0.6) × 1.8 × 1.0 = 0.72 — low risk, auto-approved.
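The formula translates directly into code. A minimal sketch — the function name and the range checks are ours, not part of the ChooChoo API:

```python
def risk_score(confidence: float, scope_factor: float, history_penalty: float) -> float:
    """risk = (1 - confidence) x scope_factor x history_penalty

    confidence: 0.0-1.0, scope_factor: 1.0-3.0, history_penalty: 1.0-2.0
    """
    assert 0.0 <= confidence <= 1.0
    assert 1.0 <= scope_factor <= 3.0
    assert 1.0 <= history_penalty <= 2.0
    return (1 - confidence) * scope_factor * history_penalty

# The worked example from the text: confidence 0.6, scope 1.8, clean history
print(round(risk_score(0.6, 1.8, 1.0), 2))  # → 0.72
```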
Configure scoring thresholds in choochoo.toml:
[governance.risk]
auto_approve_threshold = 3.0
require_approval_above = 7.0
Changes below auto_approve_threshold are merged automatically. Changes above require_approval_above are blocked until a human approves.
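A sketch of how those two thresholds might map to outcomes. Note that the behavior for scores between the thresholds is not specified above; routing them to a non-blocking review is our assumption, as are the function and outcome names.

```python
AUTO_APPROVE_THRESHOLD = 3.0   # from [governance.risk]
REQUIRE_APPROVAL_ABOVE = 7.0

def decide(score: float) -> str:
    """Map a risk score to a governance outcome (illustrative only)."""
    if score < AUTO_APPROVE_THRESHOLD:
        return "auto-approve"
    if score > REQUIRE_APPROVAL_ABOVE:
        return "block-until-approved"
    # Between the thresholds: assumed to surface for review without blocking
    return "flag-for-review"

print(decide(0.72))  # → auto-approve
print(decide(8.5))   # → block-until-approved
```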
Approval Workflows
Policy gates that require human sign-off before a change lands. Approval policies are defined in choochoo.toml and triggered when risk scores exceed a threshold or when changes touch sensitive schema fields.
Example policy config:
[governance.approval]
notify_channel = "slack:#ai-governance"
blocking = true # CI gate holds until approved
timeout_minutes = 60 # auto-reject if no response
ChooChoo supports two operating modes:
- Async (notify) — Slack or email notification is sent; the pipeline continues. The approval is recorded post-hoc for audit purposes.
- Blocking (CI gate) — The pipeline pauses at the approval gate. The process exits with exit code 10 until a human approves via The Station or CLI. CI systems should treat exit code 10 as "pending", not "failed".
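A CI step can honor the "pending, not failed" convention by polling rather than aborting. This is a hypothetical wrapper: the choochoo validate invocation and the polling cadence are assumptions, not documented ChooChoo behavior.

```python
import subprocess
import sys
import time

PENDING = 10  # blocking approval gate: pending, not failed

def run_gate(cmd: list[str], poll_seconds: float = 60, max_polls: int = 60) -> int:
    """Re-run the gate command until the approval resolves (hypothetical wrapper)."""
    for _ in range(max_polls):
        result = subprocess.run(cmd)
        if result.returncode != PENDING:
            # 0 = approved; any other non-10 code = rejected or failed
            return result.returncode
        time.sleep(poll_seconds)  # still pending: wait, then re-check
    return 1  # treat an exhausted polling window as failure

# Usage in a CI step (assumed command):
# sys.exit(run_gate(["choochoo", "validate"]))
```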
Lineage Graph
A queryable graph of all artifacts, agents, and decisions linked across time. The graph nodes are artifacts (files, schemas, reports) and agents; edges represent relationships:
- produced_by — this artifact was created or modified by this agent action
- depends_on — this artifact references or imports another
- validated_by — this artifact passed a schema validation step
Query the graph from the CLI:
$ choochoo lineage show src/auth/session.py
src/auth/session.py
├── produced_by: code-review-agent (tr_01HX4K2N…) — 2025-06-14
├── depends_on:
│ ├── src/auth/models.py
│ └── src/config/settings.py
└── validated_by: schema-validator (ODCS v2.1) — 2025-06-14
Impact analysis works in reverse: choochoo lineage impact src/auth/models.py shows all downstream artifacts that would be affected by a change.
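Conceptually, impact analysis inverts the depends_on edges and walks them transitively. A toy model of that traversal — the paths reuse the example above, the graph data is invented, and this is not the engine's actual representation:

```python
from collections import defaultdict

# depends_on edges: artifact -> the artifacts it depends on
deps = {
    "src/auth/session.py": ["src/auth/models.py", "src/config/settings.py"],
    "src/api/routes.py": ["src/auth/session.py"],
}

# Invert the edges: artifact -> the artifacts that depend on it
dependents = defaultdict(list)
for src, targets in deps.items():
    for t in targets:
        dependents[t].append(src)

def impact(artifact: str) -> set[str]:
    """All downstream artifacts transitively affected by a change to `artifact`."""
    seen, stack = set(), [artifact]
    while stack:
        for d in dependents[stack.pop()]:
            if d not in seen:
                seen.add(d)
                stack.append(d)
    return seen

print(sorted(impact("src/auth/models.py")))
# → ['src/api/routes.py', 'src/auth/session.py']
```

Changing src/auth/models.py ripples to session.py directly and to routes.py transitively — exactly the set choochoo lineage impact would surface.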
Compliance Reporting
Generate proof-of-compliance artifacts for auditors. Supported frameworks:
- EU AI Act — transparency, traceability, and human oversight requirements
- GDPR — data subject rights, PII handling, and processing records
- SOC 2 — availability, integrity, and confidentiality controls
- ISO 27001 — information security management evidence
Generate a report from the CLI:
$ choochoo report generate --framework eu-ai-act --from 2025-01-01 --to 2025-06-30
The output is an evidence package containing:
- Agent System Card inventory (all registered agents and their declared boundaries)
- Filtered audit log (agent actions within the reporting period)
- Risk score distribution (histogram + flagged high-risk changes)
- Approval workflow log (all approval requests, outcomes, and timestamps)
- Lineage graph snapshot (artifact provenance for the period)
Compliance reports are filtered by the compliance frameworks declared in each agent's System Card.
The Station
The Station is the governance and evaluation web UI.
Available in preview:
- Agent activity audit search (filter by agent, action type, time range)
- Risk score heatmap across recent changes
- Lineage graph visualization (interactive artifact + agent graph)
- Benchmark quality scores from trace evaluation runs
Planned (not yet released):
- Approval workflow inbox (approve/reject from the UI)
- Compliance report generation UI
- RBAC — role-based access control for The Station itself
- SSO — enterprise identity provider integration
- Context Graph — full cross-agent context visualization
Follow the Roadmap for launch updates.
How Governance Connects to Traces
Governance is not a separate layer — it is a downstream consumer of the trace data produced by every agent action. The data flow is:
flowchart LR
A[Agent Action] --> B[Trace Emitted]
B --> C[Risk Scored]
C --> D{Score > threshold?}
D -- No --> E[Auto-approved]
D -- Yes --> F[Approval Gate]
F --> G[Human Review]
G --> H[Approved / Rejected]
E --> I[Audit Log Appended]
H --> I
Each step is traceable: the trace record references the approval decision, and the approval decision references the trace. The audit log is the terminal sink — everything lands there, whether auto-approved or human-reviewed.
Related
- Agents — Declare agent capabilities and boundaries.
- Agent Trace — Traces feed the audit trail and risk scoring.
- Fleet Visibility (Preview) — Observability across all AI agents in your organization.
- Roadmap — Timeline for governance feature releases.