Lineage Graph
Visualizing dependencies and impact
[!WARNING] Status: Planned. The lineage graph is part of the governance layer currently under development. The design below describes the planned behavior.
The Lineage Graph will map the relationships between all artifacts in the ChooChoo ecosystem. It answers the question: "If I change this, what breaks?" The graph is a core component of The Map and will power the Impact Radius calculation used in Risk Scoring.
Node Types
- product: A Data Product defined using the ODPS standard.
- contract: A Data Contract defined using the ODCS standard.
- workflow: An Arazzo workflow that orchestrates multi-step operations.
- agent: An AI Agent registered in the Agent Registry.
- human: A developer or approver tracked in the Audit Trail.
Edge Types
| Edge | Description |
|---|---|
| produces | A Product outputs data via a Contract. |
| consumes | A Product reads data from another Product. |
| depends-on | A Product relies on another Product's uptime. |
| implements | An API implements a Contract. |
| validates | A Workflow step validates against a Contract. |
These relationships are automatically extracted during choochoo validate by resolving cross-references between artifacts. For example, when a Product's outputPorts reference a Contract, ChooChoo creates a produces edge. See Validation Rules for details on the cross-reference resolution process.
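The cross-reference step can be pictured with a minimal Python sketch. This is not ChooChoo's implementation, which is still planned. The dictionary layout, the `id` and `contractId` field names, and the `extract_produces_edges` helper are assumptions made for illustration; only `outputPorts` and the `produces` edge come from the description above.

```python
from typing import NamedTuple


class Edge(NamedTuple):
    source: str
    kind: str
    target: str


def extract_produces_edges(products, contract_ids):
    """Resolve each output port's contract reference into a 'produces' edge.

    `products` is a list of dicts with hypothetical keys 'id' and
    'outputPorts'; each port carries a hypothetical 'contractId' key.
    Returns (edges, unresolved), where unresolved lists references that
    point at no known contract.
    """
    edges = []
    unresolved = []
    for product in products:
        for port in product.get("outputPorts", []):
            ref = port.get("contractId")
            if ref is None:
                continue  # port declares no contract, so no edge
            if ref in contract_ids:
                edges.append(Edge(product["id"], "produces", ref))
            else:
                unresolved.append((product["id"], ref))
    return edges, unresolved
```

Returning unresolved references separately, rather than dropping them, mirrors the idea that a dangling cross-reference is a validation problem and not a missing edge.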
Impact Analysis
Before a change is applied, ChooChoo traverses the graph downstream to calculate the Impact Radius. This value is one of five factors in the Risk Scoring algorithm.
Example:
If Product A changes its output contract, ChooChoo finds that Product B and Dashboard C consume that contract. The Impact Radius increases based on the number and criticality of these dependents. If any of the downstream artifacts contain fields with sensitive compliance tags (e.g., pii, financial), the risk score increases further.
A high Impact Radius can trigger approval workflows requiring human sign-off before the change proceeds. In CI/CD pipelines, this manifests as exit code 10 (APPROVAL_REQUIRED).
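The downstream traversal can be sketched as a breadth-first search. The sketch below is illustrative only: the adjacency-map shape, the `downstream` and `impact_radius` helpers, and the "sum of criticality weights" formula are assumptions, since the actual Impact Radius calculation is not specified here.

```python
from collections import deque


def downstream(graph, start, max_depth=None):
    """Return nodes reachable from `start`, nearest first.

    `graph` maps a node to the nodes that directly depend on it.
    `max_depth` limits how many levels are followed (None = unlimited).
    """
    seen = {start}
    order = []
    queue = deque([(start, 0)])
    while queue:
        node, depth = queue.popleft()
        if max_depth is not None and depth >= max_depth:
            continue  # do not expand beyond the requested depth
        for dependent in graph.get(node, []):
            if dependent not in seen:
                seen.add(dependent)
                order.append(dependent)
                queue.append((dependent, depth + 1))
    return order


def impact_radius(graph, start, criticality):
    """Hypothetical scoring: sum a weight per dependent (default 1)."""
    return sum(criticality.get(n, 1) for n in downstream(graph, start))


# Hypothetical graph: a contract consumed by Product B and Dashboard C.
graph = {
    "contract-a": ["product-b", "dashboard-c"],
    "product-b": ["report-d"],
}
print(downstream(graph, "contract-a", max_depth=1))
print(impact_radius(graph, "contract-a", {"dashboard-c": 3}))
```

The `seen` set keeps the traversal finite even if the graph contains a cycle that validation has not yet rejected.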
Querying the Graph
Use the choochoo lineage command to explore the graph from the command line:
# Show direct dependencies of an artifact
choochoo lineage show customer-360
# Show dependencies up to 3 levels deep
choochoo lineage show customer-360 --depth 3
# Output as JSON for scripting
choochoo lineage show customer-360 --json

Circular dependencies (A → B → A) are detected during validation and produce error E005 (Circular dependency detected).
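One common way to detect such cycles is a depth-first search that tracks the nodes on the current path. The sketch below assumes a plain adjacency map and a hypothetical `find_cycle` helper; it illustrates the technique and is not the check that ChooChoo runs.

```python
def find_cycle(graph):
    """Return one cycle as a list of nodes (first node repeated), or None.

    `graph` maps each node to the nodes it depends on.
    """
    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, on current path, finished
    color = {}
    path = []

    def visit(node):
        color[node] = GRAY
        path.append(node)
        for nxt in graph.get(node, []):
            state = color.get(nxt, WHITE)
            if state == GRAY:
                # nxt is already on the current path: cycle found
                return path[path.index(nxt):] + [nxt]
            if state == WHITE:
                cycle = visit(nxt)
                if cycle:
                    return cycle
        path.pop()
        color[node] = BLACK
        return None

    for node in list(graph):
        if color.get(node, WHITE) == WHITE:
            cycle = visit(node)
            if cycle:
                return cycle
    return None


print(find_cycle({"A": ["B"], "B": ["A"]}))
```

The recursive form is kept for brevity; a very deep graph would need an explicit stack to stay within Python's recursion limit.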
Graph in The Station
The Station provides an interactive visualization of the Lineage Graph, allowing GRC teams to:
- Explore entity relationships visually
- Click through to Audit Trail entries for any node
- View Risk Heatmaps overlaid on the graph
- Identify high-impact artifacts that affect many downstream consumers
- Filter by compliance tags to focus on regulated data flows
Building the Graph
The graph is built incrementally as artifacts are validated and Decision Traces are recorded. Each trace links an Agent or human actor to the artifacts they modified, creating actor-to-artifact edges alongside the artifact-to-artifact relationships defined in the specs.
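As a rough picture of incremental construction, the sketch below keeps one edge store and adds to it from two sources: artifact-to-artifact edges and actor-to-artifact edges taken from a trace. The `LineageGraph` class, the trace shape (`actor`, `artifacts`), and the `modified` edge name are all hypothetical; the Decision Trace format is not defined in this section.

```python
from collections import defaultdict


class LineageGraph:
    """Minimal in-memory edge store: source -> {(kind, target)}."""

    def __init__(self):
        self.edges = defaultdict(set)

    def add_edge(self, source, kind, target):
        # Sets make repeated additions idempotent, so re-validating an
        # artifact or replaying a trace does not duplicate edges.
        self.edges[source].add((kind, target))

    def record_trace(self, trace):
        """Link the trace's actor to every artifact it modified.

        `trace` is a dict with hypothetical keys 'actor' and 'artifacts'.
        """
        for artifact in trace["artifacts"]:
            self.add_edge(trace["actor"], "modified", artifact)


graph = LineageGraph()
graph.add_edge("customer-360", "produces", "customer-contract")
graph.record_trace({"actor": "agent:schema-bot", "artifacts": ["customer-contract"]})
```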
For the graph to be accurate, your project structure must follow the standard layout and all cross-references between artifacts must resolve correctly. See File Structure for the expected directory layout and naming conventions.
Related
Risk Scoring
How the Impact Radius from the Lineage Graph feeds into risk calculations.
Products (ODPS)
Define data products with explicit input and output ports that create graph edges.
Contracts (ODCS)
Contracts are the binding agreements that connect products in the graph.
The Station
Explore the Lineage Graph interactively in the enterprise Governance UI.
Audit Trail
Every graph traversal and impact analysis is recorded for auditability.
Architecture
See how The Map (Lineage Graph) fits into the overall ChooChoo platform.