Data Asset Inventory & Lineage
Know what data you have, who owns it, and where it came from.
Scans your data estate — warehouses, lakes, APIs, models, dashboards — and maintains a live catalogue: schema, ownership, lineage, freshness, quality score, usage. When a table breaks, you know who owns it. When compliance asks where customer data flows, you trace lineage in minutes. Without it, this knowledge lives in heads and Slack threads.
Shape
Operational dimensions
A person oversees the process and intervenes only by exception.
Runs on a schedule.
Owns a system of record, so it is expensive to migrate.
Consumes external data; does not write back.
Inputs
- connectors to source systems (warehouse, lake, APIs, BI tools)
- lineage signals (query logs, ETL job metadata, dbt manifests)
- steward annotations (owner, sensitivity classification, quality notes)
- schema change events
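To illustrate how a lineage signal might become lineage edges, here is a minimal sketch that reads a dictionary shaped like a dbt manifest, where each node lists its upstream nodes under `depends_on.nodes`. The function name `lineage_edges` and the sample manifest are hypothetical; a real manifest carries many more fields, and query logs or ETL metadata would need their own parsers.

```python
from typing import Dict, List, Tuple


def lineage_edges(manifest: Dict) -> List[Tuple[str, str]]:
    """Return sorted (upstream, downstream) pairs from a dbt-manifest-shaped dict."""
    edges = []
    for node_id, node in manifest.get("nodes", {}).items():
        # Each node names the nodes it reads from; one edge per upstream.
        for upstream in node.get("depends_on", {}).get("nodes", []):
            edges.append((upstream, node_id))
    return sorted(edges)


# Hypothetical sample data, for illustration only.
sample_manifest = {
    "nodes": {
        "model.shop.stg_orders": {
            "depends_on": {"nodes": ["source.shop.raw.orders"]}
        },
        "model.shop.stg_customers": {"depends_on": {"nodes": []}},
        "model.shop.fct_revenue": {
            "depends_on": {
                "nodes": ["model.shop.stg_orders", "model.shop.stg_customers"]
            }
        },
    }
}

edges = lineage_edges(sample_manifest)
for upstream, downstream in edges:
    print(f"{upstream} -> {downstream}")
```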
Outputs
- asset catalogue (dataset / table / column records + metadata + lineage edges)
- freshness and quality state per asset
- stewardship view (owner assignments, classification, deprecation status)
- searchable data catalogue interface
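One possible shape for a catalogue record and its freshness state is sketched below. The field names, the `AssetRecord` class, and the `freshness_state` helper are assumptions made for illustration, not the product's actual schema; the freshness rule (compare the last update time against a maximum age) is only one plausible policy.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone
from typing import Dict, List, Optional


@dataclass
class AssetRecord:
    """Hypothetical catalogue record: metadata plus upstream lineage edges."""
    name: str
    kind: str                                    # e.g. "dataset", "table", "column"
    schema: Dict[str, str] = field(default_factory=dict)
    owner: Optional[str] = None                  # steward annotation
    classification: Optional[str] = None         # sensitivity classification
    deprecated: bool = False
    last_updated: Optional[datetime] = None
    upstream: List[str] = field(default_factory=list)


def freshness_state(asset: AssetRecord, now: datetime, max_age: timedelta) -> str:
    """Classify an asset as 'fresh', 'stale', or 'unknown' (assumed policy)."""
    if asset.last_updated is None:
        return "unknown"
    return "fresh" if now - asset.last_updated <= max_age else "stale"


orders = AssetRecord(
    name="warehouse.orders",
    kind="table",
    schema={"id": "int", "total": "decimal"},
    owner="sales-data",
    classification="internal",
    last_updated=datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc),
)
now = datetime(2024, 1, 2, 0, 0, tzinfo=timezone.utc)
print(freshness_state(orders, now, timedelta(hours=24)))
```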
Mechanism
Maintains a catalogue of data assets (datasets, tables, columns, files, models, dashboards) with their metadata — schema, ownership, lineage, freshness, quality, usage.
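Once lineage edges are in the catalogue, a question such as "where does customer data flow?" reduces to a graph walk. The sketch below is a breadth-first traversal over `(upstream, downstream)` pairs; the function name `downstream_assets` and the example asset names are hypothetical.

```python
from collections import deque
from typing import Dict, Iterable, List, Set, Tuple


def downstream_assets(edges: Iterable[Tuple[str, str]], start: str) -> List[str]:
    """Return every asset reachable downstream of `start`, sorted by name."""
    children: Dict[str, List[str]] = {}
    for upstream, downstream in edges:
        children.setdefault(upstream, []).append(downstream)

    seen: Set[str] = set()
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for child in children.get(current, []):
            # Skip already-visited assets so cycles cannot loop forever.
            if child not in seen and child != start:
                seen.add(child)
                queue.append(child)
    return sorted(seen)


# Hypothetical lineage edges, for illustration only.
example_edges = [
    ("crm.customers", "warehouse.dim_customer"),
    ("warehouse.dim_customer", "warehouse.fct_orders"),
    ("warehouse.fct_orders", "dashboard.revenue"),
    ("erp.products", "warehouse.dim_product"),
]

reached = downstream_assets(example_edges, "crm.customers")
print(reached)
```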
Why this is a primitive
Kept separate from graph-instantiation despite the structural overlap: in principle, inventory is graph-instantiation over an asset meta-schema. The difference is that this operation is dominated by a recurring asset lifecycle: scan source systems, detect schema and lineage automatically, attach steward metadata, surface freshness and quality signals, deprecate. That scan-and-maintain loop is what is load-bearing, not arbitrary graph traversal, and treating inventory as a special case of graph-instantiation would lose the inventory-as-operational-discipline framing. CHALLENGE FLAG: it is defensible to delete this primitive and re-express data-catalogue compositions as `graph-instantiation + vocabulary-authoring`. It is kept because the metadata-lifecycle operation is reused by enough compositions (data catalogue, MDM steward views, model registry) that it earns its place.
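The scan-and-maintain loop can be sketched as a single cycle that merges a fresh scan into the existing catalogue. This is a minimal illustration under stated assumptions: the `scan_cycle` function, the dictionary layout, and the event names are hypothetical, and flagging unseen assets as `missing` for steward review (rather than deprecating them automatically) is an assumed policy.

```python
from typing import Dict, List, Tuple

Schema = Dict[str, str]


def scan_cycle(
    catalogue: Dict[str, dict], scanned: Dict[str, Schema]
) -> List[Tuple[str, str]]:
    """Merge one scan into the catalogue in place; return (asset, event) pairs."""
    events: List[Tuple[str, str]] = []
    for name, schema in scanned.items():
        entry = catalogue.get(name)
        if entry is None:
            # New asset: record its schema; ownership awaits a steward.
            catalogue[name] = {"schema": dict(schema), "owner": None, "missing": False}
            events.append((name, "discovered"))
        else:
            entry["missing"] = False
            if entry["schema"] != schema:
                # Update the schema but keep steward annotations such as owner.
                entry["schema"] = dict(schema)
                events.append((name, "schema_changed"))
    for name, entry in catalogue.items():
        if name not in scanned and not entry["missing"]:
            # Not seen in this scan: flag as a deprecation candidate.
            entry["missing"] = True
            events.append((name, "missing"))
    return events


# Hypothetical catalogue state and scan result, for illustration only.
catalogue = {
    "orders": {"schema": {"id": "int"}, "owner": "sales-data", "missing": False},
    "legacy_export": {"schema": {"id": "int"}, "owner": None, "missing": False},
}
scanned = {
    "orders": {"id": "int", "total": "decimal"},
    "customers": {"id": "int"},
}

events = scan_cycle(catalogue, scanned)
for asset, event in events:
    print(asset, event)
```

Keeping steward annotations across schema updates is the design choice that matters here: the automated scan owns the technical metadata, while ownership and classification change only when a person intervenes.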
Where it shows up
Related primitives
Tags