Derive the cleaning and normalisation spec from the raw data
Packaged know-how that tells an agent how to do a job well.
You might say…
“I applied cleaning rules in my head as I went and couldn't reproduce the dataset a week later — writing the spec first costs ten minutes and saves hours.”
What it does
Inspects raw client inputs and decides the full set of cleaning rules (dedup keys, format standardisations, unit conversions, cross-source reconciliation logic), producing an explicit, reviewable spec.
Trigger: Use after the inputs pass the completeness gate; provide the raw data overview and the methodology's working-data requirements.
I/O: Raw data overview + methodology requirements → cleaning spec (dedup keys, format rules, unit conversions, integrity checks)
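As an illustration only, here is a minimal sketch of how such a spec could be written down as data rather than held in someone's head. It assumes Python and rows stored as plain dictionaries. The names CleaningSpec, apply_spec and EXAMPLE_SPEC, and the columns customer_id and revenue, are hypothetical and not part of the skill itself.

```python
from dataclasses import dataclass, field
from typing import Any, Callable

Row = dict[str, Any]


@dataclass(frozen=True)
class CleaningSpec:
    """Hypothetical container for the four parts of a cleaning spec."""
    dedup_keys: tuple[str, ...]
    # column name -> function that standardises one value
    format_rules: dict[str, Callable[[Any], Any]] = field(default_factory=dict)
    # column name -> factor that converts the source unit to the working unit
    unit_conversions: dict[str, float] = field(default_factory=dict)
    # check name -> predicate that must hold for every cleaned row
    integrity_checks: dict[str, Callable[[Row], bool]] = field(default_factory=dict)


def apply_spec(rows: list[Row], spec: CleaningSpec) -> tuple[list[Row], list[str]]:
    """Apply the spec to copies of the rows; return cleaned rows and check failures."""
    cleaned: list[Row] = []
    failures: list[str] = []
    seen: set[tuple[Any, ...]] = set()
    for index, raw in enumerate(rows):
        row = dict(raw)  # copy, so the raw input is never mutated
        for column, rule in spec.format_rules.items():
            if column in row:
                row[column] = rule(row[column])
        for column, factor in spec.unit_conversions.items():
            if column in row:
                row[column] = float(row[column]) * factor
        # Deduplicate on the standardised values, keeping the first occurrence.
        key = tuple(row.get(k) for k in spec.dedup_keys)
        if key in seen:
            continue
        seen.add(key)
        for name, check in spec.integrity_checks.items():
            if not check(row):
                failures.append(f"row {index}: {name}")
        cleaned.append(row)
    return cleaned, failures


# Example spec; every rule here is an assumption made for illustration.
EXAMPLE_SPEC = CleaningSpec(
    dedup_keys=("customer_id",),
    format_rules={"customer_id": lambda v: str(v).strip().upper()},
    unit_conversions={"revenue": 1000.0},  # assumes the source reports thousands
    integrity_checks={"revenue is non-negative": lambda r: r["revenue"] >= 0},
)
```

Because the rules live in one reviewable object, rerunning apply_spec on the same raw rows a week later should reproduce the same working dataset, and a reviewer can challenge a single rule without reading the transformation code.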
Recognise the problem?
The primitives are the commodity part. The fastest next step is a conversation about composing them into something that works for you.