Notice what worked

Controlled Experimentation

Turn 'it seemed to work' into a causal claim.

Designs treatment-and-control experiments — A/B, multivariate, holdout, switchback — assigns subjects under a proper randomisation scheme, and delivers a statistically grounded ship/kill verdict. The output isn't a metric trend; it's a causal claim with a confidence interval you can defend to a sceptical stakeholder.

Shape

Population → randomise → Control · Treatment A · Treatment B → Causal effect + CI → ship · kill → next hypothesis

Operational dimensions

Requires approval

Each output waits on a human decision.

On demand

Fires when a user asks.

Medium data gravity

Holds working state that compounds over runs.

Read-only inbound

Consumes external data; does not write back.

Inputs

  • hypothesis and treatment definitions
  • target metric and success criteria
  • subject population or traffic split
  • power analysis parameters (MDE, significance threshold, desired power)
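
The power analysis parameters above translate into a required sample size before any traffic is assigned. The following is a minimal sketch, assuming a binary target metric (for example a conversion or activation rate), a two-sided two-proportion z-test, and equal-sized arms. The function name `sample_size_per_arm` and its parameters are hypothetical and not part of the primitive's interface.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_arm(baseline_rate, mde_abs, alpha=0.05, power=0.80):
    """Approximate subjects needed per arm for a two-sided two-proportion z-test.

    baseline_rate: expected rate in the control arm (assumed, e.g. 0.10)
    mde_abs:       minimum detectable effect as an absolute difference (e.g. 0.02)
    alpha:         significance threshold
    power:         desired power (1 - false-negative rate)
    """
    p_control = baseline_rate
    p_treatment = baseline_rate + mde_abs
    # Critical values of the standard normal for the chosen alpha and power.
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    # Sum of the per-arm Bernoulli variances under the alternative.
    variance = p_control * (1 - p_control) + p_treatment * (1 - p_treatment)
    return ceil((z_alpha + z_beta) ** 2 * variance / mde_abs ** 2)


if __name__ == "__main__":
    # Illustrative inputs only: 10% baseline, 2-point absolute MDE.
    print(sample_size_per_arm(baseline_rate=0.10, mde_abs=0.02))
```

Halving the MDE roughly quadruples the required sample, which is why the MDE is worth agreeing on before the experiment starts.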

Outputs

  • per-arm outcome distribution
  • treatment-effect estimate with confidence interval
  • ship / kill / extend decision input
  • experiment record for future retrieval
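
To make the middle two outputs concrete, here is a hedged sketch of a treatment-effect estimate with a confidence interval and a derived decision input. It assumes a binary metric and a normal-approximation (Wald) interval for the difference in proportions. The decision rule in `decision_input` is an illustrative assumption, not the rule the primitive uses, and the final call still rests with the human approver.

```python
from math import sqrt
from statistics import NormalDist


def effect_with_ci(control_conv, control_n, treat_conv, treat_n, confidence=0.95):
    """Return (effect, ci_low, ci_high) for the difference in rates, treatment minus control."""
    p_control = control_conv / control_n
    p_treat = treat_conv / treat_n
    effect = p_treat - p_control
    # Unpooled standard error of the difference between two proportions.
    se = sqrt(p_control * (1 - p_control) / control_n
              + p_treat * (1 - p_treat) / treat_n)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return effect, effect - z * se, effect + z * se


def decision_input(ci_low, ci_high, mde_abs):
    """Hypothetical rule mapping an interval to a ship / kill / extend suggestion."""
    if ci_low > 0:
        return "ship"      # the whole interval sits above zero
    if ci_high < mde_abs:
        return "kill"      # an effect as large as the MDE is ruled out
    return "extend"        # inconclusive: collect more data


if __name__ == "__main__":
    # Illustrative counts only, not real results.
    effect, low, high = effect_with_ci(500, 5000, 600, 5000)
    print(f"effect={effect:.4f} CI=({low:.4f}, {high:.4f})",
          decision_input(low, high, mde_abs=0.02))
```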

Mechanism

Assigns subjects to treatment and control conditions under an experimental design (A/B, multivariate, holdout, switchback), runs the experiment, and produces a causal readout of treatment effect with statistical confidence.
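
One common way to implement the assignment step is deterministic hashing, so a subject always lands in the same arm without a lookup table. The sketch below is an assumption about how assignment could be done, not a description of this primitive's internals. The names `assign_arm`, `subject_id`, and `experiment_id` are hypothetical.

```python
import hashlib


def assign_arm(subject_id, experiment_id,
               arms=("control", "treatment_a", "treatment_b")):
    """Deterministically map a subject to one arm with equal probability.

    Salting the hash with the experiment id keeps assignments in one
    experiment independent of assignments in another.
    """
    key = f"{experiment_id}:{subject_id}".encode("utf-8")
    digest = hashlib.sha256(key).digest()
    # Interpret the first 8 bytes as an integer and reduce to an arm index.
    bucket = int.from_bytes(digest[:8], "big") % len(arms)
    return arms[bucket]


if __name__ == "__main__":
    for user in ("user-1", "user-2", "user-3"):
        print(user, assign_arm(user, "onboarding-flow-test"))
```

Because the mapping depends only on the two identifiers, repeated calls give the same arm, which keeps a subject's experience stable for the life of the experiment.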

Why this is a primitive

Cannot be decomposed: the design → randomised-assignment → measure → causal-inference operation is a single piece of statistical machinery (power analysis, randomisation, treatment-effect estimation, multiple-comparison handling). It is distinct from descriptive telemetry refinement: telemetry observes what happened, whereas experimentation imposes the treatment-control structure that lets you claim why it happened. Strip the assignment-and-causal-inference layer and you have an A/B-flavoured dashboard with no causal claim.

Where it shows up

  • Growth team: A/B test of two onboarding flows on 14-day activation rate, with a statistically powered verdict on which flow to roll out
  • EdTech product: holdout experiment on a new adaptive routing algorithm, measuring whether it improves mastery rate versus the control population
  • Content team: multivariate test of email subject-line variants against open rate, with multiple-comparison correction to avoid false positives
  • ML platform: switchback experiment on a new ranking model in a marketplace, alternating treatment and control windows to handle temporal confounds
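
The multiple-comparison correction mentioned in the multivariate example can be sketched with the Holm step-down procedure, one standard method for controlling the family-wise error rate. The choice of Holm is an assumption made for illustration, since the text does not name a specific correction. The function name `holm_reject` is hypothetical.

```python
def holm_reject(p_values, alpha=0.05):
    """Return a list of booleans: True where the hypothesis is rejected under Holm.

    p_values: one p-value per variant-versus-control comparison.
    """
    m = len(p_values)
    # Visit comparisons from the smallest p-value to the largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):
        # The threshold loosens as hypotheses are rejected: alpha/m, alpha/(m-1), ...
        if p_values[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            break  # once one comparison fails, all larger p-values fail too
    return reject


if __name__ == "__main__":
    # Illustrative p-values for four subject-line variants, not real results.
    print(holm_reject([0.01, 0.04, 0.03, 0.005]))
```

Without a correction, all four illustrative p-values fall below 0.05. Under Holm only the two smallest survive, which is the false-positive protection the example refers to.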

Tags

AI · structured-data · batch · decision-support · experimentation · causal-inference

See where it fits.

Primitives are configured into named solution shapes for each client’s domain. The fastest next step is a conversation about which shape fits your problem.
