Organise

Entity Resolution & Record Linkage

One identity per real-world entity across every source system.

Runs matching logic over incoming records to decide whether they describe something you already know about. When they do, it merges them into a golden record under a canonical id. When they don't, it mints a new entity. The result is a single authoritative version of each person, organisation, product, or place — regardless of how many source systems have their own id for it.
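
A minimal sketch of the match-or-mint flow described above, assuming a deliberately simple matcher (exact match on a normalised email). The `EntityStore` class, its field names, and the `ent-` id format are hypothetical and exist only to illustrate the flow; a real matcher would use richer rules or a trained model.

```python
import itertools


def normalise_email(email: str) -> str:
    """Lower-case and trim an email so trivially different spellings compare equal."""
    return email.strip().lower()


class EntityStore:
    """Hypothetical in-memory store: golden records plus a source-id crosswalk."""

    def __init__(self) -> None:
        self._ids = itertools.count(1)
        self.golden = {}      # canonical id -> merged attributes
        self.by_email = {}    # normalised email -> canonical id
        self.crosswalk = {}   # (source system, source id) -> canonical id

    def resolve(self, source: str, source_id: str, record: dict) -> str:
        key = normalise_email(record["email"])
        canonical_id = self.by_email.get(key)
        if canonical_id is None:
            # No known entity matches: mint a new canonical id.
            canonical_id = f"ent-{next(self._ids)}"
            self.golden[canonical_id] = {}
            self.by_email[key] = canonical_id
        # Merge: fill in attributes the golden record does not have yet.
        for name, value in record.items():
            self.golden[canonical_id].setdefault(name, value)
        self.crosswalk[(source, source_id)] = canonical_id
        return canonical_id


store = EntityStore()
a = store.resolve("crm", "C-17", {"email": "Ana@Example.com", "name": "Ana Ruiz"})
b = store.resolve("cards", "9001", {"email": " ana@example.com ", "phone": "555-0100"})
c = store.resolve("crm", "C-18", {"email": "li@example.com", "name": "Li Wei"})
```

Here the CRM and cards records resolve to the same canonical id, while the third record gets a newly minted one. The crosswalk keeps each source system's own id pointing at the shared identity.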

Shape

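[Diagram: source systems (CRM, core bank, cards) feed a similarity "match?" step with three outcomes: match, new, unsure. Match and new lead to golden records under a canonical id; unsure goes to a review queue, where an approver's approval leads to a merge.]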
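
The three outcomes named in the shape (match, new, unsure) can be sketched as a two-threshold routing rule. This is an illustration under stated assumptions: the thresholds, the `route` function, and the use of the standard library's `difflib.SequenceMatcher` as the similarity measure are stand-ins, not the matching logic of any particular deployment.

```python
from difflib import SequenceMatcher

# Assumed thresholds for illustration only; real values come from tuning.
MATCH_THRESHOLD = 0.90
REVIEW_THRESHOLD = 0.70


def similarity(a: str, b: str) -> float:
    """Case-insensitive string similarity in [0, 1] from the standard library."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def route(incoming_name: str, known_names: dict) -> tuple:
    """Return (decision, canonical_id, score) for one incoming name.

    known_names maps canonical id -> name held on the golden record.
    """
    best_id, best_score = None, 0.0
    for canonical_id, name in known_names.items():
        score = similarity(incoming_name, name)
        if score > best_score:
            best_id, best_score = canonical_id, score
    if best_score >= MATCH_THRESHOLD:
        return ("match", best_id, best_score)    # merge automatically
    if best_score >= REVIEW_THRESHOLD:
        return ("review", best_id, best_score)   # send to the review queue
    return ("new", None, best_score)             # mint a new entity


known = {"ent-1": "Acme Limited", "ent-2": "Globex Corporation"}
decisions = {name: route(name, known)[0] for name in ["acme limited", "Acme Ltd", "Initech"]}
```

Keeping a band between the two thresholds is what creates the review queue: only scores too high to call "new" and too low to merge automatically reach a human approver.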

Operational dimensions

Requires approval

Low-confidence matches wait on a human decision before they are merged.

Event-triggered

Fires when an upstream condition occurs.

High data gravity

Owns a system-of-record; expensive to migrate.

Two-way integration

Reads from and writes to external systems.

Inputs

  • incoming records from one or more source systems
  • match rules or trained matching model
  • existing canonical entity table with golden records
  • blocking keys / similarity thresholds
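
To show what a blocking key contributes, here is a small sketch that only compares records sharing a key, rather than every possible pair. The key itself (first three letters of the surname plus the postcode) and the record fields are assumptions made for the example.

```python
from collections import defaultdict
from itertools import combinations


def blocking_key(record: dict) -> str:
    """Hypothetical key: first three letters of the surname plus the postcode."""
    return f"{record['surname'][:3].lower()}|{record['postcode']}"


def candidate_pairs(records: list) -> list:
    """Group records by blocking key and pair them up only within each block."""
    blocks = defaultdict(list)
    for record in records:
        blocks[blocking_key(record)].append(record["id"])
    pairs = []
    for ids in blocks.values():
        pairs.extend(combinations(sorted(ids), 2))
    return sorted(pairs)


records = [
    {"id": "crm-1", "surname": "Fernandez", "postcode": "28001"},
    {"id": "bank-7", "surname": "Fernandes", "postcode": "28001"},
    {"id": "card-3", "surname": "Okafor", "postcode": "10115"},
    {"id": "crm-2", "surname": "Okafor", "postcode": "10115"},
    {"id": "bank-9", "surname": "Smith", "postcode": "28001"},
]
pairs = candidate_pairs(records)
```

Five records give ten possible pairs, but blocking leaves two candidates to score against the similarity thresholds. The trade-off is that a key that is too strict can hide true matches from the matcher altogether.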

Outputs

  • canonical entity id per incoming record
  • golden record per entity (merged attributes + provenance)
  • match decisions with confidence and evidence
  • unresolved / low-confidence cases for human review
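
One possible shape for these outputs, sketched with plain dataclasses and dictionaries. The `MatchDecision` fields, the `SOURCE_PRIORITY` survivorship rule (highest-priority source with a non-empty value wins), and the sample values are all assumptions chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class MatchDecision:
    """One output row: which entity a record was linked to, and why."""
    source: str
    source_id: str
    canonical_id: str
    confidence: float
    evidence: list = field(default_factory=list)


# Assumed source priority for survivorship: earlier in the list wins.
SOURCE_PRIORITY = ["core_bank", "crm", "cards"]


def build_golden_record(records: list) -> dict:
    """Merge source records into one golden record, keeping per-field provenance.

    Each input is {"source": ..., "attributes": {...}}. For every field, the
    non-empty value from the highest-priority source survives.
    """
    ranked = sorted(records, key=lambda r: SOURCE_PRIORITY.index(r["source"]))
    golden = {}
    for record in ranked:
        for name, value in record["attributes"].items():
            if value and name not in golden:
                golden[name] = {"value": value, "source": record["source"]}
    return golden


golden = build_golden_record([
    {"source": "crm", "attributes": {"name": "A. Ruiz", "email": "ana@example.com"}},
    {"source": "core_bank", "attributes": {"name": "Ana Ruiz", "email": ""}},
])
decision = MatchDecision("crm", "C-17", "ent-1", 0.93, ["email exact match"])
```

Storing the source next to each surviving value is what lets a reviewer, or a later re-merge, see where every attribute on the golden record came from.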

Mechanism

Decides which incoming records refer to the same real-world entity, merges them under a canonical identifier, and maintains the linkage as new records arrive.
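
Maintaining the linkage matters because a newly arrived record can match two entities that were previously held apart. A union-find structure is one common way to model that; the sketch below is illustrative, and the `LinkageIndex` class and the record id format are hypothetical. Using a record id as the cluster's canonical id is a simplification, since a production system would more likely mint a separate stable identifier.

```python
class LinkageIndex:
    """Union-find over record ids; the root of each cluster acts as its canonical id."""

    def __init__(self) -> None:
        self.parent = {}

    def add(self, record_id: str) -> None:
        self.parent.setdefault(record_id, record_id)

    def canonical_id(self, record_id: str) -> str:
        # Walk up to the cluster root.
        root = record_id
        while self.parent[root] != root:
            root = self.parent[root]
        # Path compression: point every visited node straight at the root.
        while self.parent[record_id] != root:
            next_id = self.parent[record_id]
            self.parent[record_id] = root
            record_id = next_id
        return root

    def link(self, a: str, b: str) -> str:
        """Record that a and b are the same entity; the smaller root id survives."""
        self.add(a)
        self.add(b)
        keep, drop = sorted([self.canonical_id(a), self.canonical_id(b)])
        self.parent[drop] = keep
        return keep


index = LinkageIndex()
index.link("crm:C-17", "cards:9001")
index.link("bank:44", "bank:45")
# A later record shows that the two clusters are the same entity.
index.link("cards:9001", "bank:44")
ids = {index.canonical_id(r) for r in ["crm:C-17", "cards:9001", "bank:44", "bank:45"]}
```

After the bridging link, all four records resolve to a single canonical id, which is the behaviour the mechanism has to preserve as records keep arriving.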

Why this is a primitive

Cannot be decomposed: the match-merge-maintain-canonical-id loop over arriving records is a single operation. It does not author a schema (vocabulary-authoring), apply a classification (classification-application), or build edges to other entities (graph-instantiation). It answers exactly one question: 'are these the same thing, and if so, what's the canonical id?'

Where it shows up

Financial services — resolves customer records across CRM, core banking, and card systems so the 360-degree customer view isn't triple-counting the same person
Healthcare provider — links patient records across registration, billing, and EHR to eliminate duplicate patient files and surface cross-system clinical history
B2B SaaS — deduplicates company and contact records arriving from web forms, enrichment APIs, and CRM imports before feeding the pipeline
Retailer — resolves supplier records across procurement, finance, and logistics systems so a single supplier isn't held as three distinct vendors

Tags

structured-data, AI, batch, data-quality, autonomous
