Corpus Retrieval

Find the answer in your own corpus.

When someone on the team needs to know what your knowledge base actually says, Corpus Retrieval locates the most relevant passages and hands them back with provenance. No guessing, no hallucination — just what you have, ranked by relevance to the question asked. Reach for it any time the question is 'does our content cover this?' or 'where did we say that?'

Shape

[Diagram: a free-text query (“where did we say…?”) goes through match & rank (relevance scoring) against the corpus and comes back as ranked hits + provenance: §1 · p.12 · 0.92, §2 · p.19 · 0.81, §3 · p.26 · 0.74. Caption: pull-by-relevance from your own corpus.]

Operational dimensions

No human in loop

Runs without a person in the path.

On demand

Fires when a user asks.

Medium data gravity

Holds working state that compounds over runs.

Read-only inbound

Consumes external data; does not write back.

Inputs

  • free-text query string
  • corpus (documents, passages, embeddings index)
  • optional access-permission filters
  • optional metadata constraints (date range, source type)
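
The two optional inputs above can be pictured as a gate that decides which documents are eligible to be matched at all. The sketch below is illustrative only: `Document`, `RetrievalRequest`, `eligible`, and every field name are hypothetical, and applying the filters before ranking (rather than after) is an assumption, not something this page specifies.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Document:
    # Hypothetical document record; field names are assumptions.
    doc_id: str
    source_type: str
    published: date
    allowed_groups: frozenset


@dataclass(frozen=True)
class RetrievalRequest:
    query: str
    # Each filter is optional; None means "do not filter on this".
    user_groups: Optional[frozenset] = None
    source_types: Optional[frozenset] = None
    published_after: Optional[date] = None
    published_before: Optional[date] = None


def eligible(doc: Document, req: RetrievalRequest) -> bool:
    """Return True if the document passes every filter set on the request."""
    if req.user_groups is not None and not (doc.allowed_groups & req.user_groups):
        return False  # access-permission filter: no shared group
    if req.source_types is not None and doc.source_type not in req.source_types:
        return False  # metadata constraint: source type
    if req.published_after is not None and doc.published < req.published_after:
        return False  # metadata constraint: start of date range
    if req.published_before is not None and doc.published > req.published_before:
        return False  # metadata constraint: end of date range
    return True


docs = [
    Document("kb-1", "policy", date(2024, 3, 1), frozenset({"support"})),
    Document("kb-2", "policy", date(2021, 6, 15), frozenset({"support"})),
    Document("kb-3", "memo", date(2024, 5, 20), frozenset({"legal"})),
]
req = RetrievalRequest(
    query="refund window",
    user_groups=frozenset({"support"}),
    source_types=frozenset({"policy"}),
    published_after=date(2023, 1, 1),
)
# Only documents that pass the filters go on to matching and ranking.
candidates = [d for d in docs if eligible(d, req)]
```

Here only `kb-1` survives: `kb-2` falls outside the date range and `kb-3` is neither visible to the caller's group nor of the requested source type.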

Outputs

  • ranked list of matching passages or documents
  • provenance metadata per hit (source, section, page)
  • relevance scores
  • optional citation set for downstream answer generation

Mechanism

Matches a free-text query against a corpus and returns the most relevant passages, documents, or citations from that corpus.
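
A minimal sketch of that query → match → return loop follows. This page does not say how relevance is scored, so the example substitutes plain word-count cosine similarity as a stand-in; an embeddings index could take its place. `Passage`, `Hit`, `retrieve`, and the sample corpus are all hypothetical names and data made up for illustration.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Passage:
    # Hypothetical record shape: the text plus its provenance.
    text: str
    source: str
    section: str
    page: int


@dataclass(frozen=True)
class Hit:
    passage: Passage
    score: float  # cosine similarity between query and passage, 0.0 to 1.0


def _vector(text: str) -> Counter:
    # Lowercased word counts; a stand-in for a real text representation.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def _cosine(a: Counter, b: Counter) -> float:
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, corpus: list, top_k: int = 3) -> list:
    """Score every passage against the query; return the best matches, highest first."""
    q = _vector(query)
    hits = [Hit(p, _cosine(q, _vector(p.text))) for p in corpus]
    hits = [h for h in hits if h.score > 0.0]  # drop passages with no word overlap
    return sorted(hits, key=lambda h: h.score, reverse=True)[:top_k]


corpus = [
    Passage("Refunds are issued within 14 days of a return request.", "policy.md", "§1", 12),
    Passage("Shipping takes three to five business days.", "policy.md", "§2", 19),
    Passage("Return requests must include the order number.", "faq.md", "§3", 26),
]
hits = retrieve("how many days do refunds take", corpus)
for h in hits:
    print(f"{h.passage.source} {h.passage.section} p.{h.passage.page}  {h.score:.2f}")
```

Each hit carries its passage, its provenance, and its score together, so whatever consumes the result can cite the source without a second lookup.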

Why this is a primitive

Corpus Retrieval cannot be decomposed: the query → match → return-from-corpus operation is one indivisible act of locating by relevance. Presentation variants (a hit list of links, a single cited passage, or a grounded answer with citations) are different rendering shells over the same retrieval operation. They are not separate primitives, because removing the rendering leaves the primitive intact, while removing the retrieval leaves nothing.
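
One way to picture the rendering-shell argument: a single list of hits feeds several thin formatting functions, and none of them re-runs the retrieval. Everything in this sketch is hypothetical (the `Hit` shape, the function names, the `handbook.md` source and passage text); only the sections, pages, and scores echo the diagram above.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    # Hypothetical shape of one retrieval result.
    text: str
    source: str
    section: str
    page: int
    score: float


def render_hit_list(hits: list) -> str:
    """Shell 1: a numbered list of pointers into the corpus."""
    return "\n".join(
        f"{i}. {h.source} {h.section} p.{h.page} ({h.score:.2f})"
        for i, h in enumerate(hits, start=1)
    )


def render_single_citation(hits: list) -> str:
    """Shell 2: only the top passage, quoted with its provenance."""
    if not hits:
        return ""
    top = hits[0]
    return f'"{top.text}" ({top.source}, {top.section}, p.{top.page})'


def render_citation_set(hits: list) -> list:
    """Shell 3: numbered references a downstream answer generator could cite."""
    return [
        {"ref": f"[{i}]", "source": h.source, "section": h.section, "page": h.page}
        for i, h in enumerate(hits, start=1)
    ]


# The retrieval result is produced once; the shells only format it.
hits = [
    Hit("Refunds are issued within 14 days.", "handbook.md", "§1", 12, 0.92),
    Hit("Refund requests need an order number.", "handbook.md", "§2", 19, 0.81),
    Hit("Store credit is offered after 14 days.", "handbook.md", "§3", 26, 0.74),
]
print(render_hit_list(hits))
print(render_single_citation(hits))
print(render_citation_set(hits))
```

Deleting any one render function leaves `hits` untouched, while deleting `hits` leaves the render functions with nothing to show, which is the asymmetry the paragraph above describes.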

Where it shows up

  • Legal team locating relevant precedent in a case archive: surfaces the three most relevant case notes with citations in under a second
  • Customer support agent answering a product question: retrieves the exact policy paragraph from the knowledge base and surfaces it in the agent UI
  • Researcher querying an internal literature corpus: returns ranked paper abstracts with links to the full documents
  • Technical writer checking whether the docs already cover an edge case: finds matching sections across multiple guide files

Tags

AI, search, knowledge-base, RAG, semantic-search

See where it fits.

Primitives are configured into named solution shapes for each client’s domain. The fastest next step is a conversation about which shape fits your problem.
