
Corpus Retrieval

Find the answer in your own corpus.

When someone on the team needs to know what your knowledge base actually says, Corpus Retrieval locates the most relevant passages and returns them with provenance. It does not generate or guess: it returns only what the corpus contains, ranked by relevance to the question asked. Reach for it whenever the question is 'does our content cover this?' or 'where did we say that?'

Shape

Operational dimensions

No human in loop

Runs without a person in the path.

On demand

Fires when a user asks.

Medium data gravity

Holds working state that compounds over runs.

Read-only inbound

Consumes external data; does not write back.

Inputs

free-text query string
corpus (documents, passages, embeddings index)
optional access-permission filters
optional metadata constraints (date range, source type)

Outputs

ranked list of matching passages or documents
provenance metadata per hit (source, section, page)
relevance scores
optional citation set for downstream answer generation
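
One way to picture the inputs and outputs listed above is as a pair of plain record types. This is only a sketch: the names RetrievalRequest and Hit, every field name, and the sample values are assumptions made for illustration, not a defined schema.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class RetrievalRequest:
    """Hypothetical request shape mirroring the Inputs list."""
    query: str                                # free-text query string
    allowed_groups: frozenset = frozenset()   # optional access-permission filter
    source_types: frozenset = frozenset()     # optional metadata constraint
    date_from: Optional[date] = None          # optional date range, start
    date_to: Optional[date] = None            # optional date range, end


@dataclass(frozen=True)
class Hit:
    """Hypothetical result shape mirroring the Outputs list."""
    text: str                     # matching passage
    score: float                  # relevance score
    source: str                   # provenance: document or file
    section: Optional[str] = None
    page: Optional[int] = None

    def citation(self) -> str:
        """Format provenance as a short citation for downstream answer generation."""
        parts = [self.source]
        if self.section:
            parts.append(self.section)
        if self.page is not None:
            parts.append(f"p. {self.page}")
        return ", ".join(parts)


# Made-up values, for illustration only.
request = RetrievalRequest(query="refund window for annual plans")
hit = Hit(
    text="Annual plans may be refunded within 30 days.",
    score=0.82,
    source="billing-policy.md",
    section="Refunds",
    page=3,
)
print(hit.citation())
```

Keeping provenance on each hit, rather than in a separate lookup, means any consumer of the ranked list can cite its source without a second query.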

Mechanism

Matches a free-text query against a corpus and returns the most relevant passages, documents, or citations from that corpus.
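
The mechanism can be sketched in a few lines. The example below is a minimal illustration, not a description of any particular implementation: it swaps in a simple term-overlap score (term frequency weighted by an inverse-document-frequency factor) where a production system would more likely use an embeddings index or a tuned ranking function. The corpus, the retrieve function, and its parameters are all hypothetical.

```python
import math
import re
from collections import Counter

# Made-up corpus for illustration: (source, section, allowed group, text).
CORPUS = [
    ("handbook.md", "Leave", "staff", "Employees accrue paid leave every month."),
    ("handbook.md", "Expenses", "staff", "Submit expense reports within thirty days."),
    ("finance.md", "Payroll", "finance", "Payroll runs on the last business day of every month."),
]


def tokenize(text):
    """Lowercase the text and extract runs of letters and digits."""
    return re.findall(r"[a-z0-9]+", text.lower())


def retrieve(query, corpus, allowed_groups=None, top_k=3):
    """Return up to top_k passages ranked by a simple term-overlap score."""
    # Apply the optional access filter before scoring, so hidden passages never leak.
    visible = [p for p in corpus if allowed_groups is None or p[2] in allowed_groups]
    tokenized = [Counter(tokenize(p[3])) for p in visible]

    # Document frequency of each term across the visible passages.
    df = Counter(term for counts in tokenized for term in counts)
    n = len(visible)
    query_terms = set(tokenize(query))

    hits = []
    for passage, counts in zip(visible, tokenized):
        score = sum(
            counts[term] * math.log(1 + n / df[term])
            for term in query_terms
            if term in counts
        )
        if score > 0:
            source, section, _, text = passage
            hits.append({"score": score, "source": source, "section": section, "text": text})

    # Highest score first; ties keep corpus order because the sort is stable.
    hits.sort(key=lambda h: h["score"], reverse=True)
    return hits[:top_k]


for hit in retrieve("when does payroll run", CORPUS, allowed_groups={"staff", "finance"}):
    print(f'{hit["score"]:.2f}  {hit["source"]} / {hit["section"]}: {hit["text"]}')
```

Filtering by permission before scoring, rather than after, is a deliberate choice in this sketch: a passage the caller may not see never contributes to the ranking or the returned list.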

Why this is a primitive

Corpus Retrieval cannot be decomposed: the query → match → return-from-corpus operation is one indivisible act of locating by relevance. Presentation variants (a hit list of links, a single cited passage, a grounded answer with citations) are different rendering shells over the same retrieval operation. They do not constitute separate primitives, because removing the rendering leaves the primitive intact, while removing the retrieval leaves nothing.
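
The point about rendering shells can be made concrete. In the sketch below, three presentation functions consume the same list of hits without changing it; the hit data and all function names are hypothetical, chosen only to illustrate the separation.

```python
# Made-up retrieval result, standing in for the output of any retrieval step.
HITS = [
    {"score": 0.91, "source": "policy.md", "section": "Returns",
     "text": "Items may be returned within 30 days of delivery."},
    {"score": 0.47, "source": "faq.md", "section": "Shipping",
     "text": "Returns are shipped at the customer's expense."},
]


def render_hit_list(hits):
    """Shell 1: a ranked list of source links."""
    return [f'{i}. {h["source"]}#{h["section"]}' for i, h in enumerate(hits, start=1)]


def render_single_citation(hits):
    """Shell 2: only the top passage, quoted with its provenance."""
    if not hits:
        return None
    top = hits[0]
    return f'"{top["text"]}" ({top["source"]}, {top["section"]})'


def render_grounding_context(hits):
    """Shell 3: numbered passages, ready to pass to a grounded-answer step."""
    return "\n".join(f'[{i}] {h["text"]}' for i, h in enumerate(hits, start=1))


print(render_hit_list(HITS))
print(render_single_citation(HITS))
print(render_grounding_context(HITS))
```

Each shell can be deleted without affecting the others or the hits themselves, whereas with an empty hit list every shell has nothing to show.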

Where it shows up

Legal team locating relevant precedent in a case archive — surfaces the three most relevant case notes with citations in under a second

Customer support agent answering a product question — retrieves the exact policy paragraph from the knowledge base and surfaces it in the agent UI

Researcher querying an internal literature corpus — returns ranked paper abstracts with links to the full documents

Technical writer checking whether the docs already cover an edge case — finds matching sections across multiple guide files

Related primitives

Composes with

Faceted Filtering & Browse
Cross-Source Synthesis
In-Source Navigation
Editorial Surfacing
Descriptive Aggregation
Long-Form Prose Authoring

Alternative to

Faceted Filtering & Browse