Corpus Retrieval
Find the answer in your own corpus.
When someone on the team needs to know what your knowledge base actually says, Corpus Retrieval locates the most relevant passages and hands them back with provenance. It does not guess or generate: it returns only what you have, ranked by relevance to the question asked. Reach for it any time the question is 'does our content cover this?' or 'where did we say that?'
Shape
Operational dimensions
Runs without a person in the path.
Fires when a user asks.
Holds working state that compounds over runs.
Consumes external data; does not write back.
Inputs
- free-text query string
- corpus (documents, passages, embeddings index)
- optional access-permission filters
- optional metadata constraints (date range, source type)
Outputs
- ranked list of matching passages or documents
- provenance metadata per hit (source, section, page)
- relevance scores
- optional citation set for downstream answer generation
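The inputs and outputs listed above can be pictured as plain data shapes. What follows is a minimal sketch only: the class names, field names, and sample values (`RetrievalRequest`, `Provenance`, `Hit`, `RetrievalResult`, `handbook.pdf`) are hypothetical and chosen for illustration, not taken from any actual Corpus Retrieval interface.

```python
from dataclasses import dataclass
from datetime import date
from typing import FrozenSet, List, Optional


@dataclass(frozen=True)
class RetrievalRequest:
    """Hypothetical container for the inputs (the corpus itself is held elsewhere)."""
    query: str                                        # free-text query string
    allowed_groups: Optional[FrozenSet[str]] = None   # optional access-permission filter
    date_from: Optional[date] = None                  # optional metadata constraint
    date_to: Optional[date] = None                    # optional metadata constraint
    source_types: Optional[FrozenSet[str]] = None     # optional metadata constraint


@dataclass(frozen=True)
class Provenance:
    """Where a hit came from: source, section, and page if the source is paginated."""
    source: str
    section: str
    page: Optional[int] = None


@dataclass(frozen=True)
class Hit:
    passage: str
    provenance: Provenance
    score: float  # relevance score; higher means more relevant in this sketch


@dataclass(frozen=True)
class RetrievalResult:
    hits: List[Hit]  # ranked, highest score first

    def citation_set(self) -> List[Provenance]:
        """Unique provenance entries in rank order, for downstream answer generation."""
        seen, out = set(), []
        for hit in self.hits:
            if hit.provenance not in seen:
                seen.add(hit.provenance)
                out.append(hit.provenance)
        return out


# Illustrative values only; none of this data comes from a real corpus.
request = RetrievalRequest(query="What is our refund window?",
                           source_types=frozenset({"policy"}))
handbook = Provenance(source="handbook.pdf", section="Refunds", page=12)
result = RetrievalResult(hits=[
    Hit("Refunds are issued within 14 days.", handbook, 0.91),
    Hit("A receipt is required for any refund.", handbook, 0.74),
])
```

Keeping provenance as its own immutable value makes the optional citation set a simple de-duplication over the ranked hits, which is one reasonable way to feed a downstream answer generator.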
Mechanism
Matches a free-text query against a corpus and returns the most relevant passages, documents, or citations from that corpus.
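To make the query, match, return cycle concrete, here is a small sketch. The page does not say how relevance is scored, so this example swaps in term-frequency cosine similarity over an in-memory list as a stand-in; a production setup would more likely query an embeddings index. The function names, the `CORPUS` sample, and the `top_k` parameter are all assumptions for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def retrieve(query, corpus, top_k=3):
    """Score every passage against the query and return the best matches.

    `corpus` is a list of dicts with 'text' and 'source' keys (assumed shape).
    Passages with no term overlap are dropped rather than returned with score 0.
    """
    query_vec = Counter(tokenize(query))
    hits = []
    for doc in corpus:
        score = cosine(query_vec, Counter(tokenize(doc["text"])))
        if score > 0:
            hits.append({"text": doc["text"], "source": doc["source"], "score": score})
    hits.sort(key=lambda h: h["score"], reverse=True)
    return hits[:top_k]


# Hypothetical sample corpus.
CORPUS = [
    {"source": "handbook.md#leave", "text": "Employees accrue paid leave monthly."},
    {"source": "handbook.md#expenses", "text": "Expense reports are due within thirty days."},
    {"source": "faq.md#leave", "text": "Unused paid leave carries over for one year."},
]

hits = retrieve("how does paid leave carry over", CORPUS)
for hit in hits:
    print(f"{hit['score']:.3f}  {hit['source']}  {hit['text']}")
```

Every returned hit carries its source alongside the score, so nothing in the result exists outside the corpus that was searched.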
Why this is a primitive
Corpus Retrieval cannot be decomposed further: the query → match → return-from-corpus operation is one indivisible act of locating by relevance. Presentation variants (a hit list of links, a single citation passage, a grounded answer with citations) are different rendering shells over the same retrieval operation. They do not constitute separate primitives, because removing the rendering leaves the primitive intact, while removing the retrieval leaves nothing.
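The split between retrieval and rendering can be sketched in a few lines. This is an illustration under stated assumptions: `retrieve` is a deliberately naive word-overlap stand-in, and the three `render_*` functions, the sample `CORPUS`, and the `answer` argument (which would come from a downstream generator) are hypothetical names, not part of any documented interface.

```python
def words(text):
    """Lowercased word set with basic punctuation stripped."""
    return {w.strip(".,?!") for w in text.lower().split()}


def retrieve(query, corpus):
    """Stand-in retrieval: rank passages by number of words shared with the query."""
    query_words = words(query)
    scored = [(len(query_words & words(p["text"])), p) for p in corpus]
    ranked = sorted((s for s in scored if s[0] > 0), key=lambda s: s[0], reverse=True)
    return [{"score": score, **passage} for score, passage in ranked]


# Three rendering shells. Each takes hits only and never touches the corpus.
def render_hit_list(hits):
    return [f"{h['source']} ({h['score']})" for h in hits]


def render_single_citation(hits):
    if not hits:
        return None
    top = hits[0]
    return f"\"{top['text']}\" [{top['source']}]"


def render_grounded_answer(hits, answer):
    refs = ", ".join(h["source"] for h in hits)
    return f"{answer} (sources: {refs})"


# Hypothetical sample corpus.
CORPUS = [
    {"source": "policy.md", "text": "Refunds are issued within 14 days."},
    {"source": "faq.md", "text": "Refunds require a receipt."},
    {"source": "shipping.md", "text": "Orders ship on weekdays."},
]

HITS = retrieve("when are refunds issued?", CORPUS)
print(render_hit_list(HITS))
print(render_single_citation(HITS))
print(render_grounded_answer(HITS, "Refunds are issued within 14 days."))
```

All three renderers consume the same `HITS` value. Deleting any of them leaves `retrieve` working, whereas without `retrieve` the renderers have nothing to show.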
See where it fits.
Primitives are configured into named solution shapes for each client’s domain. The fastest next step is a conversation about which shape fits your problem.