Why integrate retrieval with orchestration?
An LLM agent that retrieves information from a knowledge base before answering can ground its responses in your documents rather than relying on training data alone. But retrieval by itself is not enough: you also need to decide when to retrieve, how much context to include, and where to store what the agent learns. These are orchestration concerns.
securecontext handles document ingestion, embedding, vector search, and token-aware context assembly. orchestr defines agent workflows as directed graphs, manages state between nodes, and routes execution. Together, they let you build RAG agents where retrieval is an explicit graph node with its own inputs, outputs, and tests.
The sections below connect the two packages using the memory adapter pattern and graph-based retrieve-then-generate workflows.
For securecontext basics, see vignette("securecontext")
and vignette("retrieval-workflows"). For orchestr basics,
see vignette("quickstart", package = "orchestr").
Building a knowledge base
Start by creating documents, chunking them, and loading them into a
retriever. This is the same pipeline from
vignette("retrieval-workflows"), repeated here for
completeness. Everything runs locally; no external API calls
required.
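For intuition about what a TF-IDF embedder and a vector search do, here is a toy sketch in base R. It is not securecontext's implementation: the helper names (`tokenize`, `embed`, `cosine`), the tokenization rule, and the `+ 1` in the IDF formula are all assumptions chosen for illustration.

```r
# Toy TF-IDF retrieval in base R (illustrative only; not securecontext's code)
corpus <- c(
  "linear models and time series analysis",
  "ggplot2 and dplyr are tidyverse packages",
  "shiny builds interactive web applications"
)

# Lowercase, split on non-alphanumerics, drop empty tokens
tokenize <- function(x) {
  tokens <- strsplit(tolower(x), "[^a-z0-9]+")[[1]]
  tokens[nzchar(tokens)]
}

doc_tokens <- lapply(corpus, tokenize)
vocab <- sort(unique(unlist(doc_tokens)))

# Document frequency: in how many documents does each term occur?
doc_freq <- vapply(
  vocab,
  function(term) sum(vapply(doc_tokens, function(tk) term %in% tk, logical(1))),
  integer(1)
)
# Inverse document frequency: rarer terms get larger weights
idf <- log(length(corpus) / doc_freq) + 1

# Embed a text as a TF-IDF vector over the fixed vocabulary;
# terms outside the vocabulary are ignored
embed <- function(text) {
  tf <- table(factor(tokenize(text), levels = vocab))
  as.numeric(tf) * idf
}

cosine <- function(a, b) {
  denom <- sqrt(sum(a^2)) * sqrt(sum(b^2))
  if (denom == 0) 0 else sum(a * b) / denom
}

doc_vectors <- lapply(corpus, embed)
query_vector <- embed("tidyverse packages")
scores <- vapply(doc_vectors, cosine, numeric(1), b = query_vector)
best <- which.max(scores)  # index of the most similar document
```

In this toy corpus only the second document shares terms with the query, so it is the one selected. The package pipeline that follows adds chunking, a real vector store, and metadata on top of the same basic idea.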
library(securecontext)
library(orchestr)
library(ellmer)
# Create documents from your corpus
docs <- list(
document("R provides extensive facilities for statistical computing
and graphics. Linear models, time-series analysis, classification, and
clustering are all available out of the box.",
metadata = list(source = "r-intro", topic = "statistics")),
document("The tidyverse is a collection of R packages for data science.
Core packages include ggplot2, dplyr, tidyr, readr, and purrr.",
metadata = list(source = "r-intro", topic = "tidyverse")),
document("Shiny is an R package for building interactive web applications.
It combines R's analytical power with modern web UI components.",
metadata = list(source = "r-intro", topic = "shiny"))
)
# Build embedder from the corpus
corpus <- vapply(docs, function(d) d@text, character(1))
embedder <- embed_tfidf(corpus)
# Create vector store and retriever
vs <- vector_store$new(dims = embedder@dims)
ret <- retriever(vs, embedder)
add_documents(ret, docs, chunk_strategy = "sentence")
The as_orchestr_memory() adapter
orchestr agents expect a memory backend with get() and
set() methods. securecontext’s knowledge_store
provides persistent key-value storage backed by JSONL, but its interface
does not match what orchestr expects out of the box. The
as_orchestr_memory() function bridges this gap.
The adapter pattern is simple: wrap a securecontext object in a thin layer that exposes the interface another package expects. The underlying storage, persistence, and search capabilities of the knowledge store are fully preserved; the adapter just translates method calls.
# Create a persistent knowledge store
ks <- knowledge_store$new(path = "agent-memory.jsonl")
# Wrap it for orchestr
mem <- as_orchestr_memory(ks)
# The adapter exposes get/set -- the same interface orchestr expects
mem$set("user.name", "Alice")
mem$set("session.topic", "data analysis")
mem$get("user.name")
#> [1] "Alice"
The underlying JSONL file persists across R sessions, so agent memory
survives restarts. You can also access the knowledge store directly (via
ks) to use features like $search() and
$list() that are not part of the orchestr memory
interface.
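To make the adapter idea concrete, here is a minimal base R sketch of wrapping one interface in another. Everything in it is hypothetical: `new_kv_store()` is a stand-in with `read()`/`write()` methods, not `knowledge_store`, and `as_memory()` only illustrates the kind of translation an adapter such as `as_orchestr_memory()` might perform.

```r
# A stand-in store with its own method names (hypothetical; not knowledge_store)
new_kv_store <- function() {
  data <- list()
  list(
    write = function(key, value) {
      data[[key]] <<- value
      invisible(NULL)
    },
    read = function(key) data[[key]],
    keys = function() names(data)
  )
}

# The adapter: translate get()/set() calls into the store's read()/write()
as_memory <- function(store) {
  list(
    get = function(key, default = NULL) {
      value <- store$read(key)
      if (is.null(value)) default else value
    },
    set = function(key, value) {
      store$write(key, value)
      invisible(NULL)
    }
  )
}

store <- new_kv_store()
mem_sketch <- as_memory(store)
mem_sketch$set("user.name", "Alice")
mem_sketch$get("user.name")

# The wrapped store is still usable directly for features the adapter hides
store$keys()
```

The adapter holds a reference to the store rather than a copy, which is why writes made through `mem_sketch` are visible through `store`.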
Retrieval-in-the-loop graph
The core RAG pattern is: retrieve relevant context before each LLM
call. With orchestr’s graph_builder(), you wire this as a
two-node graph: a retrieval node followed by an agent node.
The following diagram shows the flow:
+----------+      +---------+      +-----+
| retrieve | ---> |  agent  | ---> | END |
+----------+      +---------+      +-----+
     |                 |
Searches vector   Uses retrieved
store, builds     context to
token-limited     answer query
context string    via LLM
The retrieval node runs the securecontext pipeline (retrieve + context build). The agent node takes that context and passes it to an LLM alongside the user’s question. By separating these concerns into distinct graph nodes, each step is independently testable and replaceable.
# Node 1: retrieve relevant chunks and build context
retrieve_node <- function(state, config) {
query <- state$messages[[length(state$messages)]]
# Retrieve and assemble token-limited context
result <- context_for_chat(ret, query, max_tokens = 2000, k = 5)
list(context = result$context)
}
# Node 2: LLM agent that uses the retrieved context
agent_node <- function(state, config) {
# %||% is in base R >= 4.4.0; on older versions, load rlang for it
context <- state$context %||% ""
query <- state$messages[[length(state$messages)]]
prompt <- paste0(
"Use the following context to answer the question.\n\n",
"Context:\n", context, "\n\n",
"Question: ", query
)
chat <- chat_anthropic(system_prompt = "You are a helpful R assistant.")
response <- chat$chat(prompt)
list(messages = list(response))
}
# Wire the graph: retrieve -> agent -> END
schema <- state_schema(
messages = "append:list",
context = "character"
)
graph <- graph_builder(state_schema = schema)
graph$add_node("retrieve", retrieve_node)
graph$add_node("agent", agent_node)
graph$add_edge("retrieve", "agent")
graph$add_edge("agent", END)
graph$set_entry_point("retrieve")
rag_graph <- graph$compile()
# Run it
result <- rag_graph$invoke(list(
messages = list("What packages are in the tidyverse?")
))
Every user query first passes through the retrieval node, which searches the vector store and assembles a context string. The agent node then answers using that context. Because the context builder enforces a token budget (max_tokens = 2000 here), the retrieved context never exceeds that limit. Choose the budget so that the context plus the question and system prompt fits within the model's context window.
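Because each node is an ordinary function of state, it can be exercised without a vector store or an API key. The sketch below is one way to do that in base R, under stated assumptions: `stub_retrieve()` and `stub_llm()` are hypothetical stand-ins, the node factories are an illustrative dependency-injection pattern, and `apply_update()` is a simplified imitation of append-style state merging, not orchestr's actual execution logic.

```r
# Stubs standing in for the retriever and the LLM (both hypothetical)
stub_retrieve <- function(query) {
  paste("Stub context for:", query)
}
stub_llm <- function(prompt) {
  paste("Stub answer; prompt had", nchar(prompt), "characters")
}

# Node factories: dependencies are injected so tests can swap them out
make_retrieve_node <- function(retrieve_fn) {
  function(state, config = NULL) {
    query <- state$messages[[length(state$messages)]]
    list(context = retrieve_fn(query))
  }
}
make_agent_node <- function(llm_fn) {
  function(state, config = NULL) {
    context <- if (is.null(state$context)) "" else state$context
    query <- state$messages[[length(state$messages)]]
    prompt <- paste0("Context:\n", context, "\n\nQuestion: ", query)
    list(messages = list(llm_fn(prompt)))
  }
}

# Simplified state merge: append to messages, overwrite everything else
apply_update <- function(state, update) {
  for (name in names(update)) {
    if (name == "messages") {
      state$messages <- c(state$messages, update$messages)
    } else {
      state[[name]] <- update[[name]]
    }
  }
  state
}

# Run retrieve -> agent by hand
state <- list(messages = list("What packages are in the tidyverse?"))
state <- apply_update(state, make_retrieve_node(stub_retrieve)(state))
state <- apply_update(state, make_agent_node(stub_llm)(state))

# Check each step's contribution to the state
stopifnot(
  identical(state$context, "Stub context for: What packages are in the tidyverse?"),
  length(state$messages) == 2
)
```

Swapping the stubs for the real retriever and chat client changes nothing about the node bodies, which is the practical benefit of keeping retrieval and generation in separate nodes.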
Token budget management
When your knowledge base is large, retrieved chunks may exceed the
LLM’s context window. The context_builder() controls this
with a token budget and priorities. For a detailed treatment of priority
strategies and overflow behavior, see
vignette("context-building").
Here is a practical example using retrieval scores as priorities:
cb <- context_builder(max_tokens = 500)
# System prompt gets highest priority -- always included
cb <- cb_add(cb, "You are an R expert.", priority = 10, label = "system")
# Retrieved chunks get decreasing priority by relevance score
hits <- retrieve(ret, "statistical models", k = 5)
for (i in seq_len(nrow(hits))) {
chunk_text <- hits$id[i] # placeholder: this is the chunk ID; substitute the chunk's original text
cb <- cb_add(cb, chunk_text, priority = hits$score[i], label = hits$id[i])
}
result <- cb_build(cb)
cat("Included:", paste(result$included, collapse = ", "), "\n")
cat("Excluded:", paste(result$excluded, collapse = ", "), "\n")
cat("Total tokens:", result$total_tokens, "\n")
The builder packs items in priority order until the budget is
exhausted. Dropped items are reported in $excluded, so you
can log what was cut.
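The packing behavior described above can be sketched in base R. This is an illustration under assumptions, not the package's algorithm: `estimate_tokens()` counts whitespace-separated words as a crude proxy for tokens, and `pack_by_priority()` skips an item that does not fit and keeps trying lower-priority ones, which may differ from how `cb_build()` handles overflow.

```r
# Crude token estimate: one token per whitespace-separated word (assumption)
estimate_tokens <- function(text) {
  length(strsplit(trimws(text), "\\s+")[[1]])
}

pack_by_priority <- function(items, max_tokens) {
  # Visit items from highest to lowest priority
  priorities <- vapply(items, function(it) it$priority, numeric(1))
  ordered <- items[order(priorities, decreasing = TRUE)]

  included <- character(0)
  excluded <- character(0)
  used <- 0
  for (item in ordered) {
    cost <- estimate_tokens(item$text)
    if (used + cost <= max_tokens) {
      included <- c(included, item$label)
      used <- used + cost
    } else {
      # Assumed policy: skip what does not fit, keep trying later items
      excluded <- c(excluded, item$label)
    }
  }
  list(included = included, excluded = excluded, total_tokens = used)
}

items <- list(
  list(label = "system", priority = 10,
       text = "You are an R expert."),
  list(label = "chunk_1", priority = 0.9,
       text = "Linear models are fit with lm() in base R."),
  list(label = "chunk_2", priority = 0.4,
       text = "Shiny builds interactive web applications.")
)

# Budget of 14 "tokens": system (5) + chunk_1 (9) fit, chunk_2 (5) does not
packed <- pack_by_priority(items, max_tokens = 14)
packed$included
packed$excluded
```

Giving the system prompt a priority well above any retrieval score, as in the package example, guarantees it is considered first and is only dropped if it alone exceeds the budget.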
Use cb_reset() between turns to reuse the same
builder:
cb <- cb_reset(cb)
# Now add fresh content for the next turn
Persistent knowledge store
For agents that need memory across sessions,
knowledge_store persists to a JSONL file. Combined with the
orchestr memory adapter, this gives agents durable recall: the agent
remembers user preferences, past queries, and learned facts across
restarts.
# Knowledge store persists to disk
ks <- knowledge_store$new(path = "long-term-memory.jsonl")
# Store structured facts
ks$set("user.preference", list(language = "R", theme = "dark"))
ks$set("session.2025-01-15", list(
topic = "regression models",
outcome = "built linear model for mtcars"
))
# Search by key pattern
ks$search("^session")
#> $`session.2025-01-15`
#> $`session.2025-01-15`$topic
#> [1] "regression models"
# Use in an orchestr agent via the adapter
mem <- as_orchestr_memory(ks)
Putting it all together
Here is a complete RAG agent that combines retrieval, token management, and persistent memory in an orchestr graph. Documents are ingested into a local vector store, queries trigger retrieval and context assembly, the LLM answers using grounded context, and the agent persists what it learns for future sessions.
library(securecontext)
library(orchestr)
library(ellmer)
# --- Knowledge base ---
docs <- list(
document("dplyr provides a grammar of data manipulation with verbs
like filter, select, mutate, summarise, and arrange."),
document("ggplot2 implements the grammar of graphics. Build plots
layer by layer with aes(), geom_point(), geom_line(), and facet_wrap()."),
document("tidyr helps tidy data with pivot_longer, pivot_wider,
separate, and unite.")
)
corpus <- vapply(docs, function(d) d@text, character(1))
embedder <- embed_tfidf(corpus)
vs <- vector_store$new(dims = embedder@dims)
ret <- retriever(vs, embedder)
add_documents(ret, docs, chunk_strategy = "sentence")
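# --- Persistent memory ---
# tempfile() keeps this example self-contained; use a fixed path
# (e.g. "agent-memory.jsonl") if memory should survive restarts
ks <- knowledge_store$new(path = tempfile(fileext = ".jsonl"))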
mem <- as_orchestr_memory(ks)
# --- Graph nodes ---
retrieve_node <- function(state, config) {
query <- state$messages[[length(state$messages)]]
result <- context_for_chat(ret, query, max_tokens = 1500, k = 3)
list(context = result$context)
}
agent_node <- function(state, config) {
context <- state$context %||% ""
query <- state$messages[[length(state$messages)]]
prompt <- paste0("Context:\n", context, "\n\nQuestion: ", query)
chat <- chat_anthropic(
system_prompt = "Answer questions about R packages using the provided context."
)
response <- chat$chat(prompt)
# Persist the query under a timestamped key for future reference
mem$set(paste0("query.", Sys.time()), query)
list(messages = list(response))
}
# --- Build and run ---
schema <- state_schema(messages = "append:list", context = "character")
g <- graph_builder(state_schema = schema)
g$add_node("retrieve", retrieve_node)
g$add_node("agent", agent_node)
g$add_edge("retrieve", "agent")
g$add_edge("agent", END)
g$set_entry_point("retrieve")
rag <- g$compile()
result <- rag$invoke(list(
messages = list("How do I reshape data from wide to long format?")
))
The retrieval node finds chunks about tidyr’s
pivot_longer, the context builder fits them within the
token budget, and the agent answers using that grounded context. The
query is also persisted to the knowledge store for future reference.