> [!CAUTION]
> Alpha software. This package is part of a broader effort by Ian Flores Siaca to develop proper AI infrastructure for the R ecosystem. It is under active development and should not be used in production until an official release is published. APIs may change without notice.
Memory, knowledge persistence, RAG retrieval, and context management for R LLM agents.
## Why securecontext?
Most RAG solutions for LLM agents require sending your documents to external embedding APIs. securecontext takes a different approach: it builds local TF-IDF embeddings entirely in R, with no external API calls and no data leaving your machine. The package provides token-aware chunking that respects LLM context windows, splitting documents by sentence, paragraph, or recursively so chunks fit within token budgets. A built-in knowledge store with JSONL persistence lets agents retrieve relevant context across sessions without relying on third-party services. Everything runs locally, making it suitable for sensitive data, air-gapped environments, and workflows where data privacy matters.
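To make the "entirely local" claim concrete, here is a toy TF-IDF index with cosine-similarity retrieval written in plain base R. This is an illustrative sketch only, not securecontext's actual implementation; it just shows that ranking documents against a query needs nothing beyond local computation:

```r
# Illustrative only: a toy TF-IDF index with cosine retrieval in base R.
docs  <- c("r is great for statistics",
           "python excels at machine learning")
query <- "statistics in r"

# Tokenize on whitespace and build a shared vocabulary
toks  <- strsplit(c(docs, query), "\\s+")
vocab <- sort(unique(unlist(toks)))

# Term-frequency matrix: one row per text, one column per vocabulary term
tf <- t(vapply(toks, function(t) table(factor(t, levels = vocab)),
               numeric(length(vocab))))

# Inverse document frequency, computed from the documents only
df  <- colSums(tf[seq_along(docs), , drop = FALSE] > 0)
idf <- log(length(docs) / pmax(df, 1))  # guard against query-only terms
tfidf <- tf * rep(idf, each = nrow(tf))

# Rank documents by cosine similarity to the query vector
cosine <- function(a, b) sum(a * b) / (sqrt(sum(a^2)) * sqrt(sum(b^2)))
sims <- apply(tfidf[seq_along(docs), ], 1, cosine, b = tfidf[length(toks), ])
docs[which.max(sims)]
#> [1] "r is great for statistics"
```

The package wraps this idea in `embed_tfidf()` and a vector store, shown in the Quick start below.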
## Part of the secure-r-dev Ecosystem
securecontext is part of a 7-package ecosystem for building governed AI agents in R:
```
                 ┌─────────────┐
                 │   securer   │
                 └──────┬──────┘
       ┌────────────────┼─────────────────────┐
       │                │                     │
┌──────▼──────┐  ┌──────▼──────┐  ┌───────────▼───────────┐
│ securetools │  │ secureguard │  │ >>> securecontext <<< │
└──────┬──────┘  └──────┬──────┘  └───────────┬───────────┘
       └────────────────┼─────────────────────┘
                 ┌──────▼──────┐
                 │  orchestr   │
                 └──────┬──────┘
          ┌─────────────┴─────────────┐
          │                           │
   ┌──────▼──────┐             ┌──────▼──────┐
   │ securetrace │             │ securebench │
   └─────────────┘             └─────────────┘
```
securecontext provides the memory and retrieval layer for agents. It sits alongside securetools and secureguard in the middle tier, giving agents the ability to chunk documents, build TF-IDF embeddings locally, and retrieve relevant context for LLM prompts.
| Package | Role |
|---|---|
| securer | Sandboxed R execution with tool-call IPC |
| securetools | Pre-built security-hardened tool definitions |
| secureguard | Input/code/output guardrails (injection, PII, secrets) |
| orchestr | Graph-based agent orchestration |
| securecontext | Document chunking, embeddings, RAG retrieval |
| securetrace | Structured tracing, token/cost accounting, JSONL export |
| securebench | Guardrail benchmarking with precision/recall/F1 metrics |
## Installation

```r
# install.packages("pak")
pak::pak("ian-flores/securecontext")
```

## Features
- Document chunking – fixed-size, sentence, paragraph, and recursive strategies
- TF-IDF embeddings – local embeddings with no external API required
- Vector store – in-memory cosine similarity search with RDS persistence
- Knowledge store – persistent JSONL key-value storage
- Semantic retrieval – query documents by meaning
- Context builder – token-aware priority-based context assembly
- Integration helpers – works with orchestr and ellmer
## Document Chunking

Split text into manageable pieces using one of four strategies. `chunk_text()` dispatches to the appropriate strategy function.
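The recursive strategy cascades through separators, splitting on the coarsest one first and recursing into oversized pieces with progressively finer ones. A minimal base-R sketch of that cascade (illustrative only, not the package's code; `split_cascade()` is a hypothetical helper):

```r
# Illustrative only: the cascade behind recursive splitting
# (paragraph -> newline -> sentence -> space).
split_cascade <- function(text, max_size,
                          seps = c("\n\n", "\n", "(?<=[.!?]) ", " ")) {
  # Small enough, or nothing left to split on: return as a single chunk
  if (nchar(text) <= max_size || length(seps) == 0) return(text)
  # Split on the coarsest separator, then recurse with the finer ones
  parts <- strsplit(text, seps[[1]], perl = TRUE)[[1]]
  unlist(lapply(parts, split_cascade, max_size = max_size, seps = seps[-1]))
}

long_text <- "First paragraph with several sentences.\n\nSecond paragraph here.\n\nThird."
split_cascade(long_text, max_size = 40)
```

The package's own chunkers are demonstrated below.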
```r
library(securecontext)

text <- "First paragraph with several sentences.\n\nSecond paragraph here.\n\nThird."

# Sentence-level splitting
chunk_text(text, strategy = "sentence")
#> [1] "First paragraph with several sentences." "Second paragraph here."
#> [3] "Third."

# Paragraph-level splitting
chunk_text(text, strategy = "paragraph")
#> [1] "First paragraph with several sentences." "Second paragraph here."
#> [3] "Third."

# Fixed-size chunks with overlap
chunk_fixed(paste(rep("word", 200), collapse = " "), size = 100, overlap = 10)

# Recursive splitting (tries paragraph -> newline -> sentence -> space)
chunk_recursive(text, max_size = 80)
```

## Knowledge Store
A persistent JSONL key-value store for agent memory. Entries are keyed strings with optional metadata and timestamps:
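The JSONL layout behind such a store is simple: one JSON object per line, so new entries can be appended without rewriting the file and each line parses independently. A minimal sketch of the format using the jsonlite package (illustrative only; not securecontext's internal code, and the field names here are made up):

```r
library(jsonlite)  # assumed available; standard CRAN JSON package

path <- tempfile(fileext = ".jsonl")
entries <- list(
  list(key = "user_preference", value = "dark mode", source = "onboarding"),
  list(key = "last_topic",      value = "chunking",  source = "chat")
)

# One JSON object per line
writeLines(vapply(entries,
                  function(e) as.character(toJSON(e, auto_unbox = TRUE)),
                  character(1)),
           path)

# Reading back: each line is parsed on its own
parsed <- lapply(readLines(path), fromJSON)
parsed[[1]]$value
#> [1] "dark mode"
```

The `knowledge_store` API below manages this persistence for you.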
```r
# In-memory store
ks <- knowledge_store$new()

# Persistent store backed by a JSONL file
ks <- knowledge_store$new(path = "agent-memory.jsonl")

# Store and retrieve values
ks$set("user_preference", "dark mode", metadata = list(source = "onboarding"))
ks$get("user_preference")
#> [1] "dark mode"

# Search keys by regex
ks$search("user_")
#> [1] "user_preference"

# List all keys and check size
ks$list()
ks$size()
```

## Context Builder
Assemble token-aware context for LLM prompts. Higher-priority items are included first; lower-priority items are dropped when the token budget is exceeded:
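That greedy, priority-ordered packing can be sketched in base R. This is illustrative only: the ~4-characters-per-token estimate and the `pack_context()` helper are assumptions for the sketch, not the package's tokenizer or API:

```r
# Rough token estimate: an assumption, not securecontext's tokenizer
estimate_tokens <- function(x) ceiling(nchar(x) / 4)

pack_context <- function(items, max_tokens) {
  # Visit items from highest to lowest priority
  items <- items[order(-vapply(items, `[[`, numeric(1), "priority"))]
  used <- 0
  keep <- list()
  for (it in items) {
    cost <- estimate_tokens(it$text)
    if (used + cost <= max_tokens) {  # skip items that would overflow
      keep <- c(keep, list(it))
      used <- used + cost
    }
  }
  list(context = paste(vapply(keep, `[[`, character(1), "text"),
                       collapse = "\n\n"),
       total_tokens = used)
}

items <- list(
  list(text = "System instructions go here.", priority = 10),
  list(text = strrep("background info ", 100), priority = 1),
  list(text = "Relevant retrieved passage.",  priority = 5)
)
pack_context(items, max_tokens = 50)$total_tokens
#> [1] 14
```

The oversized low-priority item is dropped while both high-priority items fit, mirroring the included/excluded split that `cb_build()` reports below.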
```r
cb <- context_builder(max_tokens = 200)
cb <- cb_add(cb, "System instructions go here.", priority = 10, label = "system")
cb <- cb_add(cb, "Relevant retrieved passage.", priority = 5, label = "rag")
cb <- cb_add(cb, "Nice-to-have background info.", priority = 1, label = "background")

result <- cb_build(cb)
result$context      # assembled text, highest priority first
result$included     # labels of items that fit
result$excluded     # labels of items that were dropped
result$total_tokens # token count of the assembled context
```

## Quick start
```r
library(securecontext)

# Create documents
docs <- list(
  document("R is great for statistics.", metadata = list(topic = "R")),
  document("Python excels at machine learning.", metadata = list(topic = "Python"))
)

# Build embeddings and index documents
emb <- embed_tfidf(vapply(docs, `[[`, character(1), "text"))
vs <- vector_store$new(dims = emb$dims)
ret <- retriever(vs, emb)
add_documents(ret, docs)

# Retrieve relevant context
result <- context_for_chat(ret, "statistical computing", max_tokens = 2000)
cat(result$context)
```

## Documentation
securecontext ships with three vignettes covering common workflows:
- **Getting Started with securecontext** – package overview, core API, and basic retrieval (`vignette("securecontext")`)
- **Retrieval Workflows** – end-to-end RAG patterns, chunking strategies, and vector store persistence (`vignette("retrieval-workflows")`)
- **RAG-Enabled Agents** – integrating securecontext with orchestr for agent memory and context injection (`vignette("orchestr-integration")`)
Full reference documentation is available at the pkgdown site.