# ContextUnity RAG Extension

ContextUnity RAG Extension is a specialized extension that provides an agnostic Retrieval-Augmented Generation (RAG) API and Chat Gateway for the ContextUnity ecosystem. Rather than just passing traffic, it acts as a robust state and interface layer: it handles session persistence, citation distribution, frontend rendering, and multi-channel chat integrations (Web, Telegram, MCP).
## Key Features

### Session Persistence Layer
Maintains user and chat message states locally via SQLite or PostgreSQL to isolate session metadata from core ContextUnity services.

### Nuxt 3 AG-UI Frontend
A ready-to-use modern chat UI that supports markdown rendering, citations, search suggestions, and SSE streaming.

### Agnostic Chat Gateway
An HTTP + SSE API providing generic chat completions, routing requests to the ContextRouter.

### Telegram Bot Hub
A native webhook-based Telegram connector (aiogram) that streams ContextRouter outputs directly to Telegram users.

### Model Context Protocol
A FastMCP implementation making ContextUnity RAG Extension capabilities consumable by external agent systems.
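The session-persistence idea above can be sketched with stdlib `sqlite3`. The table layout and method names below are illustrative assumptions, not the extension's actual schema:

```python
import json
import sqlite3


class SessionStore:
    """Minimal sketch of a local session store (hypothetical schema)."""

    def __init__(self, path=":memory:"):
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS messages ("
            "  session_id TEXT, role TEXT, content TEXT, meta TEXT)"
        )

    def save_message(self, session_id, role, content, meta=None):
        # Metadata (citations, searchSuggestions) is stored alongside the text.
        self.conn.execute(
            "INSERT INTO messages VALUES (?, ?, ?, ?)",
            (session_id, role, content, json.dumps(meta or {})),
        )
        self.conn.commit()

    def transcript(self, session_id):
        # Clean transcript: metadata intentionally stripped, mirroring
        # the behavior described for GET /sessions/{session_id}.
        rows = self.conn.execute(
            "SELECT role, content FROM messages WHERE session_id = ?",
            (session_id,),
        )
        return [{"role": r, "content": c} for r, c in rows]
```

Keeping the store behind an interface like this is what lets the gateway swap SQLite for PostgreSQL without touching endpoint code.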
## Architecture

ContextUnity RAG Extension acts as a Frontend Gateway and State Relayer. It enforces a strict separation of concerns from the semantic ML layers (taxonomies, vector chunking, graph search), which are handled exclusively by ContextRouter and ContextBrain.
```mermaid
graph TD
    Client[Web UI / Telegram] -->|POST /agui| RAG[ContextUnity RAG Extension API]
    RAG -->|RouterClient.stream_agent| Router[ContextRouter]
    Router -->|BrainClient.query| Brain[(ContextBrain DB)]
    Router -.->|SSE Events| RAG
    RAG -.->|Event Stream| Client
    RAG -->|save_message| SessionDB[(SQLite/PG Session Store)]
```

### `/agui` Streaming Endpoint

When a client application initiates a chat via `POST /agui`:
1. The client submits `messages` and a `threadId` (which acts as the `session_id`).
2. The request is forwarded to ContextRouter via the async Python SDK (`RouterClient().stream_agent`).
3. The response is relayed as Server-Sent Events (SSE) back to the client.
4. On the final `result` event, ContextUnity RAG Extension intercepts the `answer`, `citations`, and `searchSuggestions`, and persists them into the local session database under the `assistant` role.

### `/sessions` Subsystem

Because SSE streaming pushes metadata incrementally, reloading the UI requires historically accurate transcripts. ContextUnity RAG Extension solves this by providing:
- `GET /sessions/{session_id}`: Returns the clean chat transcript. To conserve UI payload size, complex metadata (such as citations) is intentionally stripped.
- `GET /sessions/{session_id}/citations`: Resolves the specific assistant `messageId` entries and returns their corresponding `citations`, `intent`, and `searchSuggestions` so the UI can rehydrate source annotations without blocking.

### API (`/api`)

The core FastAPI application, built strictly with Pydantic V2.
Request context is attached via middleware (`RequestContextMiddleware`), and database access is funneled through the `ctx.session_store` interface, ensuring raw database operations never bleed into endpoint code.

### Telegram Bot (`/bot`)

Contains an independent entrypoint for Telegram interactions. It parses Telegram objects, translates them into the universal RAG payload (with a corresponding `threadId`), and connects to the exact same `stream_agent` flow used by the Web UI. Responses are continuously formatted from SSE frames into Telegram message edits.
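The "SSE frames into Telegram message edits" flow lends itself to a small buffering helper: editing a Telegram message on every token would hit rate limits, so edits are batched. The threshold and interface below are illustrative assumptions, not the bot's actual code:

```python
class MessageEditBuffer:
    """Accumulates streamed text deltas and decides when a Telegram
    message edit is worth sending (illustrative sketch)."""

    def __init__(self, min_growth=32):
        self.min_growth = min_growth  # new chars required before an edit fires
        self.text = ""
        self.last_sent = ""

    def feed(self, delta):
        """Append a streamed delta; return full text when an edit should fire."""
        self.text += delta
        if len(self.text) - len(self.last_sent) >= self.min_growth:
            self.last_sent = self.text
            return self.text
        return None

    def flush(self):
        """Send a final edit with whatever remains after the stream ends."""
        if self.text != self.last_sent:
            self.last_sent = self.text
            return self.text
        return None
```

In the bot, each non-`None` return would translate into one `editMessageText` call against the placeholder message.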
### Web UI (`/ui`)

An agnostic frontend application. Rather than locking into a rigid system, the UI sends arbitrary `content_configs` to ContextUnity RAG Extension, which are consumed by ContextRouter to dictate underlying agent behaviors without modifying the RAG gateway code.
### MCP (`/mcp`)

Exposes FastMCP tools, establishing ContextUnity RAG Extension as an agnostic knowledge conduit for desktop assistant orchestrators such as Claude Desktop or Cursor.
## Local Development

The package relies on `uv` and `mise` for reproducible, isolated process management. Docker Compose is deliberately avoided for local dev cycles.
Sync dependencies:

```shell
mise run api-sync
```

Start the API server. This spins up FastAPI on port 7200; ContextRouter bootstrapping is executed within the app lifespan:

```shell
mise run api-dev
```

Start the Telegram bot:

```shell
mise run bot-dev
```

Start the UI:

```shell
mise run ui-install
mise run ui-dev
```

## Configuration

ContextUnity RAG Extension uses `.env` variables mapped into Pydantic Settings.
| Variable | Description | Default |
|---|---|---|
| `DB_TYPE` | Type of DB for session persistence (`sqlite`, `postgres`) | `sqlite` |
| `POSTGRES_DSN` | Connection string if using Postgres | *(empty)* |
| `TELEGRAM_BOT_TOKEN` | Token for the Telegram webhook integration | *(empty)* |
Note: System credentials for routing external LLM requests are managed strictly by ContextRouter's `project.yaml` manifests. ContextUnity RAG Extension only requires data-persistence environment variables.
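As a rough sketch of how those variables map into a settings object (the extension itself uses Pydantic Settings; this stdlib stand-in only mirrors the table above):

```python
import os
from dataclasses import dataclass


@dataclass
class RagSettings:
    """Illustrative stand-in for the extension's Pydantic Settings model."""
    db_type: str = "sqlite"
    postgres_dsn: str = ""
    telegram_bot_token: str = ""

    @classmethod
    def from_env(cls, env=None):
        env = os.environ if env is None else env
        return cls(
            db_type=env.get("DB_TYPE", "sqlite"),
            postgres_dsn=env.get("POSTGRES_DSN", ""),
            telegram_bot_token=env.get("TELEGRAM_BOT_TOKEN", ""),
        )
```

With Pydantic Settings, the same mapping happens declaratively: field names are matched against environment variables, and `sqlite` remains the default when `DB_TYPE` is unset.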
## Using the Extension in Your Project

You can use the ContextUnity RAG Extension within your project to handle chat streaming, citations, and metadata. Because ContextUnity follows a strict Service Mesh paradigm, your application does not spin up the Router or Brain directly. Instead, your project declares its capabilities (such as the `retrieval_augmented` template) and its API keys inside the standard `contextunity.project.yaml` manifest. ContextUnity RAG Extension simply acts as the conversational bridge.
Add a manifest to your project to configure exactly how your downstream `retrieval_augmented` graph operates.
```yaml
apiVersion: "contextunity/v1alpha3"
kind: "ContextUnityProject"

project:
  id: "my-rag-project"
  name: "My Internal RAG Portal"
  tenant: "my_company"

# Declare requirements for ContextUnity bootstrapping
services:
  router: { enabled: true }
  brain: { enabled: true }
  shield: { enabled: true }

router:
  graph:
    id: "my-rag-project"
    mode: "template"
    template: "retrieval_augmented"  # The canonical ContextUnity RAG template
    config:
      max_retries: 2
      knowledge_domains: ["internal_docs", "hr_policy"]
      model: "openai/gpt-4o"
    # Explicit node overrides for specific tasks within the RAG pipeline
    overrides:
      - name: "detect_intent"
        config:
          model: "openai/gpt-4o-mini"
          model_secret_ref: "OPENAI_API_KEY"

policy:
  tracing_enabled: true

secrets:
  - owner: "contextunity"
    resolver: "env"
    keys:
      - "OPENAI_API_KEY"
      - "POSTGRES_DSN"
```

With your project communicating with ContextRouter, your UI simply points to the ContextUnity RAG Extension API via the `/agui` streaming endpoint. ContextUnity RAG Extension handles all stateless streaming and automatically attaches the conversation to your defined `retrieval_augmented` graph.
```javascript
const response = await fetch('http://localhost:7200/agui', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    messages: [
      { role: "user", content: "What is ContextUnity?" }
    ],
    threadId: "session-123",
    content_configs: {
      "language": "ukrainian",
      "instruction": "Reply like a pirate."
    }
  })
});
```
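On the wire, the `/agui` response arrives as SSE frames. A minimal Python sketch of parsing that stream, assuming standard `event:`/`data:` framing and the `result` payload described earlier (the `delta` event name and exact field names are assumptions):

```python
import json


def parse_sse(lines):
    """Yield (event, data) tuples from an iterable of SSE lines."""
    event, data = "message", []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data.append(line[len("data:"):].strip())
        elif line == "" and data:
            # A blank line terminates a frame.
            yield event, json.loads("\n".join(data))
            event, data = "message", []


# Example: extract the final answer and citations from a captured stream.
frames = [
    "event: delta\n", 'data: {"text": "Context"}\n', "\n",
    "event: result\n",
    'data: {"answer": "ContextUnity is...", "citations": []}\n', "\n",
]
result = {event: data for event, data in parse_sse(frames)}
```

In a real client, `lines` would come from the streaming HTTP response (for example, `iter_lines()` on an `httpx` streaming response) rather than a hard-coded list.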