Graph Compiler & Templates
The Graph Compiler is the engine that transforms declarative YAML templates into executable LangGraph state machines. Instead of writing imperative Python code to wire graph nodes, you describe the pipeline topology in YAML — the compiler handles node creation, edge wiring, security enforcement, and configuration merging.
Why Templates?
Graphs are configuration, not code. Each node is one of three types:
| Node Type | What it does | Example |
|---|---|---|
platform | Calls an internal Router capability (LLM, retrieval, classification) | router_classify, router_generate |
federated | Executes a tool on the consumer project side via BiDi gRPC | export_products, store_news_results |
llm | Direct LLM invocation with prompt + model config | Custom classification prompt |
By treating graphs as YAML configurations, we gain:
- Reproducibility — same template produces identical graph topology
- Override safety — consumer projects can tune config values without touching node code
- Security by construction — `frozen=True` + `extra="forbid"` Pydantic configs block runtime injection
Built-in Templates
retrieval_augmented (v1.0)
The universal memory-augmented generation pipeline. Handles both semantic vector retrieval (RAG) and exact SQL analytics through pluggable data sources.
```
__start__ → extract_query → detect_intent
                            ├── retrieve → suggest ──┐
                            ├── ground ──────────────→ reflect → __end__
                            ├── generate ────────────→ reflect → __end__
                            ├── plan → execute_sql → verify → visualize → reflect
                            └── no_results → __end__
```

Key nodes:
- `router_extract_query` — initialises state, extracts user query
- `router_detect_intent` — LLM-based intent classification with taxonomy enrichment
- `router_retrieve` — full RAG pipeline (vector search + reranking + graph facts)
- `router_generate` — multi-source prompt assembly with citation formatting
- `router_sql_plan` / `router_sql_execute` / `router_sql_verify` / `router_sql_visualize` — SQL analytics path
Usage: This is the default graph. In `contextunity.project.yaml`, set `router.graph.main.template` to `"rag_retrieval"` or `"sql_analytics"` — both resolve to this template with appropriate defaults.
gardener (v1.0)
Product normalisation pipeline composing classification with federated domain tools.
```
__start__ → fetch_products → deterministic_pass → classify → write_results → __end__
```

- `fetch_products` (federated) — exports products from consumer project
- `deterministic_pass` (federated) — runs domain-specific normalisation rules locally
- `classify` (platform: `router_classify`) — LLM taxonomy classification
- `write_results` (federated) — writes normalised products back
Each node can use a different model. For example, `classify` can use a fast, cheap model while the graph default stays premium:
```yaml
defaults:
  model: openai/gpt-4o  # graph default

nodes:
  - name: classify
    type: platform
    tool_binding: router_classify
    config:
      model: vertex/gemini-2.5-flash  # cheap, fast — good for classification
```

enricher (v1.0)
Product enrichment pipeline with content generation and quality review.
```
__start__ → prepare → enrich → review → __end__
```

- `prepare` (federated) — exports product context from consumer
- `enrich` (platform: `router_generate_content`) — LLM content generation
- `review` (platform: `router_review_content`) — LLM quality review pass
rlm_bulk_matcher (v1.0)
RLM-based bulk product matching using Recursive Language Models for massive context (50k+ items).
```
__start__ → rlm_process → __end__
```

- `rlm_process` (platform: `router_rlm_process`) — RLM REPL session for bulk matching
- Federated tools handle brand iteration, product export, and result upload on the consumer side
See Massive Context Tools (RLM) below for details.
news_pipeline (v1.0)
News processing pipeline with filtering, planning, and generation.
```
__start__ → harvest → filter → plan → generate → store → __end__
```

- `harvest` (federated) — fetches news from external sources
- `filter` (platform: `router_filter_content`) — LLM content filtering
- `plan` (platform: `router_plan_content`) — editorial planning
- `generate` (platform: `router_generate_content`) — article generation
- `store` (federated) — persists results to consumer project
Platform Tools
Platform tools are the atomic LLM capabilities that templates compose. Each tool:
- Has a Pydantic config schema (`frozen=True`, `extra="forbid"`)
- Requires `router:execute` token scope
- Wraps all exceptions in `PlatformServiceError`
- Has zero domain imports — pure LLM capability
Universal Content Tools
| Binding | Capability | Key Config |
|---|---|---|
router_classify | Taxonomy / intent classification | taxonomy_key, confidence_threshold, response_format |
router_generate_content | Structured content generation | language, max_tokens, response_format |
router_review_content | Quality review + correction | strict_mode, language |
router_filter_content | Content filtering / validation | criteria_key, pass_threshold |
router_plan_content | Editorial / batch planning | strategy (editorial, chronological, priority), max_items |
router_match_semantic | Semantic similarity matching | threshold, max_candidates |
router_rlm_process | Massive context RLM execution (50k+ items) | rlm_model, rlm_environment, max_iterations, max_timeout |
RAG Pipeline Tools
| Binding | Capability |
|---|---|
router_extract_query | Query extraction + state initialisation |
router_detect_intent | Intent classification with taxonomy enrichment |
router_retrieve | Full RAG pipeline (vector + rerank + graph) |
router_ground | Google Search grounding |
router_generate | RAG response generation with citations |
router_reflect | Self-evaluation and quality scoring |
router_suggest | Search suggestions |
router_no_results | Empathetic no-results response |
SQL Analytics Tools
| Binding | Capability |
|---|---|
router_sql_plan | SQL generation from natural language |
router_sql_execute | SQL execution with safety limits |
router_sql_verify | Result verification |
router_sql_visualize | Output formatting (table, chart, markdown) |
Massive Context Tools (RLM)
Recursive Language Models are a task-agnostic inference paradigm that wraps any base LLM with a REPL environment. Instead of cramming 50k items into a context window, RLM stores them as Python variables and lets the model programmatically examine, filter, and recursively call itself.
`router_rlm_process` is the universal platform tool for this capability. It is not domain-specific — any project can use it for:
- Product matching — 50k supplier products against 10k site products (how Commerce uses it today)
- Taxonomy classification — 1000+ category trees navigated via code
- Bulk deduplication — entity resolution across large datasets
- Large-scale analysis — any task where context degradation kills regular LLMs
How Commerce uses RLM for matching:
The commerce matcher (rlm_bulk_matcher graph) runs a brand-by-brand BiDi loop:
1. Fetch matchable brands via BiDi → consumer exports brand list
2. For each brand: fetch supplier + site products via BiDi (small payload per brand)
3. Call `router_rlm_process` with matching prompt + product data → RLM generates Python code to compare products, index by SKU/brand, and recursively resolve ambiguous cases
4. Upload results via BiDi → consumer persists matches
The architecture separates concerns cleanly:
- `router_rlm_process` (platform) — pure RLM execution, any task, any model
- Commerce `rlm_bulk_match_node` (domain) — BiDi orchestration, brand iteration, taxonomy/manual-match injection, result upload. This is the domain logic that stays in the commerce extension.
RLM model selection uses the same hierarchy as regular models:
```yaml
graph:
  rlm_bulk_matcher:
    template: "yaml:rlm_bulk_matcher"
    overrides:
      defaults:
        model: rlm/gpt-5-mini  # default for all nodes
      nodes:
        rlm_process:
          config:
            rlm_environment: docker  # isolated REPL
            max_iterations: 15
            max_timeout: 300
```

The `rlm/` prefix tells the model registry to wrap the base model (gpt-5-mini, claude-sonnet, gemini) with the RLM REPL layer. Any base model becomes RLM-capable.
Calling RLM: Direct Python API
Use this inside imperative graph nodes (like Commerce's `rlm_bulk_match_node`):
```python
from contextunity.router.modules.models import model_registry
from contextunity.router.modules.models.types import ModelRequest, TextPart

# Create RLM-wrapped model — "rlm/" prefix activates the REPL layer
model = model_registry.create_llm(
    "rlm/gpt-5-mini",  # any base model: gpt-5-mini, claude-sonnet, gemini-2.5-flash
    config=config,
    environment="docker",  # local | docker | modal | prime
    verbose=True,
)

# Key insight: data goes into custom_tools, NOT into the prompt.
# RLM stores these as Python variables the model can examine programmatically.
result = await model.generate(
    ModelRequest(
        system="You are a product matching expert. Write Python code to analyze and match products.",
        parts=[TextPart(text=matching_prompt)],
        temperature=0.3,
        max_output_tokens=50000,
    ),
    custom_tools={
        "supplier_products": supplier_list,   # 50k items — Python variable, not prompt text
        "site_products": site_list,           # 10k items — Python variable
        "taxonomies": taxonomy_dict,          # optional context
        "out_path": "/tmp/rlm_matches.json",  # output file for structured results
    },
)
# result.text contains the RLM's final output
# result.usage contains token consumption across all recursive calls
```

Calling RLM: YAML Template Node
Use `router_rlm_process` as a platform tool in a YAML template:
```yaml
name: bulk_classifier
version: "1.0"
description: >
  Classify 1000+ products against a deep taxonomy tree using RLM.

defaults:
  model: rlm/gemini-2.5-flash  # RLM wraps Gemini for recursive classification

nodes:
  - name: fetch_products
    type: federated
    tool_binding: export_products_for_classification

  - name: classify
    type: platform
    tool_binding: router_rlm_process
    model: rlm/gpt-5-mini  # per-node override if needed
    config:
      rlm_environment: docker
      max_iterations: 15
      max_timeout: 300
      task_type: classification  # hint for prompt assembly

  - name: write_results
    type: federated
    tool_binding: update_classified_products

edges:
  - from: __start__
    to: fetch_products
  - from: fetch_products
    to: classify
  - from: classify
    to: write_results
  - from: write_results
    to: __end__
```

Calling RLM: Multi-Graph Manifest
Register a dedicated RLM graph for the consumer project:
```yaml
router:
  default_graph: retrieval_augmented

  graph:
    rlm_bulk_matcher:
      template: "yaml:rlm_bulk_matcher"
      overrides:
        defaults:
          model: rlm/gpt-5-mini
        nodes:
          rlm_process:
            config:
              rlm_environment: docker
              max_iterations: 15
      federated_tools:
        export_unmatched_products:
          handler: "commerce.matcher.export_unmatched"
        export_site_products:
          handler: "commerce.matcher.export_site"
        export_taxonomies:
          handler: "commerce.matcher.export_taxonomies"
        bulk_link_products:
          handler: "commerce.matcher.bulk_link"
```

Then trigger from the UI or API:

```python
# Consumer-side: triggers the registered RLM graph
await client.execute_agent(
    graph="rlm_bulk_matcher",
    payload={"dealer_code": "acme", "target_brand": "SALOMON"},
)
```

Service Integration Tools
| Binding | Service | Capability |
|---|---|---|
brain_search | Brain | Vector search via Brain gRPC |
brain_memory_read / brain_memory_write | Brain | Conversation memory |
shield_scan | Shield | Input scanning for prompt injection |
worker_start_workflow / worker_execute_code | Worker | Durable workflow execution / sandbox code |
zero_scan_pii | Zero | PII detection and redaction |
language_tool_check | Language | Grammar / spelling verification |
router_web_search | Router | Web search grounding |
Model Resolution
Models are resolved through a 3-level hierarchy — the first non-empty value wins:
```
Per-node model → Graph defaults.model → Router CU_ROUTER_DEFAULT_LLM
```

Level 1: Per-node model
Override the model for a specific node:
```yaml
nodes:
  - name: classify
    type: platform
    tool_binding: router_classify
    model: vertex/gemini-2.5-flash  # ← this node uses Gemini
```

Level 2: Graph defaults.model
Default model for all nodes in the template that don’t specify their own:
```yaml
defaults:
  model: openai/gpt-4o  # ← default for all nodes
  temperature: 0.2
```

Level 3: Router default
If neither the node nor the template specifies a model, the Router’s global default is used:
```shell
export CU_ROUTER_DEFAULT_LLM="openai/gpt-5-mini"
```

Practical example: Commerce multi-graph
```yaml
graph:
  gardener:
    template: "yaml:gardener"
    overrides:
      defaults:
        model: vertex/gemini-2.5-flash  # fast + cheap for classification

  enricher:
    template: "yaml:enricher"
    overrides:
      defaults:
        model: openai/gpt-4o  # quality for content generation
      nodes:
        review:
          model: anthropic/claude-sonnet-4  # Claude is strong at review

  rlm_bulk_matcher:
    template: "yaml:rlm_bulk_matcher"
    overrides:
      defaults:
        model: rlm/gpt-5-mini  # RLM wraps gpt-5-mini with REPL
```

Each graph selects the best model for its workload. Within a graph, individual nodes can override further. The consumer never touches Router configuration — everything is in the manifest.
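The first-non-empty rule can be sketched in a few lines of Python. This is an illustration only: the function name and signature are invented here, and the real resolver reads `CU_ROUTER_DEFAULT_LLM` from the environment rather than taking it as a parameter.

```python
def resolve_model(node_model=None, graph_default=None,
                  router_default="openai/gpt-5-mini"):
    """First non-empty value wins, per the 3-level hierarchy above."""
    for candidate in (node_model, graph_default, router_default):
        if candidate:
            return candidate
    raise ValueError("no model configured at any level")

print(resolve_model("vertex/gemini-2.5-flash", "openai/gpt-4o"))  # → vertex/gemini-2.5-flash
print(resolve_model(None, "openai/gpt-4o"))                       # → openai/gpt-4o
print(resolve_model())                                            # → openai/gpt-5-mini
```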
Secret resolution
Models that need API keys follow the same 3-level pattern with `model_secret_ref`:
```yaml
nodes:
  - name: process
    type: llm
    model: openai/gpt-4o
    model_secret_ref: CU_ROUTER_OPENAI_KEY  # env var holding the API key
```

When Shield is available, secrets are resolved via `PutSecret`/`GetSecret`. When unavailable, they fall back to the project's `os.environ`. See the Security documentation for details.
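A minimal sketch of the fallback order, assuming a hypothetical `shield_client` handle with a `get_secret()` method (the real Shield service speaks `PutSecret`/`GetSecret` over gRPC, and its client interface may differ):

```python
import os

def resolve_secret(secret_ref: str, shield_client=None):
    """Resolve an API key by reference: Shield first, then os.environ."""
    if shield_client is not None:
        value = shield_client.get_secret(secret_ref)  # hypothetical method
        if value:
            return value
    return os.environ.get(secret_ref)  # project-local fallback

os.environ["CU_ROUTER_OPENAI_KEY"] = "sk-demo"
print(resolve_secret("CU_ROUTER_OPENAI_KEY"))  # → sk-demo (env fallback)
```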
Writing a Template
A template is a YAML file in `cortex/templates/`:
```yaml
name: my_pipeline
version: "1.0"
description: >
  My custom pipeline description.

defaults:
  model: openai/gpt-4o
  temperature: 0.2

nodes:
  - name: extract
    type: platform
    tool_binding: router_extract_query
    config:
      output_mode: direct

  - name: process
    type: platform
    tool_binding: router_classify
    config:
      output_mode: direct
      taxonomy_key: my_taxonomy
      response_format: json

  - name: export
    type: federated
    tool_binding: my_export_tool
    config:
      output_mode: direct

edges:
  - from: __start__
    to: extract
  - from: extract
    to: process
  - from: process
    to: export
  - from: export
    to: __end__

config:
  max_retries: 2
  timeout: 120
```

Conditional Edges
Use `condition_key` and `condition_map` for routing:
```yaml
edges:
  - from: detect_intent
    condition_key: intent_route
    condition_map:
      retrieve: retrieve
      sql_analytics: plan
      no_results: no_results
```

The node must set `state["intent_route"]` to one of the map keys. The compiler generates a LangGraph conditional edge function automatically.
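The generated routing function can be imagined roughly like this sketch (`make_condition_router` is an illustrative name, not the compiler's actual API):

```python
def make_condition_router(condition_key: str, condition_map: dict):
    """Return a routing function of the kind the compiler generates."""
    def route(state: dict) -> str:
        value = state.get(condition_key)
        if value not in condition_map:
            raise ValueError(
                f"node set {condition_key}={value!r}, "
                f"expected one of {sorted(condition_map)}"
            )
        return condition_map[value]  # name of the next node
    return route

route = make_condition_router(
    "intent_route",
    {"retrieve": "retrieve", "sql_analytics": "plan", "no_results": "no_results"},
)
print(route({"intent_route": "sql_analytics"}))  # → plan
```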
Consumer Overrides (Single Graph)
When a consumer project connects via `contextunity.project.yaml` with a single graph, it can override template config values:
```yaml
router:
  graph:
    template: "yaml:retrieval_augmented"
    overrides:
      defaults:
        model: "anthropic/claude-sonnet-4"
      nodes:
        detect_intent:
          config:
            temperature: 0.1
```

Overrides are merged at compile time. The template's frozen config schemas validate the final merged result — unknown fields are rejected.
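Compile-time merging can be sketched as a recursive dict merge. This is illustrative only: the real compiler additionally runs the merged result through the frozen Pydantic schemas, which is where unknown fields are rejected.

```python
def deep_merge(base: dict, overrides: dict) -> dict:
    """Recursively layer override values onto template defaults."""
    merged = dict(base)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)  # descend into nested config
        else:
            merged[key] = value  # scalar or new key: override wins
    return merged

template = {"defaults": {"model": "openai/gpt-4o", "temperature": 0.2}}
overrides = {"defaults": {"model": "anthropic/claude-sonnet-4"}}
print(deep_merge(template, overrides))
# {'defaults': {'model': 'anthropic/claude-sonnet-4', 'temperature': 0.2}}
```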
Multi-Graph Manifest
Projects requiring multiple entry points (e.g., commerce with gardener, enricher, matcher, writer) declare multiple entries in the graph map:
```yaml
router:
  default_graph: retrieval_augmented

  graph:
    gardener:
      template: "yaml:gardener"
      overrides:
        defaults:
          model: "vertex/gemini-2.5-flash"
      federated_tools:
        export_products_for_normalization:
          handler: "commerce.gardener.export_products"
        run_deterministic_pass:
          handler: "commerce.gardener.deterministic_normalize"
        update_normalized_products:
          handler: "commerce.gardener.update_products"

    enricher:
      template: "yaml:enricher"
      federated_tools:
        export_product_for_enrichment:
          handler: "commerce.enricher.export_context"

    retrieval_augmented:
      template: "yaml:retrieval_augmented"
      federated_tools:
        search_catalog:
          handler: "commerce.search.catalog_search"
```

Key properties:
| Property | Meaning |
|---|---|
default_graph | Graph used by StreamDispatcher() when caller doesn’t specify |
graph.<name>.template | YAML template to compile |
graph.<name>.overrides | Per-graph config overrides (merged at compile time) |
graph.<name>.federated_tools | Tools scoped to this graph ONLY — not shared across graphs |
How it works:
- At `RegisterTools`, Router loads each graph's template and validates that all `type: federated` tool bindings have matching handler declarations — fail-fast on mismatch
- Each graph is compiled independently with its own overrides
- Consumer calls `ExecuteAgent(graph="gardener")` or `ExecuteAgent(graph="enricher")` — Router resolves the pre-compiled graph by `(tenant_id, graph_name)`
- `StreamDispatcher()` uses the `default_graph`
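The fail-fast handler check at `RegisterTools` time can be sketched as follows (function and field names here are illustrative, not the actual Router internals):

```python
def validate_federated_bindings(template_nodes, federated_tools):
    """Fail fast if a template's federated nodes lack manifest handlers."""
    required = {n["tool_binding"] for n in template_nodes if n["type"] == "federated"}
    missing = required - set(federated_tools)
    if missing:
        raise ValueError(f"federated bindings without handlers: {sorted(missing)}")

nodes = [
    {"name": "fetch_products", "type": "federated", "tool_binding": "export_products"},
    {"name": "classify", "type": "platform", "tool_binding": "router_classify"},
]
# Passes: every federated binding has a declared handler
validate_federated_bindings(
    nodes, {"export_products": "commerce.gardener.export_products"}
)
```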
Tool isolation: Federated tools in the gardener graph are NOT available in the enricher graph. This prevents accidental cross-graph tool leakage and ensures each graph’s security boundary is self-contained.
Security Model
SecureNode Wrapping
Every compiled node is wrapped via make_secure_node():
- Token extraction from state
- Scope verification against the tool's `required_scopes`
- Token attenuation — each node receives a scoped-down token with only the permissions it needs
- Execution — the node runs with attenuated permissions
- Error boundary — exceptions are caught and wrapped in typed errors
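Conceptually, the wrapper looks something like this sketch. The state keys, the token shape, the attenuation format, and the error types are assumptions for illustration, not the real `make_secure_node()` implementation (which wraps errors in its own typed exceptions).

```python
import asyncio

def make_secure_node(node_fn, required_scopes: set):
    """Wrap a node function with the five steps listed above."""
    async def secure_node(state: dict) -> dict:
        token = state.get("token")                                 # 1. extraction
        if token is None or not required_scopes <= set(token["scopes"]):
            raise PermissionError("missing required scopes")       # 2. verification
        attenuated = {**token, "scopes": sorted(required_scopes)}  # 3. attenuation
        try:
            return await node_fn({**state, "token": attenuated})   # 4. execution
        except Exception as exc:                                   # 5. error boundary
            raise RuntimeError(f"node failed: {exc}") from exc
    return secure_node

async def classify_node(state):
    # The node only ever sees the attenuated token
    return {"scopes": state["token"]["scopes"]}

secure = make_secure_node(classify_node, {"router:execute"})
out = asyncio.run(secure({"token": {"scopes": ["router:execute", "brain:search"]}}))
print(out)  # → {'scopes': ['router:execute']}
```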
Worker Callbacks and ExecuteNode (Isolated Execution)
Worker processes (such as Temporal workflows executing async batch operations) sometimes need to temporarily suspend execution, send a specific node configuration back to Router for contextual LLM processing, and wait for its completion.
This is handled via the gRPC endpoint ExecuteNode.
To ensure callers cannot blindly execute any node they want out of context, the ExecuteNode architecture enforces an explicit allow-list:
- Manifest Exposure: A graph manifest must explicitly define which nodes are safe for standalone invocation by listing them in the `router_callbacks` list at the graph root level:

  ```yaml
  router:
    graph:
      gardener:
        template: yaml:gardener
        router_callbacks: ["classify"]  # Only the 'classify' node can be called directly
  ```

- Access Security: The client must supply an attenuated token bearing the `router:execute_node` scope and a valid `tenant_id` binding.
Config Immutability
Template configs are validated through Pydantic models with:
- `frozen=True` — prevents runtime mutation
- `extra="forbid"` — rejects unknown fields (blocks injection attacks)
- Bounded fields (`ge`, `le`) — prevent resource exhaustion
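A minimal demonstration of these guarantees, assuming Pydantic v2 (`ClassifyConfig` and its fields are illustrative, not a real platform tool schema):

```python
from pydantic import BaseModel, ConfigDict, Field, ValidationError

class ClassifyConfig(BaseModel):
    """Illustrative tool config with the hardening described above."""
    model_config = ConfigDict(frozen=True, extra="forbid")

    taxonomy_key: str
    confidence_threshold: float = Field(default=0.7, ge=0.0, le=1.0)  # bounded

cfg = ClassifyConfig(taxonomy_key="products")

try:
    cfg.taxonomy_key = "injected"          # frozen=True: runtime mutation blocked
except ValidationError:
    print("mutation blocked")

try:
    ClassifyConfig(taxonomy_key="products", evil_field="payload")
except ValidationError:                    # extra="forbid": unknown fields rejected
    print("injection blocked")

try:
    ClassifyConfig(taxonomy_key="products", confidence_threshold=99.0)
except ValidationError:                    # le=1.0: bound enforced
    print("bound enforced")
```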
Compilation Safety
The compiler enforces:
- Cycle detection — DAG validation prevents infinite loops
- Phantom node rejection — edges cannot reference undefined nodes
- Type validation — only `platform`, `federated`, and `llm` node types are accepted
- Prefix validation — tool bindings must start with a known service prefix
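The phantom-node and cycle checks can be sketched with Kahn's algorithm. This is an illustration of the technique, not the contents of the compiler's actual `validation.py`; the edge representation is assumed to be `(from, to)` pairs with `__start__`/`__end__` sentinels.

```python
from collections import defaultdict, deque

def validate_dag(nodes: set, edges: list) -> None:
    """Reject edges to undefined nodes, then detect cycles via Kahn's algorithm."""
    known = nodes | {"__start__", "__end__"}
    for src, dst in edges:
        if src not in known or dst not in known:
            raise ValueError(f"edge references undefined node: {src} -> {dst}")

    indegree = defaultdict(int)
    out = defaultdict(list)
    for src, dst in edges:
        out[src].append(dst)
        indegree[dst] += 1

    queue = deque(n for n in known if indegree[n] == 0)
    visited = 0
    while queue:
        n = queue.popleft()
        visited += 1
        for m in out[n]:
            indegree[m] -= 1
            if indegree[m] == 0:
                queue.append(m)
    # If topological ordering could not consume every node, a cycle exists
    if visited < len(known):
        raise ValueError("cycle detected in graph template")

validate_dag({"a", "b"}, [("__start__", "a"), ("a", "b"), ("b", "__end__")])  # ok
```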
File Layout
```
cortex/graphs/compiler/
├── builder.py             # build_from_template(), build_custom_graph()
├── template_loader.py     # YAML parsing, Pydantic validation
├── platform_registry.py   # PlatformToolRegistry singleton
├── validation.py          # DAG validation, cycle detection
└── platform_tools/        # 22 individual modules (one per capability)
    ├── __init__.py        # register_all_platform_tools() — wires 29 bindings
    ├── extract.py         # router_extract_query
    ├── intent.py          # router_detect_intent
    ├── retrieve.py        # router_retrieve
    ├── ground.py          # router_ground
    ├── generate.py        # router_generate
    ├── reflect.py         # router_reflect
    ├── suggest.py         # router_suggest
    ├── no_results.py      # router_no_results
    ├── synthesizer.py     # router_sql_plan
    ├── visualize.py       # router_sql_visualize
    ├── formatter.py       # router_sql_format
    ├── memory.py          # router_memory_*
    ├── content.py         # 7 universal content tools
    ├── rlm.py             # router_rlm_process
    ├── language.py        # language_tool_check
    ├── brain.py           # brain_search, brain_memory_*
    ├── shield.py          # shield_scan
    ├── worker.py          # worker_start_workflow, worker_execute_code
    ├── zero.py            # zero_scan_pii
    └── web_search.py      # router_web_search

cortex/templates/
├── retrieval_augmented.yaml
├── gardener.yaml
├── enricher.yaml
├── rlm_bulk_matcher.yaml
└── news_pipeline.yaml
```