
Graph Compiler & Templates

The Graph Compiler is the engine that transforms declarative YAML templates into executable LangGraph state machines. Instead of writing imperative Python code to wire graph nodes, you describe the pipeline topology in YAML — the compiler handles node creation, edge wiring, security enforcement, and configuration merging.

Why Templates?

Graphs are configuration, not code. Each node does one of three things:

| Node Type | What it does | Example |
| --- | --- | --- |
| platform | Calls an internal Router capability (LLM, retrieval, classification) | router_classify, router_generate |
| federated | Executes a tool on the consumer project side via BiDi gRPC | export_products, store_news_results |
| llm | Direct LLM invocation with prompt + model config | Custom classification prompt |

By treating graphs as YAML configurations, we gain:

  • Reproducibility — same template produces identical graph topology
  • Override safety — consumer projects can tune config values without touching node code
  • Security by construction — frozen=True + extra="forbid" Pydantic configs block runtime injection
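A minimal sketch of such a config model (class and field names here are illustrative, not the actual platform schemas):

```python
from pydantic import BaseModel, ConfigDict, Field

class ClassifyConfig(BaseModel):
    """Illustrative node config: immutable and closed to unknown fields."""
    model_config = ConfigDict(frozen=True, extra="forbid")

    taxonomy_key: str
    confidence_threshold: float = Field(default=0.7, ge=0.0, le=1.0)  # bounded field

cfg = ClassifyConfig(taxonomy_key="products")
# cfg.taxonomy_key = "other"                 -> ValidationError (frozen)
# ClassifyConfig(taxonomy_key="p", x="inj")  -> ValidationError (extra forbidden)
```

Any unknown key in a consumer override hits extra="forbid" at compile time instead of silently flowing into a node.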

Built-in Templates

retrieval_augmented (v1.0)

The universal memory-augmented generation pipeline. Handles both semantic vector retrieval (RAG) and exact SQL analytics through pluggable data sources.

__start__ → extract_query → detect_intent
├── retrieve → suggest ──┐
├── ground ──────────────→ reflect → __end__
├── generate ────────────→ reflect → __end__
├── plan → execute_sql → verify → visualize → reflect
└── no_results → __end__

Key nodes:

  • router_extract_query — initialises state, extracts user query
  • router_detect_intent — LLM-based intent classification with taxonomy enrichment
  • router_retrieve — full RAG pipeline (vector search + reranking + graph facts)
  • router_generate — multi-source prompt assembly with citation formatting
  • router_sql_plan / router_sql_execute / router_sql_verify / router_sql_visualize — SQL analytics path

Usage: This is the default graph. In contextunity.project.yaml, set router.graph.main.template: "rag_retrieval" or "sql_analytics" — both resolve to this template with appropriate defaults.

gardener (v1.0)

Product normalisation pipeline composing classification with federated domain tools.

__start__ → fetch_products → deterministic_pass → classify → write_results → __end__
  • fetch_products (federated) — exports products from consumer project
  • deterministic_pass (federated) — runs domain-specific normalisation rules locally
  • classify (platform: router_classify) — LLM taxonomy classification
  • write_results (federated) — writes normalised products back

Each node can use a different model. For example, classify can use a fast cheap model while the graph default stays premium:

defaults:
  model: openai/gpt-4o  # graph default
nodes:
  - name: classify
    type: platform
    tool_binding: router_classify
    config:
      model: vertex/gemini-2.5-flash  # cheap, fast — good for classification

enricher (v1.0)

Product enrichment pipeline with content generation and quality review.

__start__ → prepare → enrich → review → __end__
  • prepare (federated) — exports product context from consumer
  • enrich (platform: router_generate_content) — LLM content generation
  • review (platform: router_review_content) — LLM quality review pass

rlm_bulk_matcher (v1.0)

RLM-based bulk product matching using Recursive Language Models for massive context (50k+ items).

__start__ → rlm_process → __end__
  • rlm_process (platform: router_rlm_process) — RLM REPL session for bulk matching
  • Federated tools handle brand iteration, product export, and result upload on the consumer side

See Massive Context Tools (RLM) below for details.

news_pipeline (v1.0)

News processing pipeline with filtering, planning, and generation.

__start__ → harvest → filter → plan → generate → store → __end__
  • harvest (federated) — fetches news from external sources
  • filter (platform: router_filter_content) — LLM content filtering
  • plan (platform: router_plan_content) — editorial planning
  • generate (platform: router_generate_content) — article generation
  • store (federated) — persists results to consumer project

Platform Tools

Platform tools are the atomic LLM capabilities that templates compose. Each tool:

  • Has a Pydantic config schema (frozen=True, extra="forbid")
  • Requires router:execute token scope
  • Wraps all exceptions in PlatformServiceError
  • Has zero domain imports — pure LLM capability

Universal Content Tools

| Binding | Capability | Key Config |
| --- | --- | --- |
| router_classify | Taxonomy / intent classification | taxonomy_key, confidence_threshold, response_format |
| router_generate_content | Structured content generation | language, max_tokens, response_format |
| router_review_content | Quality review + correction | strict_mode, language |
| router_filter_content | Content filtering / validation | criteria_key, pass_threshold |
| router_plan_content | Editorial / batch planning | strategy (editorial, chronological, priority), max_items |
| router_match_semantic | Semantic similarity matching | threshold, max_candidates |
| router_rlm_process | Massive context RLM execution (50k+ items) | rlm_model, rlm_environment, max_iterations, max_timeout |

RAG Pipeline Tools

| Binding | Capability |
| --- | --- |
| router_extract_query | Query extraction + state initialisation |
| router_detect_intent | Intent classification with taxonomy enrichment |
| router_retrieve | Full RAG pipeline (vector + rerank + graph) |
| router_ground | Google Search grounding |
| router_generate | RAG response generation with citations |
| router_reflect | Self-evaluation and quality scoring |
| router_suggest | Search suggestions |
| router_no_results | Empathetic no-results response |

SQL Analytics Tools

| Binding | Capability |
| --- | --- |
| router_sql_plan | SQL generation from natural language |
| router_sql_execute | SQL execution with safety limits |
| router_sql_verify | Result verification |
| router_sql_visualize | Output formatting (table, chart, markdown) |

Massive Context Tools (RLM)

Recursive Language Models are a task-agnostic inference paradigm that wraps any base LLM with a REPL environment. Instead of cramming 50k items into a context window, RLM stores them as Python variables and lets the model programmatically examine, filter, and recursively call itself.

router_rlm_process is the universal platform tool for this capability. It is not domain-specific — any project can use it for:

  • Product matching — 50k supplier products against 10k site products (how Commerce uses it today)
  • Taxonomy classification — 1000+ category trees navigated via code
  • Bulk deduplication — entity resolution across large datasets
  • Large-scale analysis — any task where context degradation kills regular LLMs

How Commerce uses RLM for matching:

The commerce matcher (rlm_bulk_matcher graph) runs a brand-by-brand BiDi loop:

  1. Fetch matchable brands via BiDi → consumer exports brand list
  2. For each brand: fetch supplier + site products via BiDi (small payload per brand)
  3. Call router_rlm_process with matching prompt + product data → RLM generates Python code to compare products, index by SKU/brand, and recursively resolve ambiguous cases
  4. Upload results via BiDi → consumer persists matches

The architecture separates concerns cleanly:

  • router_rlm_process (platform) — pure RLM execution, any task, any model
  • Commerce rlm_bulk_match_node (domain) — BiDi orchestration, brand iteration, taxonomy/manual-match injection, result upload. This is the domain logic that stays in the commerce extension.

RLM model selection uses the same hierarchy as regular models:

graph:
  rlm_bulk_matcher:
    template: "yaml:rlm_bulk_matcher"
    overrides:
      defaults:
        model: rlm/gpt-5-mini  # default for all nodes
      nodes:
        rlm_process:
          config:
            rlm_environment: docker  # isolated REPL
            max_iterations: 15
            max_timeout: 300

The rlm/ prefix tells the model registry to wrap the base model (gpt-5-mini, claude-sonnet, gemini) with the RLM REPL layer. Any base model becomes RLM-capable.

Calling RLM: Direct Python API

Use this inside imperative graph nodes (like Commerce’s rlm_bulk_match_node):

from contextunity.router.modules.models import model_registry
from contextunity.router.modules.models.types import ModelRequest, TextPart

# Create RLM-wrapped model — "rlm/" prefix activates the REPL layer
model = model_registry.create_llm(
    "rlm/gpt-5-mini",  # any base model: gpt-5-mini, claude-sonnet, gemini-2.5-flash
    config=config,
    environment="docker",  # local | docker | modal | prime
    verbose=True,
)

# Key insight: data goes into custom_tools, NOT into the prompt.
# RLM stores these as Python variables the model can examine programmatically.
result = await model.generate(
    ModelRequest(
        system="You are a product matching expert. Write Python code to analyze and match products.",
        parts=[TextPart(text=matching_prompt)],
        temperature=0.3,
        max_output_tokens=50000,
    ),
    custom_tools={
        "supplier_products": supplier_list,  # 50k items — Python variable, not prompt text
        "site_products": site_list,  # 10k items — Python variable
        "taxonomies": taxonomy_dict,  # optional context
        "out_path": "/tmp/rlm_matches.json",  # output file for structured results
    },
)

# result.text contains the RLM's final output
# result.usage contains token consumption across all recursive calls

Calling RLM: YAML Template Node

Use router_rlm_process as a platform tool in a YAML template:

cortex/templates/bulk_classifier.yaml
name: bulk_classifier
version: "1.0"
description: >
  Classify 1000+ products against a deep taxonomy tree using RLM.
defaults:
  model: rlm/gemini-2.5-flash  # RLM wraps Gemini for recursive classification
nodes:
  - name: fetch_products
    type: federated
    tool_binding: export_products_for_classification
  - name: classify
    type: platform
    tool_binding: router_rlm_process
    model: rlm/gpt-5-mini  # per-node override if needed
    config:
      rlm_environment: docker
      max_iterations: 15
      max_timeout: 300
      task_type: classification  # hint for prompt assembly
  - name: write_results
    type: federated
    tool_binding: update_classified_products
edges:
  - from: __start__
    to: fetch_products
  - from: fetch_products
    to: classify
  - from: classify
    to: write_results
  - from: write_results
    to: __end__

Calling RLM: Multi-Graph Manifest

Register a dedicated RLM graph for the consumer project:

contextunity.project.yaml
router:
  default_graph: retrieval_augmented
  graph:
    rlm_bulk_matcher:
      template: "yaml:rlm_bulk_matcher"
      overrides:
        defaults:
          model: rlm/gpt-5-mini
        nodes:
          rlm_process:
            config:
              rlm_environment: docker
              max_iterations: 15
      federated_tools:
        export_unmatched_products:
          handler: "commerce.matcher.export_unmatched"
        export_site_products:
          handler: "commerce.matcher.export_site"
        export_taxonomies:
          handler: "commerce.matcher.export_taxonomies"
        bulk_link_products:
          handler: "commerce.matcher.bulk_link"

Then trigger from the UI or API:

# Consumer-side: triggers the registered RLM graph
await client.execute_agent(
    graph="rlm_bulk_matcher",
    payload={"dealer_code": "acme", "target_brand": "SALOMON"},
)

Service Integration Tools

| Binding | Service | Capability |
| --- | --- | --- |
| brain_search | Brain | Vector search via Brain gRPC |
| brain_memory_read / brain_memory_write | Brain | Conversation memory |
| shield_scan | Shield | Input scanning for prompt injection |
| worker_start_workflow / worker_execute_code | Worker | Durable workflow execution / sandboxed code execution |
| zero_scan_pii | Zero | PII detection and redaction |
| language_tool_check | Language | Grammar / spelling verification |
| router_web_search | Router | Web search grounding |

Model Resolution

Models are resolved through a 3-level hierarchy — the first non-empty value wins:

Per-node model → Graph defaults.model → Router CU_ROUTER_DEFAULT_LLM

Level 1: Per-node model

Override the model for a specific node:

nodes:
  - name: classify
    type: platform
    tool_binding: router_classify
    model: vertex/gemini-2.5-flash  # ← this node uses Gemini

Level 2: Graph defaults.model

Default model for all nodes in the template that don’t specify their own:

defaults:
  model: openai/gpt-4o  # ← default for all nodes
  temperature: 0.2

Level 3: Router default

If neither the node nor the template specifies a model, the Router’s global default is used:

export CU_ROUTER_DEFAULT_LLM="openai/gpt-5-mini"

Practical example: Commerce multi-graph

graph:
  gardener:
    template: "yaml:gardener"
    overrides:
      defaults:
        model: vertex/gemini-2.5-flash  # fast + cheap for classification
  enricher:
    template: "yaml:enricher"
    overrides:
      defaults:
        model: openai/gpt-4o  # quality for content generation
      nodes:
        review:
          model: anthropic/claude-sonnet-4  # Claude is strong at review
  rlm_bulk_matcher:
    template: "yaml:rlm_bulk_matcher"
    overrides:
      defaults:
        model: rlm/gpt-5-mini  # RLM wraps gpt-5-mini with REPL

Each graph selects the best model for its workload. Within a graph, individual nodes can override further. The consumer never touches Router configuration — everything is in the manifest.

Secret resolution

Models that need API keys follow the same 3-level pattern with model_secret_ref:

nodes:
  - name: process
    type: llm
    model: openai/gpt-4o
    model_secret_ref: CU_ROUTER_OPENAI_KEY  # env var holding the API key

When Shield is available, secrets are resolved via PutSecret/GetSecret. When unavailable, they fall back to the project’s os.environ. See the Security documentation for details.

Writing a Template

A template is a YAML file in cortex/templates/:

name: my_pipeline
version: "1.0"
description: >
  My custom pipeline description.
defaults:
  model: openai/gpt-4o
  temperature: 0.2
nodes:
  - name: extract
    type: platform
    tool_binding: router_extract_query
    config:
      output_mode: direct
  - name: process
    type: platform
    tool_binding: router_classify
    config:
      output_mode: direct
      taxonomy_key: my_taxonomy
      response_format: json
  - name: export
    type: federated
    tool_binding: my_export_tool
    config:
      output_mode: direct
edges:
  - from: __start__
    to: extract
  - from: extract
    to: process
  - from: process
    to: export
  - from: export
    to: __end__
config:
  max_retries: 2
  timeout: 120

Conditional Edges

Use condition_key and condition_map for routing:

edges:
  - from: detect_intent
    condition_key: intent_route
    condition_map:
      retrieve: retrieve
      sql_analytics: plan
      no_results: no_results

The node must set state["intent_route"] to one of the map keys. The compiler generates a LangGraph conditional edge function automatically.
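The generated routing function is conceptually just a lookup from state into the map. A sketch (the compiler's actual code generation may differ):

```python
def make_condition_router(condition_key: str, condition_map: dict[str, str]):
    """Return a routing function mapping state[condition_key] to a target node."""
    def route(state: dict) -> str:
        value = state.get(condition_key)
        if value not in condition_map:
            raise KeyError(f"{condition_key}={value!r} has no entry in condition_map")
        return condition_map[value]
    return route

route = make_condition_router(
    "intent_route",
    {"retrieve": "retrieve", "sql_analytics": "plan", "no_results": "no_results"},
)
route({"intent_route": "sql_analytics"})  # routes to the "plan" node
```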

Consumer Overrides (Single Graph)

When a consumer project connects via contextunity.project.yaml with a single graph, it can override template config values:

router:
  graph:
    template: "yaml:retrieval_augmented"
    overrides:
      defaults:
        model: "anthropic/claude-sonnet-4"
      nodes:
        detect_intent:
          config:
            temperature: 0.1

Overrides are merged at compile time. The template’s frozen config schemas validate the final merged result — unknown fields are rejected.
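The merge behaves like a recursive dictionary merge where override values win — a sketch, not the compiler's actual implementation:

```python
def deep_merge(base: dict, overrides: dict) -> dict:
    """Recursively merge overrides into base; override scalars replace base values."""
    merged = dict(base)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)  # descend into nested maps
        else:
            merged[key] = value  # override wins
    return merged

template = {"defaults": {"model": "openai/gpt-4o", "temperature": 0.2}}
overrides = {"defaults": {"model": "anthropic/claude-sonnet-4"}}
deep_merge(template, overrides)
# model is overridden, temperature is kept from the template
```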

Multi-Graph Manifest

Projects requiring multiple entry points (e.g., commerce with gardener, enricher, matcher, writer) declare multiple entries in the graph map:

router:
  default_graph: retrieval_augmented
  graph:
    gardener:
      template: "yaml:gardener"
      overrides:
        defaults:
          model: "vertex/gemini-2.5-flash"
      federated_tools:
        export_products_for_normalization:
          handler: "commerce.gardener.export_products"
        run_deterministic_pass:
          handler: "commerce.gardener.deterministic_normalize"
        update_normalized_products:
          handler: "commerce.gardener.update_products"
    enricher:
      template: "yaml:enricher"
      federated_tools:
        export_product_for_enrichment:
          handler: "commerce.enricher.export_context"
    retrieval_augmented:
      template: "yaml:retrieval_augmented"
      federated_tools:
        search_catalog:
          handler: "commerce.search.catalog_search"

Key properties:

| Property | Meaning |
| --- | --- |
| default_graph | Graph used by StreamDispatcher() when the caller doesn't specify |
| graph.<name>.template | YAML template to compile |
| graph.<name>.overrides | Per-graph config overrides (merged at compile time) |
| graph.<name>.federated_tools | Tools scoped to this graph ONLY — not shared across graphs |

How it works:

  1. At RegisterTools, Router loads each graph’s template and validates that all type: federated tool bindings have matching handler declarations — fail-fast on mismatch
  2. Each graph is compiled independently with its own overrides
  3. Consumer calls ExecuteAgent(graph="gardener") or ExecuteAgent(graph="enricher") — Router resolves the pre-compiled graph by (tenant_id, graph_name)
  4. StreamDispatcher() uses the default_graph

Tool isolation: Federated tools in the gardener graph are NOT available in the enricher graph. This prevents accidental cross-graph tool leakage and ensures each graph’s security boundary is self-contained.
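One way to picture the isolation: tools are registered under a (tenant_id, graph_name) key, so lookups can never cross a graph boundary. The registry shape and tenant name below are hypothetical; the handler paths follow the manifest style shown above.

```python
# Hypothetical per-graph tool registry — bindings never leak across graphs.
federated_tools: dict[tuple[str, str], dict[str, str]] = {
    ("acme", "gardener"): {
        "export_products_for_normalization": "commerce.gardener.export_products",
    },
    ("acme", "enricher"): {
        "export_product_for_enrichment": "commerce.enricher.export_context",
    },
}

def lookup_tool(tenant_id: str, graph_name: str, binding: str) -> str:
    """Resolve a tool binding inside one graph's namespace only."""
    tools = federated_tools.get((tenant_id, graph_name), {})
    if binding not in tools:
        raise KeyError(f"{binding!r} is not registered for graph {graph_name!r}")
    return tools[binding]

# A gardener binding resolves in gardener; the same binding from enricher raises.
lookup_tool("acme", "gardener", "export_products_for_normalization")
```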

Security Model

SecureNode Wrapping

Every compiled node is wrapped via make_secure_node():

  1. Token extraction from state
  2. Scope verification against the tool’s required_scopes
  3. Token attenuation — each node receives a scoped-down token with only the permissions it needs
  4. Execution — the node runs with attenuated permissions
  5. Error boundary — exceptions are caught and wrapped in typed errors
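The five steps can be sketched as a wrapper function. The real token and attenuation APIs differ; this synchronous version only illustrates the control flow:

```python
def make_secure_node(node_fn, required_scopes: set[str]):
    """Illustrative sketch of the five-step SecureNode wrapper."""
    def secure_node(state: dict) -> dict:
        token = state.get("token")                                 # 1. token extraction
        if token is None or not required_scopes <= set(token["scopes"]):
            raise PermissionError(f"missing scopes: {required_scopes}")  # 2. scope check
        attenuated = {**token, "scopes": sorted(required_scopes)}  # 3. attenuation
        try:
            return node_fn({**state, "token": attenuated})         # 4. execution
        except PermissionError:
            raise
        except Exception as exc:                                   # 5. error boundary
            raise RuntimeError(f"node execution failed: {exc}") from exc
    return secure_node

node = make_secure_node(lambda s: {**s, "done": True}, {"router:execute"})
out = node({"token": {"scopes": ["router:execute", "admin"]}})
# out["token"]["scopes"] now holds only the scope the node actually needs
```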

Worker Callbacks and ExecuteNode (Isolated Execution)

Worker processes (such as Temporal workflows executing async batch operations) sometimes need to temporarily suspend execution, send a specific node configuration back to Router for contextual LLM processing, and wait for its completion. This is handled via the gRPC endpoint ExecuteNode.

To ensure callers cannot blindly execute any node they want out of context, the ExecuteNode architecture enforces an explicit allow-list:

  1. Manifest Exposure: A graph manifest must explicitly define which nodes are safe for standalone invocation by listing them in router_callbacks at the graph root level:

     router:
       graph:
         gardener:
           template: yaml:gardener
           router_callbacks: ["classify"]  # only the 'classify' node can be called directly

  2. Access Security: The client must supply an attenuated token bearing the router:execute_node scope and a valid tenant_id binding.

Config Immutability

Template configs are validated through Pydantic models with:

  • frozen=True — prevents runtime mutation
  • extra="forbid" — rejects unknown fields (blocks injection attacks)
  • Bounded fields (ge, le) — prevents resource exhaustion

Compilation Safety

The compiler enforces:

  • Cycle detection — DAG validation prevents infinite loops
  • Phantom node rejection — edges cannot reference undefined nodes
  • Type validation — only platform, federated, and llm node types are accepted
  • Prefix validation — tool bindings must start with a known service prefix
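Cycle detection and phantom node rejection together amount to a depth-first search over the declared edges — an illustrative sketch, not the compiler source:

```python
def validate_edges(nodes: set[str], edges: list[tuple[str, str]]) -> None:
    """Reject edges to undefined nodes and any cycle in the graph."""
    known = nodes | {"__start__", "__end__"}
    adjacency: dict[str, list[str]] = {}
    for src, dst in edges:
        if src not in known or dst not in known:
            raise ValueError(f"edge references undefined node: {src} -> {dst}")
        adjacency.setdefault(src, []).append(dst)

    visiting: set[str] = set()  # nodes on the current DFS path
    done: set[str] = set()      # nodes fully explored

    def dfs(node: str) -> None:
        if node in done:
            return
        if node in visiting:
            raise ValueError(f"cycle detected at {node}")
        visiting.add(node)
        for nxt in adjacency.get(node, []):
            dfs(nxt)
        visiting.discard(node)
        done.add(node)

    for node in known:
        dfs(node)

# A valid DAG passes silently:
validate_edges({"a", "b"}, [("__start__", "a"), ("a", "b"), ("b", "__end__")])
```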

File Layout

cortex/graphs/compiler/
├── builder.py # build_from_template(), build_custom_graph()
├── template_loader.py # YAML parsing, Pydantic validation
├── platform_registry.py # PlatformToolRegistry singleton
├── validation.py # DAG validation, cycle detection
├── platform_tools/ # 22 individual modules (one per capability)
│ ├── __init__.py # register_all_platform_tools() — wires 29 bindings
│ ├── extract.py # router_extract_query
│ ├── intent.py # router_detect_intent
│ ├── retrieve.py # router_retrieve
│ ├── ground.py # router_ground
│ ├── generate.py # router_generate
│ ├── reflect.py # router_reflect
│ ├── suggest.py # router_suggest
│ ├── no_results.py # router_no_results
│ ├── synthesizer.py # router_sql_plan
│ ├── visualize.py # router_sql_visualize
│ ├── formatter.py # router_sql_format
│ ├── memory.py # router_memory_*
│ ├── content.py # 7 universal content tools
│ ├── rlm.py # router_rlm_process
│ ├── language.py # language_tool_check
│ ├── brain.py # brain_search, brain_memory_*
│ ├── shield.py # shield_scan
│ ├── worker.py # worker_start_workflow, worker_execute_code
│ ├── zero.py # zero_scan_pii
│ └── web_search.py # router_web_search
cortex/templates/
├── retrieval_augmented.yaml
├── gardener.yaml
├── enricher.yaml
├── rlm_bulk_matcher.yaml
└── news_pipeline.yaml