
Graph Compiler & Templates

The Graph Compiler is the engine that transforms declarative YAML templates into executable LangGraph state machines. Instead of writing imperative Python code to wire graph nodes, you describe the pipeline topology in YAML — the compiler handles node creation, edge wiring, security enforcement, and configuration merging.

Why Templates?

Graphs are configuration, not code. Each node does one of three things:

| Node Type | What it does | Example |
| --- | --- | --- |
| platform | Calls an internal Router capability (LLM, retrieval, classification) | router_classify, router_generate |
| federated | Executes a tool on the consumer project side via BiDi gRPC | export_products, store_news_results |
| llm | Direct LLM invocation with prompt + model config | Custom classification prompt |

By treating graphs as YAML configurations, we gain:

  • Reproducibility — same template produces identical graph topology
  • Override safety — consumer projects can tune config values without touching node code
  • Security by construction — frozen=True + extra="forbid" Pydantic configs block runtime injection
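A minimal sketch of such a config model (class and field names here are illustrative, not the actual platform schemas):

```python
from pydantic import BaseModel, ConfigDict, Field

class ClassifyConfig(BaseModel):
    """Illustrative node config: immutable and closed to unknown fields."""
    model_config = ConfigDict(frozen=True, extra="forbid")

    taxonomy_key: str
    confidence_threshold: float = Field(default=0.7, ge=0.0, le=1.0)  # bounded field

cfg = ClassifyConfig(taxonomy_key="products")
# cfg.taxonomy_key = "other"                 -> ValidationError (frozen)
# ClassifyConfig(taxonomy_key="p", x="inj")  -> ValidationError (extra forbidden)
```

Any unknown key in a consumer override hits extra="forbid" at compile time instead of silently flowing into a node.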

Built-in Templates

retrieval_augmented (v1.0)

The universal memory-augmented generation pipeline. Handles both semantic vector retrieval (RAG) and exact SQL analytics through pluggable data sources.

__start__ → extract_query → detect_intent
├── retrieve → suggest ──┐
├── ground ──────────────→ reflect → __end__
├── generate ────────────→ reflect → __end__
├── plan → execute_sql → verify → visualize → reflect
└── no_results → __end__

Key nodes:

  • router_extract_query — initialises state, extracts user query
  • router_detect_intent — LLM-based intent classification with taxonomy enrichment
  • router_retrieve — full RAG pipeline (vector search + reranking + graph facts)
  • router_generate — multi-source prompt assembly with citation formatting
  • router_sql_plan / router_sql_execute / router_sql_verify / router_sql_visualize — SQL analytics path

Usage: This is the default graph. In contextunity.project.yaml, set router.graph.main.template: "rag_retrieval" or "sql_analytics" — both resolve to this template with appropriate defaults.

gardener (v1.0)

Product normalisation pipeline composing classification with federated domain tools.

__start__ → fetch_products → deterministic_pass → classify → write_results → __end__
  • fetch_products (federated) — exports products from consumer project
  • deterministic_pass (federated) — runs domain-specific normalisation rules locally
  • classify (platform: router_classify) — LLM taxonomy classification
  • write_results (federated) — writes normalised products back

Each node can use a different model. For example, classify can use a fast cheap model while the graph default stays premium:

defaults:
  model: openai/gpt-4o  # graph default
nodes:
  - name: classify
    type: platform
    tool_binding: router_classify
    config:
      model: vertex/gemini-2.5-flash  # cheap, fast — good for classification

enricher (v1.0)

Product enrichment pipeline with content generation and quality review.

__start__ → prepare → enrich → review → __end__
  • prepare (federated) — exports product context from consumer
  • enrich (platform: router_generate_content) — LLM content generation
  • review (platform: router_review_content) — LLM quality review pass

rlm_bulk_matcher (v1.0)

RLM-based bulk product matching using Recursive Language Models for massive context (50k+ items).

__start__ → rlm_process → __end__
  • rlm_process (platform: router_rlm_process) — RLM REPL session for bulk matching
  • Federated tools handle brand iteration, product export, and result upload on the consumer side

See Massive Context Tools (RLM) below for details.

news_pipeline (v1.0)

News processing pipeline with filtering, planning, and generation.

__start__ → harvest → filter → plan → generate → store → __end__
  • harvest (federated) — fetches news from external sources
  • filter (platform: router_filter_content) — LLM content filtering
  • plan (platform: router_plan_content) — editorial planning
  • generate (platform: router_generate_content) — article generation
  • store (federated) — persists results to consumer project

Platform Tools

Platform tools are the atomic LLM capabilities that templates compose. Each tool:

  • Has a Pydantic config schema (frozen=True, extra="forbid")
  • Requires router:execute token scope
  • Wraps all exceptions in PlatformServiceError
  • Has zero domain imports — pure LLM capability

Universal Content Tools

| Binding | Capability | Key Config |
| --- | --- | --- |
| router_classify | Taxonomy / intent classification | taxonomy_key, confidence_threshold, response_format |
| router_generate_content | Structured content generation | language, max_tokens, response_format |
| router_review_content | Quality review + correction | strict_mode, language |
| router_filter_content | Content filtering / validation | criteria_key, pass_threshold |
| router_plan_content | Editorial / batch planning | strategy (editorial, chronological, priority), max_items |
| router_match_semantic | Semantic similarity matching | threshold, max_candidates |
| router_rlm_process | Massive context RLM execution (50k+ items) | rlm_model, rlm_environment, max_iterations, max_timeout |

RAG Pipeline Tools

| Binding | Capability |
| --- | --- |
| router_extract_query | Query extraction + state initialisation |
| router_detect_intent | Intent classification with taxonomy enrichment |
| router_retrieve | Full RAG pipeline (vector + rerank + graph) |
| router_ground | Google Search grounding |
| router_generate | RAG response generation with citations |
| router_reflect | Self-evaluation and quality scoring |
| router_suggest | Search suggestions |
| router_no_results | Empathetic no-results response |

SQL Analytics Tools

| Binding | Capability |
| --- | --- |
| router_sql_plan | SQL generation from natural language |
| router_sql_execute | SQL execution with safety limits |
| router_sql_verify | Result verification |
| router_sql_visualize | Output formatting (table, chart, markdown) |

Massive Context Tools (RLM)

Recursive Language Models are a task-agnostic inference paradigm that wraps any base LLM with a REPL environment. Instead of cramming 50k items into a context window, RLM stores them as Python variables and lets the model programmatically examine, filter, and recursively call itself.

router_rlm_process is the universal platform tool for this capability. It is not domain-specific — any project can use it for:

  • Product matching — 50k supplier products against 10k site products (how Commerce uses it today)
  • Taxonomy classification — 1000+ category trees navigated via code
  • Bulk deduplication — entity resolution across large datasets
  • Large-scale analysis — any task where context degradation kills regular LLMs

How Commerce uses RLM for matching:

The commerce matcher (rlm_bulk_matcher graph) runs a brand-by-brand BiDi loop:

  1. Fetch matchable brands via BiDi → consumer exports brand list
  2. For each brand: fetch supplier + site products via BiDi (small payload per brand)
  3. Call router_rlm_process with matching prompt + product data → RLM generates Python code to compare products, index by SKU/brand, and recursively resolve ambiguous cases
  4. Upload results via BiDi → consumer persists matches

The architecture separates concerns cleanly:

  • router_rlm_process (platform) — pure RLM execution, any task, any model
  • Commerce rlm_bulk_match_node (domain) — BiDi orchestration, brand iteration, taxonomy/manual-match injection, result upload. This is the domain logic that stays in the commerce extension.

RLM model selection uses the same hierarchy as regular models:

graph:
  rlm_bulk_matcher:
    template: "yaml:rlm_bulk_matcher"
    overrides:
      defaults:
        model: rlm/gpt-5-mini  # default for all nodes
      nodes:
        rlm_process:
          config:
            rlm_environment: docker  # isolated REPL
            max_iterations: 15
            max_timeout: 300

The rlm/ prefix tells the model registry to wrap the base model (gpt-5-mini, claude-sonnet, gemini) with the RLM REPL layer. Any base model becomes RLM-capable.

Calling RLM: Direct Python API

Use this inside imperative graph nodes (like Commerce’s rlm_bulk_match_node):

from contextunity.router.modules.models import model_registry
from contextunity.router.modules.models.types import ModelRequest, TextPart

# Create RLM-wrapped model — "rlm/" prefix activates the REPL layer
model = model_registry.create_llm(
    "rlm/gpt-5-mini",  # any base model: gpt-5-mini, claude-sonnet, gemini-2.5-flash
    config=config,
    environment="docker",  # local | docker | modal | prime
    verbose=True,
)

# Key insight: data goes into custom_tools, NOT into the prompt.
# RLM stores these as Python variables the model can examine programmatically.
result = await model.generate(
    ModelRequest(
        system="You are a product matching expert. Write Python code to analyze and match products.",
        parts=[TextPart(text=matching_prompt)],
        temperature=0.3,
        max_output_tokens=50000,
    ),
    custom_tools={
        "supplier_products": supplier_list,  # 50k items — Python variable, not prompt text
        "site_products": site_list,  # 10k items — Python variable
        "taxonomies": taxonomy_dict,  # optional context
        "out_path": "/tmp/rlm_matches.json",  # output file for structured results
    },
)

# result.text contains the RLM's final output
# result.usage contains token consumption across all recursive calls

Calling RLM: YAML Template Node

Use router_rlm_process as a platform tool in a YAML template:

cortex/templates/bulk_classifier.yaml
name: bulk_classifier
version: "1.0"
description: >
  Classify 1000+ products against a deep taxonomy tree using RLM.
defaults:
  model: rlm/gemini-2.5-flash  # RLM wraps Gemini for recursive classification
nodes:
  - name: fetch_products
    type: federated
    tool_binding: export_products_for_classification
  - name: classify
    type: platform
    tool_binding: router_rlm_process
    model: rlm/gpt-5-mini  # per-node override if needed
    config:
      rlm_environment: docker
      max_iterations: 15
      max_timeout: 300
      task_type: classification  # hint for prompt assembly
  - name: write_results
    type: federated
    tool_binding: update_classified_products
edges:
  - from: __start__
    to: fetch_products
  - from: fetch_products
    to: classify
  - from: classify
    to: write_results
  - from: write_results
    to: __end__

Calling RLM: Multi-Graph Manifest

Register a dedicated RLM graph for the consumer project:

contextunity.project.yaml
router:
  default_graph: retrieval_augmented
  graph:
    rlm_bulk_matcher:
      template: "yaml:rlm_bulk_matcher"
      overrides:
        defaults:
          model: rlm/gpt-5-mini
        nodes:
          rlm_process:
            config:
              rlm_environment: docker
              max_iterations: 15
      federated_tools:
        export_unmatched_products:
          handler: "commerce.matcher.export_unmatched"
        export_site_products:
          handler: "commerce.matcher.export_site"
        export_taxonomies:
          handler: "commerce.matcher.export_taxonomies"
        bulk_link_products:
          handler: "commerce.matcher.bulk_link"

Then trigger from the UI or API:

# Consumer-side: triggers the registered RLM graph
await client.execute_agent(
    graph="rlm_bulk_matcher",
    payload={"dealer_code": "acme", "target_brand": "SALOMON"},
)

Service Integration Tools

| Binding | Service | Capability |
| --- | --- | --- |
| brain_search | Brain | Vector search via Brain gRPC |
| brain_memory_read / brain_memory_write | Brain | Conversation memory |
| shield_scan | Shield | Input scanning for prompt injection |
| worker_start_workflow / worker_execute_code | Worker | Durable workflow execution / sandboxed code execution |
| zero_scan_pii | Zero | PII detection and redaction |
| language_tool_check | Language | Grammar / spelling verification |
| router_web_search | Router | Web search grounding |

Model Resolution

Models are resolved through a 3-level hierarchy — the first non-empty value wins:

Per-node model → Graph defaults.model → Router CU_ROUTER_DEFAULT_LLM

Level 1: Per-node model

Override the model for a specific node:

nodes:
  - name: classify
    type: platform
    tool_binding: router_classify
    model: vertex/gemini-2.5-flash  # ← this node uses Gemini

Level 2: Graph defaults.model

Default model for all nodes in the template that don’t specify their own:

defaults:
  model: openai/gpt-4o  # ← default for all nodes
  temperature: 0.2

Level 3: Router default

If neither the node nor the template specifies a model, the Router’s global default is used:

export CU_ROUTER_DEFAULT_LLM="openai/gpt-5-mini"

Practical example: Commerce multi-graph

graph:
  gardener:
    template: "yaml:gardener"
    overrides:
      defaults:
        model: vertex/gemini-2.5-flash  # fast + cheap for classification
  enricher:
    template: "yaml:enricher"
    overrides:
      defaults:
        model: openai/gpt-4o  # quality for content generation
      nodes:
        review:
          model: anthropic/claude-sonnet-4  # Claude is strong at review
  rlm_bulk_matcher:
    template: "yaml:rlm_bulk_matcher"
    overrides:
      defaults:
        model: rlm/gpt-5-mini  # RLM wraps gpt-5-mini with REPL

Each graph selects the best model for its workload. Within a graph, individual nodes can override further. The consumer never touches Router configuration — everything is in the manifest.

Secret resolution

Models that need API keys follow the same 3-level pattern with model_secret_ref:

nodes:
  - name: process
    type: llm
    model: openai/gpt-4o
    model_secret_ref: CU_ROUTER_OPENAI_KEY  # env var holding the API key

When Shield is available, secrets are resolved via PutSecret/GetSecret. When unavailable, they fall back to the project’s os.environ. See the Security documentation for details.

Writing a Template

A template is a YAML file in cortex/templates/:

name: my_pipeline
version: "1.0"
description: >
  My custom pipeline description.
defaults:
  model: openai/gpt-4o
  temperature: 0.2
nodes:
  - name: extract
    type: platform
    tool_binding: router_extract_query
    config:
      output_mode: direct
  - name: process
    type: platform
    tool_binding: router_classify
    config:
      output_mode: direct
      taxonomy_key: my_taxonomy
      response_format: json
  - name: export
    type: federated
    tool_binding: my_export_tool
    config:
      output_mode: direct
edges:
  - from: __start__
    to: extract
  - from: extract
    to: process
  - from: process
    to: export
  - from: export
    to: __end__
config:
  max_retries: 2
  timeout: 120

Conditional Edges

Use condition_key and condition_map for routing:

edges:
  - from: detect_intent
    condition_key: intent_route
    condition_map:
      retrieve: retrieve
      sql_analytics: plan
      no_results: no_results

The node must set state["intent_route"] to one of the map keys. The compiler generates a LangGraph conditional edge function automatically.
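The generated routing function is conceptually just a lookup from state into the map. A sketch (the compiler's actual code generation may differ):

```python
def make_condition_router(condition_key: str, condition_map: dict[str, str]):
    """Return a routing function mapping state[condition_key] to a target node."""
    def route(state: dict) -> str:
        value = state.get(condition_key)
        if value not in condition_map:
            raise KeyError(f"{condition_key}={value!r} has no entry in condition_map")
        return condition_map[value]
    return route

route = make_condition_router(
    "intent_route",
    {"retrieve": "retrieve", "sql_analytics": "plan", "no_results": "no_results"},
)
route({"intent_route": "sql_analytics"})  # routes to the "plan" node
```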

Consumer Overrides (Single Graph)

When a consumer project connects via contextunity.project.yaml with a single graph, it can override template config values:

router:
  graph:
    template: "yaml:retrieval_augmented"
    overrides:
      defaults:
        model: "anthropic/claude-sonnet-4"
      nodes:
        detect_intent:
          config:
            temperature: 0.1

Overrides are merged at compile time. The template’s frozen config schemas validate the final merged result — unknown fields are rejected.
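The merge behaves like a recursive dictionary merge where override values win — a sketch, not the compiler's actual implementation:

```python
def deep_merge(base: dict, overrides: dict) -> dict:
    """Recursively merge overrides into base; override scalars replace base values."""
    merged = dict(base)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)  # descend into nested maps
        else:
            merged[key] = value  # override wins
    return merged

template = {"defaults": {"model": "openai/gpt-4o", "temperature": 0.2}}
overrides = {"defaults": {"model": "anthropic/claude-sonnet-4"}}
deep_merge(template, overrides)
# model is overridden, temperature is kept from the template
```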

Multi-Graph Manifest

Projects requiring multiple entry points (e.g., commerce with gardener, enricher, matcher, writer) declare multiple entries in the graph map:

router:
  default_graph: retrieval_augmented
  graph:
    gardener:
      template: "yaml:gardener"
      overrides:
        defaults:
          model: "vertex/gemini-2.5-flash"
      federated_tools:
        export_products_for_normalization:
          handler: "commerce.gardener.export_products"
        run_deterministic_pass:
          handler: "commerce.gardener.deterministic_normalize"
        update_normalized_products:
          handler: "commerce.gardener.update_products"
    enricher:
      template: "yaml:enricher"
      federated_tools:
        export_product_for_enrichment:
          handler: "commerce.enricher.export_context"
    retrieval_augmented:
      template: "yaml:retrieval_augmented"
      federated_tools:
        search_catalog:
          handler: "commerce.search.catalog_search"

Key properties:

| Property | Meaning |
| --- | --- |
| default_graph | Graph used by StreamDispatcher() when the caller doesn't specify |
| graph.<name>.template | YAML template to compile |
| graph.<name>.overrides | Per-graph config overrides (merged at compile time) |
| graph.<name>.federated_tools | Tools scoped to this graph ONLY — not shared across graphs |

How it works:

  1. At RegisterTools, Router loads each graph’s template and validates that all type: federated tool bindings have matching handler declarations — fail-fast on mismatch
  2. Each graph is compiled independently with its own overrides
  3. Consumer calls ExecuteAgent(graph="gardener") or ExecuteAgent(graph="enricher") — Router resolves the pre-compiled graph by (tenant_id, graph_name)
  4. StreamDispatcher() uses the default_graph

Tool isolation: Federated tools in the gardener graph are NOT available in the enricher graph. This prevents accidental cross-graph tool leakage and ensures each graph’s security boundary is self-contained.
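One way to picture the isolation: tools are registered under a (tenant_id, graph_name) key, so lookups can never cross a graph boundary. The registry shape and tenant name below are hypothetical; the handler paths follow the manifest style shown above.

```python
# Hypothetical per-graph tool registry — bindings never leak across graphs.
federated_tools: dict[tuple[str, str], dict[str, str]] = {
    ("acme", "gardener"): {
        "export_products_for_normalization": "commerce.gardener.export_products",
    },
    ("acme", "enricher"): {
        "export_product_for_enrichment": "commerce.enricher.export_context",
    },
}

def lookup_tool(tenant_id: str, graph_name: str, binding: str) -> str:
    """Resolve a tool binding inside one graph's namespace only."""
    tools = federated_tools.get((tenant_id, graph_name), {})
    if binding not in tools:
        raise KeyError(f"{binding!r} is not registered for graph {graph_name!r}")
    return tools[binding]

# A gardener binding resolves in gardener; the same binding from enricher raises.
lookup_tool("acme", "gardener", "export_products_for_normalization")
```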

Security Model

SecureNode Wrapping

Every compiled node is wrapped via make_secure_node():

  1. Token extraction from state
  2. Scope verification against the tool’s required_scopes
  3. Token attenuation — each node receives a scoped-down token with only the permissions it needs
  4. Execution — the node runs with attenuated permissions
  5. Error boundary — exceptions are caught and wrapped in typed errors
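The five steps can be sketched as a wrapper function. The real token and attenuation APIs differ; this synchronous version only illustrates the control flow:

```python
def make_secure_node(node_fn, required_scopes: set[str]):
    """Illustrative sketch of the five-step SecureNode wrapper."""
    def secure_node(state: dict) -> dict:
        token = state.get("token")                                 # 1. token extraction
        if token is None or not required_scopes <= set(token["scopes"]):
            raise PermissionError(f"missing scopes: {required_scopes}")  # 2. scope check
        attenuated = {**token, "scopes": sorted(required_scopes)}  # 3. attenuation
        try:
            return node_fn({**state, "token": attenuated})         # 4. execution
        except PermissionError:
            raise
        except Exception as exc:                                   # 5. error boundary
            raise RuntimeError(f"node execution failed: {exc}") from exc
    return secure_node

node = make_secure_node(lambda s: {**s, "done": True}, {"router:execute"})
out = node({"token": {"scopes": ["router:execute", "admin"]}})
# out["token"]["scopes"] now holds only the scope the node actually needs
```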

Worker Callbacks and ExecuteNode (Isolated Execution)

Worker processes (such as Temporal workflows executing async batch operations) sometimes need to temporarily suspend execution, send a specific node configuration back to Router for contextual LLM processing, and wait for its completion. This is handled via the gRPC endpoint ExecuteNode.

To ensure callers cannot blindly execute any node they want out of context, the ExecuteNode architecture enforces an explicit allow-list:

  1. Manifest Exposure: A graph manifest must explicitly define which nodes are safe for standalone invocation by listing them in router_callbacks at the graph root level:

     router:
       graph:
         gardener:
           template: yaml:gardener
           router_callbacks: ["classify"]  # only the 'classify' node can be called directly

  2. Access Security: The client must supply an attenuated token bearing the router:execute_node scope and a valid tenant_id binding.

Config Immutability

Template configs are validated through Pydantic models with:

  • frozen=True — prevents runtime mutation
  • extra="forbid" — rejects unknown fields (blocks injection attacks)
  • Bounded fields (ge, le) — prevents resource exhaustion

Compilation Safety

The compiler enforces:

  • Cycle detection — DAG validation prevents infinite loops
  • Phantom node rejection — edges cannot reference undefined nodes
  • Type validation — only platform, federated, and llm node types are accepted
  • Prefix validation — tool bindings must start with a known service prefix
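Cycle detection and phantom node rejection together amount to a depth-first search over the declared edges — an illustrative sketch, not the compiler source:

```python
def validate_edges(nodes: set[str], edges: list[tuple[str, str]]) -> None:
    """Reject edges to undefined nodes and any cycle in the graph."""
    known = nodes | {"__start__", "__end__"}
    adjacency: dict[str, list[str]] = {}
    for src, dst in edges:
        if src not in known or dst not in known:
            raise ValueError(f"edge references undefined node: {src} -> {dst}")
        adjacency.setdefault(src, []).append(dst)

    visiting: set[str] = set()  # nodes on the current DFS path
    done: set[str] = set()      # nodes fully explored

    def dfs(node: str) -> None:
        if node in done:
            return
        if node in visiting:
            raise ValueError(f"cycle detected at {node}")
        visiting.add(node)
        for nxt in adjacency.get(node, []):
            dfs(nxt)
        visiting.discard(node)
        done.add(node)

    for node in known:
        dfs(node)

# A valid DAG passes silently:
validate_edges({"a", "b"}, [("__start__", "a"), ("a", "b"), ("b", "__end__")])
```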

File Layout

cortex/graphs/compiler/
├── builder.py # build_from_template(), build_custom_graph()
├── template_loader.py # YAML parsing, Pydantic validation
├── platform_registry.py # PlatformToolRegistry singleton
├── validation.py # DAG validation, cycle detection
├── platform_tools/ # 22 individual modules (one per capability)
│ ├── __init__.py # register_all_platform_tools() — wires 29 bindings
│ ├── extract.py # router_extract_query
│ ├── intent.py # router_detect_intent
│ ├── retrieve.py # router_retrieve
│ ├── ground.py # router_ground
│ ├── generate.py # router_generate
│ ├── reflect.py # router_reflect
│ ├── suggest.py # router_suggest
│ ├── no_results.py # router_no_results
│ ├── synthesizer.py # router_sql_plan
│ ├── visualize.py # router_sql_visualize
│ ├── formatter.py # router_sql_format
│ ├── memory.py # router_memory_*
│ ├── content.py # 7 universal content tools
│ ├── rlm.py # router_rlm_process
│ ├── language.py # language_tool_check
│ ├── brain.py # brain_search, brain_memory_*
│ ├── shield.py # shield_scan
│ ├── worker.py # worker_start_workflow, worker_execute_code
│ ├── zero.py # zero_scan_pii
│ └── web_search.py # router_web_search
cortex/templates/
├── retrieval_augmented.yaml
├── gardener.yaml
├── enricher.yaml
├── rlm_bulk_matcher.yaml
└── news_pipeline.yaml