Cortex Pipeline

Cortex is ContextRouter’s AI orchestration layer. It holds all graph logic (RAG, SQL analytics, commerce, and news processing) and uses the Graph Compiler to turn declarative YAML templates into executable LangGraph state machines.

How Graphs Work

Every graph in Cortex follows the same lifecycle:

1. Template loading: the YAML is parsed and validated against the Pydantic TemplateDefinition model.
2. Override merging: overrides from the consumer project are merged over the template defaults.
3. Node compilation: each node is dispatched through PlatformToolRegistry according to its tool_binding.
4. Security wrapping: every node is wrapped by make_secure_node(), which enforces token scopes.
5. Execution: the compiled StateGraph passes messages through the node pipeline.

YAML Template → Template Loader → Override Merge → Node Compilation → StateGraph
                                                                          ↓
                                                        Security-Wrapped Execution
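
The lifecycle above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not ContextRouter's implementation: stdlib dataclasses stand in for the Pydantic TemplateDefinition model, a linear list of callables stands in for the compiled StateGraph, and `load_template`, `merge_overrides`, `compile_pipeline`, `run`, the `config` key, the `token_scopes` state key, the `graph:execute` scope name, and the body of `make_secure_node` are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Any, Callable

State = dict[str, Any]
NodeFn = Callable[[State], State]


@dataclass
class NodeSpec:
    name: str
    tool_binding: str
    config: dict[str, Any] = field(default_factory=dict)


def load_template(raw: dict[str, Any]) -> list[NodeSpec]:
    """Step 1: validate parsed YAML (here already a dict) into typed specs."""
    specs = []
    for name, body in raw["nodes"].items():
        if "tool_binding" not in body:
            raise ValueError(f"node {name!r} has no tool_binding")
        specs.append(NodeSpec(name, body["tool_binding"], dict(body.get("config", {}))))
    return specs


def merge_overrides(specs: list[NodeSpec], overrides: dict[str, dict[str, Any]]) -> list[NodeSpec]:
    """Step 2: consumer overrides win over template defaults, key by key."""
    return [
        NodeSpec(s.name, s.tool_binding, {**s.config, **overrides.get(s.name, {})})
        for s in specs
    ]


def make_secure_node(fn: NodeFn, required_scope: str) -> NodeFn:
    """Step 4 (hypothetical body): run the node only if the token carries the scope."""
    def secured(state: State) -> State:
        if required_scope not in state.get("token_scopes", ()):
            raise PermissionError(f"missing scope {required_scope!r}")
        return fn(state)
    return secured


def compile_pipeline(specs: list[NodeSpec], registry: dict, required_scope: str) -> list[NodeFn]:
    """Steps 3 and 4: resolve each node by tool_binding, then wrap it."""
    return [
        make_secure_node(registry[s.tool_binding](s.config), required_scope)
        for s in specs
    ]


def run(pipeline: list[NodeFn], state: State) -> State:
    """Step 5: each node returns a partial update that is merged into the state."""
    for node in pipeline:
        state = {**state, **node(state)}
    return state


# Toy registry: tool_binding -> factory(config) -> node function.
registry = {
    "router_extract_query": lambda cfg: lambda st: {"user_query": st["messages"][-1]},
    "router_generate": lambda cfg: lambda st: {"answer": f"[{cfg['model']}] {st['user_query']}"},
}
raw = {"nodes": {
    "extract_query": {"tool_binding": "router_extract_query"},
    "generate": {"tool_binding": "router_generate", "config": {"model": "default-model"}},
}}
specs = merge_overrides(load_template(raw), {"generate": {"model": "custom-model"}})
pipeline = compile_pipeline(specs, registry, required_scope="graph:execute")
result = run(pipeline, {"messages": ["How does RAG work?"], "token_scopes": ["graph:execute"]})
print(result["answer"])
```

Because the wrapper is applied at compile time, a node cannot be reached without passing the scope check, whichever path leads to it.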

Pipeline Flow (RAG)

The default retrieval_augmented template routes each query through the graph below, with detect_intent choosing the branch:

extract_query → detect_intent ─┬─→ retrieve → suggest ─┐
                                ├─→ ground ──────────────→ reflect → __end__
                                ├─→ generate ────────────→ reflect → __end__
                                ├─→ plan → execute_sql → verify → visualize → reflect
                                └─→ no_results → __end__

Node Types

Type	Description	Example
`platform`	Internal Router capability (LLM, retrieval, classification)	`router_classify`, `router_generate`
`federated`	Executes on consumer project via BiDi gRPC stream	`export_products`, `store_news_results`
`llm`	Direct LLM call with prompt + model config	Custom classifier prompt
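
One way to picture the three node types is a builder that dispatches on the type field. The sketch below is illustrative only: `NodeDef`, `build_node`, and the `llm_output` key are hypothetical names, and plain callables stand in for the BiDi gRPC stream and the model client.

```python
from dataclasses import dataclass
from typing import Any, Callable

State = dict[str, Any]
NodeFn = Callable[[State], State]


@dataclass(frozen=True)
class NodeDef:
    name: str
    type: str               # "platform", "federated", or "llm"
    tool_binding: str = ""  # used by platform and federated nodes
    prompt: str = ""        # used by llm nodes


def build_node(
    node: NodeDef,
    platform_tools: dict[str, NodeFn],
    send_to_consumer: Callable[[str, State], State],
    call_llm: Callable[[str], str],
) -> NodeFn:
    if node.type == "platform":
        # Runs inside the Router process.
        return platform_tools[node.tool_binding]
    if node.type == "federated":
        # Delegated to the consumer project; the callback stands in for the gRPC stream.
        return lambda state: send_to_consumer(node.tool_binding, state)
    if node.type == "llm":
        # Direct model call using the node's own prompt.
        return lambda state: {"llm_output": call_llm(node.prompt.format(**state))}
    raise ValueError(f"unknown node type {node.type!r}")


# Stubs so the sketch runs without a model or a network connection.
def send_to_consumer(tool: str, state: State) -> State:
    return {"exported_by": tool}


def call_llm(prompt: str) -> str:
    return prompt.upper()


platform_tools = {"router_classify": lambda state: {"category": "tools"}}
nodes = [
    NodeDef("classify", "platform", tool_binding="router_classify"),
    NodeDef("export", "federated", tool_binding="export_products"),
    NodeDef("label", "llm", prompt="Classify: {text}"),
]
outputs = [
    build_node(n, platform_tools, send_to_consumer, call_llm)({"text": "garden hose"})
    for n in nodes
]
```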

Nodes

Each node listed below is a platform tool registered in PlatformToolRegistry. The business logic stays in the node implementation files; YAML replaces only the graph wiring.

RAG Pipeline Nodes

Node	Tool Binding	What it does
`extract_query`	`router_extract_query`	Reads latest `HumanMessage`, initialises 12 default state keys
`detect_intent`	`router_detect_intent`	LLM intent classification with taxonomy enrichment
`retrieve`	`router_retrieve`	Full RAG pipeline — vector search + reranking + graph facts
`ground`	`router_ground`	Google Search grounding with context assembly
`generate`	`router_generate`	Multi-source prompt assembly with citation formatting
`reflect`	`router_reflect`	Self-evaluation, quality scoring, retry decision
`suggest`	`router_suggest`	Search suggestion generation
`no_results`	`router_no_results`	Empathetic no-results response
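
As a rough illustration of the first node, the sketch below picks the latest human message and seeds default state. It is an assumption-laden stand-in: `Message` replaces LangChain's message classes, and the three keys in `DEFAULT_STATE` are hypothetical placeholders, since the twelve real keys are not listed here.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class Message:
    role: str      # "human" or "ai"; stand-in for LangChain message classes
    content: str


# Hypothetical defaults: the actual 12 keys are not enumerated in this document.
DEFAULT_STATE: dict[str, Any] = {"intent": None, "retrieved_docs": [], "citations": []}


def extract_query(state: dict[str, Any]) -> dict[str, Any]:
    """Return default state keys plus the text of the most recent human message."""
    latest = next((m for m in reversed(state["messages"]) if m.role == "human"), None)
    # Copy list defaults so separate runs never share mutable state.
    defaults = {k: (list(v) if isinstance(v, list) else v) for k, v in DEFAULT_STATE.items()}
    return {**defaults, "user_query": latest.content.strip() if latest else ""}


update = extract_query({"messages": [
    Message("human", "Hello"),
    Message("ai", "Hi, how can I help?"),
    Message("human", "  How does RAG work? "),
]})
```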

SQL Analytics Nodes

Node	Tool Binding	What it does
`plan`	`router_sql_plan`	Generates SQL from natural language question
`execute_sql`	`router_sql_execute`	Executes SQL with timeout + row limits
`verify`	`router_sql_verify`	Validates results for correctness
`visualize`	`router_sql_visualize`	Formats output (table, chart, markdown)
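
The "timeout + row limits" behaviour of `execute_sql` can be illustrated with Python's built-in sqlite3 module. This is a sketch under assumptions: the document does not say which database engine Cortex targets, and the function signature, default limits, and result dictionary shape are hypothetical.

```python
import sqlite3
import time


def execute_sql(conn: sqlite3.Connection, sql: str, max_rows: int = 100, timeout_s: float = 2.0) -> dict:
    """Run a query, aborting after timeout_s and returning at most max_rows rows."""
    deadline = time.monotonic() + timeout_s
    # SQLite calls the handler every 1000 VM instructions; a non-zero return aborts the query.
    conn.set_progress_handler(lambda: 1 if time.monotonic() > deadline else 0, 1000)
    try:
        cur = conn.execute(sql)
        rows = cur.fetchmany(max_rows + 1)  # one extra row reveals truncation
        columns = [d[0] for d in cur.description or []]
    except sqlite3.OperationalError as exc:
        return {"ok": False, "error": str(exc), "columns": [], "rows": [], "truncated": False}
    finally:
        conn.set_progress_handler(None, 0)  # always remove the handler
    return {
        "ok": True,
        "columns": columns,
        "rows": rows[:max_rows],
        "truncated": len(rows) > max_rows,
    }


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, total REAL)")
conn.executemany("INSERT INTO orders VALUES (?, ?)", [(i, i * 10.0) for i in range(5)])
result = execute_sql(conn, "SELECT id, total FROM orders ORDER BY id", max_rows=3)
```

Fetching `max_rows + 1` rows lets the caller tell a result that fits the limit from one that was cut off, which a downstream `verify` step could use to flag partial data.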

Universal Content Tools

These tools are domain-agnostic — any project can compose them:

Tool Binding	Capability
`router_classify`	Taxonomy / intent classification against a schema
`router_generate_content`	Structured content generation from prompt + context
`router_review_content`	Quality review + correction pass
`router_filter_content`	Content filtering / validation against criteria
`router_plan_content`	Editorial / batch planning from item set
`router_match_semantic`	Semantic similarity matching + reranking

Intent Detection

The detect_intent node uses a fast LLM to classify the user’s query and route it to the appropriate pipeline path:

Intent	Description	Route
`rag_and_web`	Questions requiring knowledge retrieval	`retrieve` → `generate`
`sql_analytics`	Data questions answerable via SQL	`plan` → `execute_sql`
`translate`	Translation requests	`generate` (direct)
`summarize`	Summarisation requests	`generate` (direct)
`identity`	Questions about the assistant itself	`generate` (direct)
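
The intent table maps naturally onto a routing function. The sketch below is illustrative: `INTENT_ROUTES` and `route_after_intent` are hypothetical names, and falling back to `no_results` for an unrecognised intent is an assumption. In LangGraph, a function of this shape is what `StateGraph.add_conditional_edges` expects as its path callable.

```python
from typing import Any

# Routes copied from the intent table; the first entry is the node entered after detect_intent.
INTENT_ROUTES: dict[str, list[str]] = {
    "rag_and_web": ["retrieve", "generate"],
    "sql_analytics": ["plan", "execute_sql"],
    "translate": ["generate"],
    "summarize": ["generate"],
    "identity": ["generate"],
}


def route_after_intent(state: dict[str, Any]) -> str:
    """Return the name of the next node for the detected intent."""
    route = INTENT_ROUTES.get(state.get("intent"))
    # Assumption: unknown or missing intents go to the no_results node.
    return route[0] if route else "no_results"


next_node = route_after_intent({"intent": "sql_analytics"})
```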

The node also performs taxonomy enrichment — matching the user query against a canonical_map to detect concepts and strengthen retrieval queries.
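
Taxonomy enrichment of this kind might look like the following. The shape of canonical_map (alias to canonical concept) and the `enrich_query` helper are assumptions for illustration; the real matching logic may well be more sophisticated than whole-word lookup.

```python
import re


def enrich_query(query: str, canonical_map: dict[str, str]) -> tuple[list[str], str]:
    """Return (detected concepts, query extended with concepts it does not already contain)."""
    concepts: list[str] = []
    for alias, concept in canonical_map.items():
        # Whole-word, case-insensitive match of the alias inside the query.
        if re.search(rf"\b{re.escape(alias)}\b", query, flags=re.IGNORECASE):
            if concept not in concepts:
                concepts.append(concept)
    extra = [c for c in concepts if c.lower() not in query.lower()]
    return concepts, " ".join([query, *extra])


canonical_map = {
    "rag": "RAG",
    "retrieval augmented generation": "RAG",
    "vector db": "vector database",
}
concepts, enriched = enrich_query(
    "How does retrieval augmented generation use a vector DB?", canonical_map
)
```

Appending the canonical terms gives the retriever a consistent vocabulary even when users phrase the same concept in different ways.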

Domain Templates

Gardener (Product Normalisation)

__start__ → fetch_products → deterministic_pass → classify → write_results → __end__

Uses router_classify for LLM taxonomy classification. Domain-specific logic (deterministic pass) runs as a federated tool on the consumer side.

Enricher (Product Enrichment)

__start__ → prepare → enrich → review → __end__

Uses router_generate_content for LLM content generation and router_review_content for quality review.

News Pipeline

__start__ → harvest → filter → plan → generate → store → __end__

Uses router_filter_content, router_plan_content, and router_generate_content for the LLM processing stages.
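
All three domain templates are linear chains, so their wiring reduces to consecutive node pairs. A minimal sketch, assuming the hypothetical helper `linear_edges` and the dictionary keys chosen here for illustration; in LangGraph, each pair would typically be passed to `StateGraph.add_edge`, whose `START` and `END` constants have the values `__start__` and `__end__`.

```python
def linear_edges(nodes: list[str]) -> list[tuple[str, str]]:
    """Pair consecutive nodes, adding the __start__ and __end__ sentinels."""
    chain = ["__start__", *nodes, "__end__"]
    return list(zip(chain, chain[1:]))


# Node orders copied from the three template diagrams.
TEMPLATES = {
    "gardener": ["fetch_products", "deterministic_pass", "classify", "write_results"],
    "enricher": ["prepare", "enrich", "review"],
    "news": ["harvest", "filter", "plan", "generate", "store"],
}
edges = {name: linear_edges(nodes) for name, nodes in TEMPLATES.items()}
```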

Debugging

Enable pipeline debug logging:

export DEBUG_PIPELINE=1    # Per-node structured logs
export DEBUG_WEB_SEARCH=1  # Raw CSE result previews

Example output:

PIPELINE extract_query | user_query="How does RAG work?"
PIPELINE detect_intent.out | intent=rag_and_web taxonomy_concepts=["RAG"]
PIPELINE retrieve.out | docs=5 books=3 videos=1 qa=1 web=0
PIPELINE generate.out | assistant_chars=450 web_sources=0
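
When these logs need to be inspected programmatically, a small parser is enough. The sketch below assumes every line follows the `PIPELINE <node> | key=value ...` shape shown above; `parse_pipeline_line` is a hypothetical helper, not part of ContextRouter.

```python
import json
import re

LINE_RE = re.compile(r"^PIPELINE (?P<node>\S+) \| (?P<fields>.*)$")
# A value is a quoted string, a bracketed list, or a run of non-space characters.
FIELD_RE = re.compile(r'(\w+)=("[^"]*"|\[[^\]]*\]|\S+)')


def parse_pipeline_line(line: str):
    """Parse one debug line into {"node": ..., "fields": {...}}, or None if it does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    fields = {}
    for key, raw in FIELD_RE.findall(match["fields"]):
        try:
            fields[key] = json.loads(raw)  # numbers, quoted strings, JSON lists
        except json.JSONDecodeError:
            fields[key] = raw              # bare words such as rag_and_web
    return {"node": match["node"], "fields": fields}


parsed = [
    parse_pipeline_line('PIPELINE extract_query | user_query="How does RAG work?"'),
    parse_pipeline_line('PIPELINE detect_intent.out | intent=rag_and_web taxonomy_concepts=["RAG"]'),
    parse_pipeline_line("PIPELINE retrieve.out | docs=5 books=3 videos=1 qa=1 web=0"),
]
```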