
Matcher — Product Linking

Matcher links supplier (DealerProduct) items to your Oscar catalog. It progressively narrows candidates through cheap deterministic stages before invoking expensive LLM reasoning, so only the hardest cases reach the AI.

Multi-Stage Pipeline

Three stages — exact, normalized, and RLM — to maximize accuracy and minimize cost.

RLM Deep Matching

Mercury-2 Recursive Language Model processes 50k+ products in a single recursive pass.

Gardener Integration

Works on pre-normalized data from Gardener for normalized-vs-normalized comparison.

MCP Tools

find_match_candidates, link_products, unlink_product, bulk_link_products for AI-driven matching workflows.

Three-Stage Pipeline

After Gardener normalizes both sides, Matcher runs three stages:

Stage 1: Exact Match       ← EAN, manufacturer_sku, SKU          FREE
Stage 2: Normalized Match  ← brand + model + type + color        FREE
Stage 3: RLM Deep Match    ← Mercury-2 recursive reasoning       PAID

Expected: Stages 1–2 resolve ~60-70% of matches. Only the remaining 30-40% go to RLM.
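The cascade can be sketched as a loop that tries each free stage in order and queues only the unresolved items for the paid stage. This is a minimal illustration, not Matcher's actual implementation: `run_cascade`, the `Stage` signature, and the toy lookup tables are all hypothetical.

```python
from typing import Callable, Optional

# Hypothetical stage signature: takes a dealer item, returns a site ID or None.
Stage = Callable[[dict], Optional[str]]


def run_cascade(dealer_items: list[dict], free_stages: list[Stage]):
    """Try each free stage in order; collect leftovers for the paid RLM stage."""
    resolved: dict[str, str] = {}
    needs_rlm: list[dict] = []
    for item in dealer_items:
        for stage in free_stages:
            site_id = stage(item)
            if site_id is not None:
                resolved[item["id"]] = site_id
                break
        else:
            # No free stage matched, so this item would go to Stage 3.
            needs_rlm.append(item)
    return resolved, needs_rlm


# Toy lookups standing in for Stage 1 (exact) and Stage 2 (normalized).
by_ean = {"111": "oscar-1"}
by_brand_model = {("acme", "x1"): "oscar-2"}
stages = [
    lambda d: by_ean.get(d.get("ean")),
    lambda d: by_brand_model.get((d.get("brand"), d.get("model"))),
]
items = [
    {"id": "a", "ean": "111"},
    {"id": "b", "brand": "acme", "model": "x1"},
    {"id": "c", "brand": "other", "model": "z"},
]
resolved, needs_rlm = run_cascade(items, stages)
```

In this toy run, items `a` and `b` are resolved by the free stages and only `c` is left for RLM.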

Stage 1 — Exact Match

Deterministic exact-key comparison:

| Key | Example | Confidence |
|---|---|---|
| EAN-13 barcode | 889169539893 = 889169539893 | 1.0 |
| Manufacturer SKU | 41200 = 41200 | 0.95 |
| EAN + brand | Barcode + brand name | 1.0 |
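A minimal sketch of the exact-key comparison, assuming products are plain dicts with `ean` and `mfr_sku` fields (hypothetical field names; the real models may differ). The confidences and the `exact_ean` / `exact_sku` labels follow the values documented on this page.

```python
from typing import Optional


def exact_match(dealer: dict, oscar: dict) -> Optional[tuple[str, float]]:
    """Return (match_type, confidence) if an exact key agrees, else None."""
    # EAN is checked first because it carries the highest confidence (1.0).
    if dealer.get("ean") and dealer["ean"] == oscar.get("ean"):
        return ("exact_ean", 1.0)
    # Empty or missing keys never count as a match.
    if dealer.get("mfr_sku") and dealer["mfr_sku"] == oscar.get("mfr_sku"):
        return ("exact_sku", 0.95)
    return None


ean_hit = exact_match({"ean": "889169539893"}, {"ean": "889169539893"})
sku_hit = exact_match({"mfr_sku": "41200"}, {"mfr_sku": "41200"})
no_hit = exact_match({"ean": ""}, {"ean": ""})
```

Guarding against empty values matters here: two products that both lack an EAN should not be linked just because their empty fields compare equal.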

Stage 2 — Normalized Match

Uses Gardener-normalized fields for structured comparison:

score = 0.0
if dealer.brand == oscar.brand: score += 0.30
if fuzzy(dealer.model, oscar.model) > 85: score += 0.30
if dealer.product_type == oscar.type: score += 0.15
if dealer.category == oscar.category: score += 0.10
if dealer.mfr_sku == oscar.mfr_sku: score += 0.15
# threshold: ≥0.7 → auto-match, 0.5–0.7 → candidate
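The scoring rules above can be made runnable as follows. This is a sketch under stated assumptions: `Item` is a hypothetical stand-in for both normalized product records, and `fuzzy` uses the standard library's `difflib.SequenceMatcher` scaled to 0-100, which may not be the fuzzy matcher Matcher actually uses.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass
class Item:
    # Hypothetical record shape shared by dealer and Oscar products.
    brand: str
    model: str
    product_type: str
    category: str
    mfr_sku: str


def fuzzy(a: str, b: str) -> float:
    """Case-insensitive similarity on a 0-100 scale (stand-in implementation)."""
    return 100.0 * SequenceMatcher(None, a.lower(), b.lower()).ratio()


def normalized_score(dealer: Item, oscar: Item) -> float:
    score = 0.0
    if dealer.brand == oscar.brand:
        score += 0.30
    if fuzzy(dealer.model, oscar.model) > 85:
        score += 0.30
    if dealer.product_type == oscar.product_type:
        score += 0.15
    if dealer.category == oscar.category:
        score += 0.10
    if dealer.mfr_sku == oscar.mfr_sku:
        score += 0.15
    # Round to avoid floating-point noise at the threshold boundaries.
    return round(score, 2)


def classify(score: float) -> str:
    if score >= 0.7:
        return "auto-match"
    if score >= 0.5:
        return "candidate"
    return "no-match"


dealer = Item("acme", "X-100 Pro", "drill", "tools", "A1")
oscar = Item("acme", "X100 Pro", "drill", "garden", "B2")
score = normalized_score(dealer, oscar)
```

Here brand, model, and product type agree while category and SKU do not, so the pair scores 0.75 and is auto-matched.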

Stage 3 — RLM Deep Matching

For products that Stages 1 and 2 could not resolve, the Matcher invokes Mercury-2 (or another configured RLM model) for recursive reasoning.

How RLM works:

  1. Builds a structured prompt with all supplier and site products
  2. Mercury-2 generates Python analysis code in a sandboxed REPL
  3. The model iteratively compares products using multi-factor reasoning
  4. Outputs match pairs with confidence scores

matcher = RLMBulkMatcher(
    rlm_model="mercury-2",  # Bound via manifest
    tenant_id=tenant_id,
)
result = await matcher.match_all(
    supplier_products=supplier_list,
    site_products=oscar_list,
    confidence_threshold=0.7,
    taxonomies=taxonomy_data,
    manual_matches=confirmed_pairs,
    wrong_pairs=rejected_pairs,
)

Inputs to RLM:

  • Supplier products with normalized fields
  • Site (Oscar) products with normalized fields
  • Taxonomy context (categories, colors, sizes)
  • Manual matches (operator-confirmed) as positive examples
  • Wrong pairs (operator-rejected) as negative examples

Match Results

Each match produces:

{
    "supplier_id": "12345",
    "site_id": "oscar-678",
    "confidence": 0.92,
    "match_type": "rlm_deep",  # exact_ean | exact_sku | normalized | rlm_deep
    "reasoning": "Same brand, model name, similar price range"
}
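Because a model can emit several pairs for the same supplier product, results in this shape usually need post-processing. The sketch below shows one plausible rule, keeping the highest-confidence pair per `supplier_id` above a threshold. `dedupe_matches` is a hypothetical helper; the rule the real parser applies is not documented here.

```python
def dedupe_matches(matches: list[dict], confidence_threshold: float = 0.7) -> list[dict]:
    """Keep at most one match per supplier_id: the one with the highest confidence."""
    best: dict[str, dict] = {}
    for m in matches:
        if m["confidence"] < confidence_threshold:
            continue  # Below threshold: not reported as a match.
        current = best.get(m["supplier_id"])
        if current is None or m["confidence"] > current["confidence"]:
            best[m["supplier_id"]] = m
    return list(best.values())


raw = [
    {"supplier_id": "12345", "site_id": "oscar-678", "confidence": 0.92},
    {"supplier_id": "12345", "site_id": "oscar-999", "confidence": 0.75},
    {"supplier_id": "222", "site_id": "oscar-1", "confidence": 0.40},
]
unique = dedupe_matches(raw)
```

In this example only the `12345` to `oscar-678` pair survives: the second pair loses on confidence and the third falls below the threshold.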

Match Statuses

| Status | Meaning |
|---|---|
| matched | Confirmed match (auto or manual) |
| pending_review | High confidence but needs operator review |
| not_matched | No match found |
| pending | Not yet processed |
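The four statuses can be modeled as an enum. How a confidence score maps onto a status is not specified on this page, so the `status_for` function below is an assumption: it reuses the Stage 2 thresholds (0.7 and 0.5) purely for illustration, and both names are hypothetical.

```python
from enum import Enum
from typing import Optional


class MatchStatus(str, Enum):
    MATCHED = "matched"
    PENDING_REVIEW = "pending_review"
    NOT_MATCHED = "not_matched"
    PENDING = "pending"


def status_for(
    confidence: Optional[float],
    auto_threshold: float = 0.7,    # assumed, borrowed from Stage 2
    review_threshold: float = 0.5,  # assumed, borrowed from Stage 2
) -> MatchStatus:
    if confidence is None:
        return MatchStatus.PENDING  # Not yet processed.
    if confidence >= auto_threshold:
        return MatchStatus.MATCHED
    if confidence >= review_threshold:
        return MatchStatus.PENDING_REVIEW
    return MatchStatus.NOT_MATCHED
```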

MCP Tools

Query Tools

| Tool | Description |
|---|---|
| find_match_candidates | Multi-strategy candidate search: SKU → EAN → name → brand+type |
| heuristic_semantic_match | Fast heuristic matching with brand pre-index, O(N×B) complexity |
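One plausible reading of "brand pre-index" is that site products are grouped by brand once, so each of the N dealer products is compared only against the B products in its own brand bucket rather than the whole catalog. The sketch below illustrates that idea; the helper names and the dict-based product shape are assumptions, not the tool's actual code.

```python
from collections import defaultdict


def _brand_key(product: dict) -> str:
    return product["brand"].strip().lower()


def build_brand_index(site_products: list[dict]) -> dict[str, list[dict]]:
    """Group site products by normalized brand in a single pass."""
    index: dict[str, list[dict]] = defaultdict(list)
    for product in site_products:
        index[_brand_key(product)].append(product)
    return index


def candidates_for(dealer: dict, index: dict[str, list[dict]]) -> list[dict]:
    """Return only the site products sharing the dealer product's brand."""
    # .get avoids creating empty buckets for unknown brands.
    return index.get(_brand_key(dealer), [])


site = [
    {"id": "oscar-1", "brand": "Acme"},
    {"id": "oscar-2", "brand": "acme "},
    {"id": "oscar-3", "brand": "Globex"},
]
index = build_brand_index(site)
acme_candidates = candidates_for({"id": "d1", "brand": "ACME"}, index)
```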

Mutate Tools

| Tool | Description |
|---|---|
| link_products | Link DealerProduct to Oscar Product, optionally sync price/stock |
| unlink_product | Remove DealerProduct → Oscar Product link |
| bulk_link_products | Batch link/review multiple matches |

Export Tools

| Tool | Description |
|---|---|
| export_unmatched_products | Export up to 50k unmatched products for RLM batch |
| export_products | Export Oscar products for matching context |

Configuration

The Matcher relies on the rlm_bulk_matcher node defined in the project manifest. The model and its secrets are explicitly bound in contextunity.project.yaml:

contextunity.project.yaml
graphs:
  nodes:
    - id: rlm_bulk_matcher
      model: "mercury-2"
      model_secret_ref: INCEPTION_API_KEY

API key resolution order:

  1. ContextShield (project-scoped): <tenant>/api_keys/inception
  2. Router execution environment fallback: INCEPTION_API_KEY
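The resolution order above amounts to a two-step lookup with a fallback. The sketch below assumes a generic `shield_lookup` callable standing in for the ContextShield client (whose real API is not shown on this page); `resolve_api_key` is a hypothetical helper name.

```python
import os
from typing import Callable, Mapping, Optional


def resolve_api_key(
    tenant: str,
    shield_lookup: Callable[[str], Optional[str]],  # stand-in for ContextShield
    env: Optional[Mapping[str, str]] = None,
) -> str:
    """Resolve the key: project-scoped secret first, then environment fallback."""
    env = os.environ if env is None else env
    # 1. ContextShield (project-scoped) path: <tenant>/api_keys/inception
    key = shield_lookup(f"{tenant}/api_keys/inception")
    if key:
        return key
    # 2. Router execution environment fallback
    key = env.get("INCEPTION_API_KEY")
    if key:
        return key
    raise LookupError(f"No Inception API key found for tenant {tenant!r}")


# Usage with an in-memory dict playing the role of the secret store.
store = {"acme/api_keys/inception": "shield-key"}
from_shield = resolve_api_key("acme", store.get, env={"INCEPTION_API_KEY": "env-key"})
from_env = resolve_api_key("other", store.get, env={"INCEPTION_API_KEY": "env-key"})
```

Passing `env` explicitly keeps the function easy to test without touching the real process environment.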

Running Matcher

Integrated Flow (Gardener + Matcher)

async def run_matcher_for_brand(brand: str, tenant_id: str):
    async with RouterClient(token=token) as client:
        # Step 1a: Normalize dealer products
        await client.execute_agent("gardener", {
            "brand": brand, "source": "dealer", "only_new": True,
        })
        # Step 1b: Normalize oscar products
        await client.execute_agent("gardener", {
            "brand": brand, "source": "oscar", "only_new": True,
        })
        # Step 2: Match normalized data
        await client.execute_agent("rlm_bulk_matcher", {
            "target_brand": brand,
        })

PIM UI

The PIM Matcher view at /pim/matcher/ shows:

  • Match candidates sorted by confidence
  • Accept/reject buttons for operator review
  • Batch operations for bulk approval
  • Match statistics per brand

File Locations

# Router — matching graph
cu/router/cortex/graphs/commerce/matcher/
├── rlm_bulk/
│   ├── matcher.py    # RLMBulkMatcher class
│   ├── node.py       # LangGraph node entry point
│   ├── prompts.py    # Prompt builder for RLM
│   ├── parser.py     # Response parser + deduplication
│   ├── fallback.py   # Chunked fallback if RLM unavailable
│   └── types.py      # BulkMatchResult, MatchItem types
# Commerce — MCP tools + UI
cu/commerce/src/mcp/tools/
├── matching/
│   ├── query.py      # find_match_candidates, heuristic_semantic_match
│   └── mutate.py     # link_products, unlink_product, bulk_link_products
└── suppliers/
    └── export.py     # export_unmatched_products