Matcher — Product Linking

Matcher links supplier (DealerProduct) items to your Oscar catalog. It progressively narrows candidates through cheap deterministic stages before invoking expensive LLM reasoning, so only the hardest cases reach the AI.

Multi-Stage Pipeline

Three stages (exact, normalized, and RLM) maximize accuracy while minimizing cost.

RLM Deep Matching

Mercury-2 Recursive Language Model processes 50k+ products in a single recursive pass.

Gardener Integration

Works on pre-normalized data from Gardener for normalized-vs-normalized comparison.

MCP Tools

Exposes find_match_candidates, link_products, unlink_product, and bulk_link_products for AI-driven matching workflows.

Three-Stage Pipeline

After Gardener normalizes both sides, Matcher runs three stages:

Stage 1: Exact Match       ← EAN, manufacturer_sku, SKU       FREE
Stage 2: Normalized Match  ← brand + model + type + color     FREE
Stage 3: RLM Deep Match    ← Mercury-2 recursive reasoning    PAID

Expected: Stages 1–2 resolve roughly 60–70% of matches at no cost, so only the remaining 30–40% reach the paid RLM stage.
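The cascade can be sketched as a loop that tries each free stage in order and forwards only the leftovers to the paid stage. This is a minimal illustration, not Matcher's actual code: `run_cascade` and the toy stage functions are hypothetical names, and the item fields are assumptions.

```python
from typing import Callable, Iterable, List, Optional, Tuple

# A stage inspects one supplier item and returns a match dict, or None.
Stage = Callable[[dict], Optional[dict]]


def run_cascade(items: Iterable[dict], free_stages: List[Stage]) -> Tuple[List[dict], List[dict]]:
    """Try the free stages in order; return (resolved, leftovers for the paid stage)."""
    resolved: List[dict] = []
    leftovers: List[dict] = []
    for item in items:
        for stage in free_stages:
            match = stage(item)
            if match is not None:
                resolved.append(match)
                break  # a cheaper stage succeeded, skip the rest
        else:
            # for/else: reached only when no stage produced a match
            leftovers.append(item)
    return resolved, leftovers


# Toy stages standing in for Stage 1 and Stage 2 (hypothetical logic).
KNOWN_EANS = {"889169539893": "oscar-678"}


def toy_exact(item: dict) -> Optional[dict]:
    site_id = KNOWN_EANS.get(item.get("ean", ""))
    if site_id is None:
        return None
    return {"supplier_id": item["id"], "site_id": site_id, "match_type": "exact_ean"}


def toy_normalized(item: dict) -> Optional[dict]:
    return None  # placeholder: never matches in this sketch


items = [{"id": "1", "ean": "889169539893"}, {"id": "2", "ean": "000"}]
resolved, leftovers = run_cascade(items, [toy_exact, toy_normalized])
print(len(resolved), len(leftovers))  # 1 1
```

Only `leftovers` would be handed to the RLM stage, which is what keeps the paid step small.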

Stage 1 — Exact Match

Deterministic exact-key comparison:

Key	Example	Confidence
EAN/UPC barcode	`889169539893` ↔ `889169539893`	1.0
Manufacturer SKU	`41200` ↔ `41200`	0.95
EAN + brand	Barcode + brand name	1.0
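A minimal sketch of exact-key lookup, assuming products are plain dicts with `id`, `ean`, and `manufacturer_sku` keys (these field names and both helper functions are hypothetical). The confidence values and `match_type` labels are taken from this page; the "EAN + brand" row is omitted for brevity.

```python
from typing import Dict, List, Optional, Tuple


def build_indexes(oscar_products: List[dict]) -> Tuple[Dict[str, dict], Dict[str, dict]]:
    """Index catalog products by EAN and by manufacturer SKU, skipping empty keys."""
    by_ean = {p["ean"]: p for p in oscar_products if p.get("ean")}
    by_sku = {p["manufacturer_sku"]: p for p in oscar_products if p.get("manufacturer_sku")}
    return by_ean, by_sku


def exact_match(dealer: dict, by_ean: Dict[str, dict], by_sku: Dict[str, dict]) -> Optional[dict]:
    """Return a match dict for the first exact key that hits, else None."""
    hit = by_ean.get(dealer.get("ean") or "")
    if hit is not None:
        return {"supplier_id": dealer["id"], "site_id": hit["id"],
                "confidence": 1.0, "match_type": "exact_ean"}
    hit = by_sku.get(dealer.get("manufacturer_sku") or "")
    if hit is not None:
        return {"supplier_id": dealer["id"], "site_id": hit["id"],
                "confidence": 0.95, "match_type": "exact_sku"}
    return None


oscar = [{"id": "oscar-678", "ean": "889169539893", "manufacturer_sku": "41200"}]
by_ean, by_sku = build_indexes(oscar)
dealer = {"id": "12345", "ean": None, "manufacturer_sku": "41200"}
match = exact_match(dealer, by_ean, by_sku)
print(match["match_type"], match["confidence"])  # exact_sku 0.95
```

Building the indexes once keeps each lookup constant-time, so this stage stays cheap even for large catalogs.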

Stage 2 — Normalized Match

Uses Gardener-normalized fields for structured comparison:

score = 0.0
if dealer.brand == oscar.brand:         score += 0.30
if fuzzy(dealer.model, oscar.model) > 85: score += 0.30
if dealer.product_type == oscar.type:   score += 0.15
if dealer.category == oscar.category:   score += 0.10
if dealer.mfr_sku == oscar.mfr_sku:     score += 0.15
# threshold: ≥0.7 → auto-match, 0.5–0.7 → candidate
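The scoring rules above can be made runnable as follows. This is a sketch under stated assumptions: products are plain dicts, both sides use the same key names, and `fuzzy` is replaced by a stand-in built on the standard library's `difflib` (the source does not say which fuzzy matcher is used). The "unresolved" label for scores below 0.5 is also an assumption.

```python
from difflib import SequenceMatcher


def fuzzy(a: str, b: str) -> float:
    """Case-insensitive similarity on a 0-100 scale (stand-in for the source's `fuzzy`)."""
    return 100.0 * SequenceMatcher(None, a.lower(), b.lower()).ratio()


def normalized_score(dealer: dict, oscar: dict) -> float:
    """Sum the per-field weights from the rules above."""
    score = 0.0
    if dealer["brand"] == oscar["brand"]:
        score += 0.30
    if fuzzy(dealer["model"], oscar["model"]) > 85:
        score += 0.30
    if dealer["product_type"] == oscar["product_type"]:
        score += 0.15
    if dealer["category"] == oscar["category"]:
        score += 0.10
    # Guard so that two missing SKUs do not count as equal (an added assumption).
    if dealer.get("mfr_sku") and dealer.get("mfr_sku") == oscar.get("mfr_sku"):
        score += 0.15
    return round(score, 2)  # avoid float noise near the thresholds


def classify(score: float) -> str:
    """Apply the thresholds: >=0.7 auto-match, 0.5 to 0.7 candidate."""
    if score >= 0.7:
        return "auto-match"
    if score >= 0.5:
        return "candidate"
    return "unresolved"  # assumed to fall through to Stage 3


dealer = {"brand": "nike", "model": "air max 90", "product_type": "sneakers",
          "category": "shoes", "mfr_sku": None}
oscar = {"brand": "nike", "model": "Air Max 90", "product_type": "sneakers",
         "category": "footwear", "mfr_sku": "41200"}
score = normalized_score(dealer, oscar)
print(score, classify(score))  # 0.75 auto-match
```

The five weights sum to 1.0, so brand plus model alone (0.60) yields only a candidate; at least one more field must agree before a pair is auto-matched.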

Stage 3 — RLM Deep Matching

For products that Stages 1 and 2 could not resolve, Matcher invokes Mercury-2 (or another configured RLM model) for recursive reasoning.

How RLM works:

Builds a structured prompt with all supplier and site products
Mercury-2 generates Python analysis code in a sandboxed REPL
The model iteratively compares products using multi-factor reasoning
Outputs match pairs with confidence scores

matcher = RLMBulkMatcher(
    rlm_model="mercury-2",    # Bound via manifest
    tenant_id=tenant_id,
)
result = await matcher.match_all(
    supplier_products=supplier_list,
    site_products=oscar_list,
    confidence_threshold=0.7,
    taxonomies=taxonomy_data,
    manual_matches=confirmed_pairs,
    wrong_pairs=rejected_pairs,
)

Inputs to RLM:

Supplier products with normalized fields
Site (Oscar) products with normalized fields
Taxonomy context (categories, colors, sizes)
Manual matches (operator-confirmed) as positive examples
Wrong pairs (operator-rejected) as negative examples
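The source does not say how `confidence_threshold` and `wrong_pairs` are applied internally. One plausible reading is a post-filter on the model's output, and the sketch below illustrates only that reading. `filter_pairs` is a hypothetical helper, and representing rejected pairs as `(supplier_id, site_id)` tuples is an assumption.

```python
from typing import Iterable, List, Set, Tuple

Pair = Tuple[str, str]  # (supplier_id, site_id)


def filter_pairs(pairs: Iterable[dict], confidence_threshold: float,
                 wrong_pairs: Set[Pair]) -> List[dict]:
    """Keep pairs at or above the threshold that an operator has not rejected."""
    kept = []
    for p in pairs:
        key = (p["supplier_id"], p["site_id"])
        if key in wrong_pairs:
            continue  # operator-rejected: never propose again
        if p["confidence"] >= confidence_threshold:
            kept.append(p)
    return kept


proposed = [
    {"supplier_id": "1", "site_id": "oscar-1", "confidence": 0.92},
    {"supplier_id": "2", "site_id": "oscar-2", "confidence": 0.95},  # rejected earlier
    {"supplier_id": "3", "site_id": "oscar-3", "confidence": 0.40},  # below threshold
]
kept = filter_pairs(proposed, 0.7, {("2", "oscar-2")})
print([p["supplier_id"] for p in kept])  # ['1']
```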

Match Results

Each match produces:

{
    "supplier_id": "12345",
    "site_id": "oscar-678",
    "confidence": 0.92,
    "match_type": "rlm_deep",     # exact_ean | exact_sku | normalized | rlm_deep
    "reasoning": "Same brand, model name, similar price range"
}
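A consumer of these records may want to sanity-check them before linking. The sketch below is a hypothetical validator, not part of Matcher: the allowed `match_type` values come from the comment above, while treating `reasoning` as optional and requiring a confidence in [0, 1] are assumptions.

```python
ALLOWED_MATCH_TYPES = {"exact_ean", "exact_sku", "normalized", "rlm_deep"}
REQUIRED_FIELDS = ("supplier_id", "site_id", "confidence", "match_type")


def validate_match(match: dict) -> list:
    """Return a list of problems; an empty list means the record looks well-formed."""
    problems = [f"missing field: {f}" for f in REQUIRED_FIELDS if f not in match]
    if "match_type" in match and match["match_type"] not in ALLOWED_MATCH_TYPES:
        problems.append(f"unknown match_type: {match['match_type']}")
    conf = match.get("confidence")
    if isinstance(conf, (int, float)) and not 0.0 <= conf <= 1.0:
        problems.append(f"confidence out of range: {conf}")
    return problems


record = {
    "supplier_id": "12345",
    "site_id": "oscar-678",
    "confidence": 0.92,
    "match_type": "rlm_deep",
    "reasoning": "Same brand, model name, similar price range",
}
print(validate_match(record))  # []
```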

Match Statuses

Status	Meaning
`matched`	Confirmed match (auto or manual)
`pending_review`	High confidence but needs operator review
`not_matched`	No match found
`pending`	Not yet processed

MCP Tools

Query Tools

Tool	Description
`find_match_candidates`	Multi-strategy candidate search: SKU → EAN → name → brand+type
`heuristic_semantic_match`	Fast heuristic matching with brand pre-index, O(N×B) complexity
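The idea behind a brand pre-index can be sketched as follows: group catalog products by brand once, then compare each supplier item only against its own brand bucket rather than the whole catalog. Reading B in O(N×B) as the bucket size is an assumption, as are the function names, the dict fields, and the use of `difflib` for name similarity; the real tool's heuristics are not shown in the source.

```python
from collections import defaultdict
from difflib import SequenceMatcher
from typing import Dict, List, Optional, Tuple


def brand_key(brand: str) -> str:
    """Normalize a brand name for bucketing."""
    return brand.strip().lower()


def build_brand_index(site_products: List[dict]) -> Dict[str, List[dict]]:
    """Group catalog products by normalized brand in a single pass."""
    index: Dict[str, List[dict]] = defaultdict(list)
    for product in site_products:
        index[brand_key(product["brand"])].append(product)
    return index


def best_in_brand(supplier: dict, index: Dict[str, List[dict]]) -> Tuple[Optional[dict], float]:
    """Return the most similar same-brand product and its name similarity (0..1)."""
    bucket = index.get(brand_key(supplier["brand"]), [])  # .get avoids creating empty buckets
    best, best_score = None, 0.0
    for candidate in bucket:
        s = SequenceMatcher(None, supplier["name"].lower(), candidate["name"].lower()).ratio()
        if s > best_score:
            best, best_score = candidate, s
    return best, best_score


site = [
    {"id": "o1", "brand": "Nike", "name": "Air Max 90"},
    {"id": "o2", "brand": "Adidas", "name": "Air Max 90"},
]
index = build_brand_index(site)
best, similarity = best_in_brand({"brand": "nike ", "name": "air max 90"}, index)
print(best["id"], similarity)  # o1 1.0
```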

Mutate Tools

Tool	Description
`link_products`	Link DealerProduct to Oscar Product, optionally sync price/stock
`unlink_product`	Remove DealerProduct → Oscar Product link
`bulk_link_products`	Batch link/review multiple matches

Export Tools

Tool	Description
`export_unmatched_products`	Export up to 50k unmatched products for RLM batch
`export_products`	Export Oscar products for matching context

Configuration

The Matcher uses the rlm_bulk_matcher graph declared in the project manifest:

# contextunity.project.yaml — multi-graph manifest
router:
  graph:
    rlm_bulk_matcher:
      template: "yaml:rlm_bulk_matcher"
      toolkits:
        - MatcherTools
      nodes:
        - name: "rlm_process"
          model: "rlm/gpt-5-mini"
          model_secret_ref: "CU_ROUTER_MATCHING_MODEL_KEY"

Secret resolution: Shield (enterprise) → encrypted Redis → env var fallback.
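The fallback order can be pictured as a chain of resolvers tried in sequence. The sketch below uses dict-backed stand-ins for Shield and encrypted Redis, since neither client API appears in the source; `resolve_secret` and the store names are hypothetical.

```python
import os
from typing import Callable, Optional, Sequence

Resolver = Callable[[str], Optional[str]]


def resolve_secret(ref: str, resolvers: Sequence[Resolver]) -> str:
    """Return the first non-empty value produced by the resolvers, in order."""
    for resolver in resolvers:
        value = resolver(ref)
        if value:
            return value
    raise KeyError(f"secret {ref!r} not found in any backend")


# Dict-backed stand-ins; real Shield and Redis clients are not shown in the source.
shield_store: dict = {}
redis_store = {"CU_ROUTER_MATCHING_MODEL_KEY": "from-redis"}

# Order mirrors the documented chain: Shield, then Redis, then environment variables.
chain = [shield_store.get, redis_store.get, os.environ.get]
print(resolve_secret("CU_ROUTER_MATCHING_MODEL_KEY", chain))  # from-redis
```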

Running Matcher

Integrated Flow (Gardener + Matcher)

async def run_matcher_for_brand(brand: str, tenant_id: str):
    # `token` is the Router auth token; it is assumed to be defined in the enclosing scope.
    async with RouterClient(token=token) as client:
        # Step 1a: Normalize dealer products
        await client.execute_agent("gardener", {
            "brand": brand, "source": "dealer", "only_new": True,
        })
        # Step 1b: Normalize oscar products
        await client.execute_agent("gardener", {
            "brand": brand, "source": "oscar", "only_new": True,
        })
        # Step 2: Match normalized data
        await client.execute_agent("rlm_bulk_matcher", {
            "target_brand": brand,
        })

PIM UI

The PIM Matcher view at /pim/matcher/ shows:

Match candidates sorted by confidence
Accept/reject buttons for operator review
Batch operations for bulk approval
Match statistics per brand

File Locations

# Commerce — matcher federated tools + MCP tools
extensions/commerce/src/contextunity/commerce/
├── mcp/tools/
│   ├── matcher/
│   │   ├── node.py              # rlm_bulk_match_node (@federated_tool)
│   │   ├── handlers.py          # BiDi export/link handlers
│   │   └── prompts.py           # RLM prompt builder
│   └── matching/
│       ├── query.py             # find_match_candidates, heuristic_semantic_match
│       └── mutate.py            # link_products, unlink_product, bulk_link_products
├── modules/matcher/workflow.py  # Matcher workflow orchestration
└── pim/views/products/matcher.py  # PIM admin matcher UI

# Router — RLM bulk matcher template
services/router/src/contextunity/router/cortex/templates/
└── rlm_bulk_matcher.yaml        # RLM processing pipeline