Knowledge Store

ContextBrain’s knowledge store uses PostgreSQL with pgvector for dense vector embeddings and full-text search for keyword matching.

Storage Architecture

Figure: brain-knowledge (storage architecture diagram)

Mixin-Based Store

Rather than putting every Postgres operation into one monolithic class, ContextBrain composes its store from mixins. Developers can add a capability (for example, a new AnalyticsMixin) without modifying the core connection pool logic.

class PostgresKnowledgeStore(
    BaseStore,         # Connection pool management
    SearchMixin,       # Vector + full-text search
    GraphMixin,        # Knowledge graph CRUD
    EpisodeMixin,      # Episodic memory management
    TaxonomyMixin,     # Taxonomy operations
):
    pass
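To make the extension point concrete, here is a minimal sketch of adding an AnalyticsMixin. Everything in it is a stand-in: this BaseStore only records statements instead of managing a pool, and the `execute` helper, the `count_chunks` method, and the `chunks` table name are hypothetical, not taken from ContextBrain.

```python
class BaseStore:
    """Stand-in for the real BaseStore, which manages the connection pool."""

    def __init__(self, dsn: str) -> None:
        self.dsn = dsn

    def execute(self, sql: str, params: tuple = ()) -> tuple:
        # Hypothetical helper: a real store would send this to Postgres.
        return (sql, params)


class SearchMixin:
    """Stand-in for the existing search capability."""

    def search(self, text: str) -> tuple:
        return self.execute("SELECT 1 /* search placeholder */", (text,))


class AnalyticsMixin:
    """Hypothetical new capability, added without touching BaseStore."""

    def count_chunks(self, tenant_id: str) -> tuple:
        # 'chunks' is an assumed table name used only for illustration.
        return self.execute(
            "SELECT count(*) FROM chunks WHERE tenant_id = %s", (tenant_id,)
        )


class AnalyticsKnowledgeStore(BaseStore, SearchMixin, AnalyticsMixin):
    """Composed store: pool logic from BaseStore, features from the mixins."""


store = AnalyticsKnowledgeStore("postgresql://localhost/brain")
print(store.count_chunks("my_project"))
print([cls.__name__ for cls in AnalyticsKnowledgeStore.__mro__])
```

Each mixin only assumes that the composed class provides `execute`, so a new feature stays in its own class and the base class never changes.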

Upserting Knowledge

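import grpc
from contextunity.core import ContextUnit, brain_pb2_grpc, contextunit_pb2

# Assumed endpoint; replace with the address of your ContextBrain service
channel = grpc.insecure_channel("localhost:50051")
stub = brain_pb2_grpc.BrainServiceStub(channel)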

# Store knowledge — domain data goes in payload
unit = ContextUnit(
    payload={
        "tenant_id": "my_project",
        "content": "PostgreSQL is a relational database...",
        "source_type": "document",
        "metadata": {"source": "docs", "page": 42},
    },
    provenance=["client:upsert"],
)
response_pb = stub.Upsert(unit.to_protobuf(contextunit_pb2))

Querying

A query is also expressed as a ContextUnit. ContextBrain compares the query against stored embeddings using cosine similarity via pgvector and streams back the closest context chunks (up to top_k), which the Router can inject into the LLM prompt.

unit = ContextUnit(
    payload={
        "tenant_id": "my_project",
        "query": "How does PostgreSQL handle concurrency?",
        "top_k": 10,
    },
)

# QueryMemory returns a stream of ContextUnit
for result_pb in stub.QueryMemory(unit.to_protobuf(contextunit_pb2)):
    result = ContextUnit.from_protobuf(result_pb)
    print(result.payload)
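The ranking that QueryMemory performs can be illustrated without a database. The sketch below is not ContextBrain's implementation: it computes cosine similarity in plain Python over toy three-dimensional vectors and keeps the best k matches. The function names and sample data are made up for the example. In pgvector, the `<=>` operator returns cosine distance, which is 1 minus this similarity.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)


def top_k(query, chunks, k):
    """Return the k (score, text) pairs most similar to the query."""
    scored = [(cosine_similarity(query, vec), text) for text, vec in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


# Toy embeddings; real ones have hundreds or thousands of dimensions.
chunks = [
    ("MVCC and snapshots", [0.9, 0.1, 0.0]),
    ("Backup and restore", [0.0, 0.2, 0.9]),
    ("Row-level locks", [0.7, 0.6, 0.1]),
]
query = [1.0, 0.2, 0.0]

for score, text in top_k(query, chunks, k=2):
    print(f"{score:.3f}  {text}")
```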

Embedding Providers

Provider	Model	Dimensions	Speed
OpenAI	`text-embedding-3-small`	1536	Fast (API)
OpenAI	`text-embedding-3-large`	3072	Fast (API)
Local	SentenceTransformers	768	No API needed

Multi-Tenant Isolation

Tenant isolation is enforced at the database level. As defined in the ContextUnity Security Scope, ContextBrain sets PostgreSQL session variables to enforce Row-Level Security (RLS).

Even if a malicious query escapes string formatting, PostgreSQL itself filters out any rows that do not match the session's tenant_id and user_id.
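As a rough illustration of what such policies could look like, the sketch below builds the DDL for one table. The table name, the policy name, and the assumption that `tenant_id` and `user_id` are text columns are all hypothetical. Only the session variable names `app.current_tenant` and `app.current_user` come from this page.

```python
def rls_statements(table: str) -> list[str]:
    """Build DDL for tenant + user row-level security on a hypothetical table."""
    # Guard the identifier, since it is interpolated into DDL.
    if not table.isidentifier():
        raise ValueError(f"unsafe table name: {table!r}")
    return [
        f"ALTER TABLE {table} ENABLE ROW LEVEL SECURITY",
        # FORCE applies the policy to the table owner as well.
        f"ALTER TABLE {table} FORCE ROW LEVEL SECURITY",
        # current_setting(name, true) yields NULL when the variable is unset,
        # so the comparison is never true and no rows are visible.
        f"CREATE POLICY tenant_user_isolation ON {table} "
        "USING (tenant_id = current_setting('app.current_tenant', true) "
        "AND user_id = current_setting('app.current_user', true))",
    ]


for statement in rls_statements("knowledge_chunks"):
    print(statement + ";")
```

Because the policy reads the session variables rather than query parameters, a query that omits its own tenant filter still sees only the rows that match the session.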

async with store.tenant_connection(tenant_id, user_id) as conn:
    # All queries automatically filtered by tenant and user
    results = await conn.search(query_embedding, limit=10)

The tenant_connection() context manager:

Sets app.current_tenant and app.current_user on every connection it hands out from the pool
Fails closed: an empty tenant_id raises ValueError
Relies on RLS policies that check both tenant and user, so isolation is enforced at the database level
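The behaviour listed above can be sketched as an async context manager. This is a simplified illustration, not ContextBrain's code: FakePool and FakeConnection are stand-ins that only record statements, and clearing the variables on release as well as treating a missing user_id as an empty string are assumptions. `set_config(name, value, is_local)` is the standard PostgreSQL function for setting a session variable.

```python
import asyncio
from contextlib import asynccontextmanager


class FakeConnection:
    """Stand-in for a pooled Postgres connection; records statements only."""

    def __init__(self) -> None:
        self.executed = []

    async def execute(self, sql, *args):
        self.executed.append((sql, args))


class FakePool:
    """Stand-in for a pool that always hands out the same connection."""

    def __init__(self) -> None:
        self.conn = FakeConnection()

    @asynccontextmanager
    async def acquire(self):
        yield self.conn


SET_TENANT = "SELECT set_config('app.current_tenant', $1, false)"
SET_USER = "SELECT set_config('app.current_user', $1, false)"


@asynccontextmanager
async def tenant_connection(pool, tenant_id, user_id):
    # Fail closed: never hand out a connection without a tenant.
    if not tenant_id:
        raise ValueError("tenant_id must be a non-empty string")
    async with pool.acquire() as conn:
        # is_local=false makes the value last for the whole session.
        await conn.execute(SET_TENANT, tenant_id)
        await conn.execute(SET_USER, user_id or "")  # assumption: empty if absent
        try:
            yield conn
        finally:
            # Assumption: clear the values so the next borrower of this
            # pooled connection does not inherit them.
            await conn.execute(SET_TENANT, "")
            await conn.execute(SET_USER, "")


async def main():
    pool = FakePool()
    async with tenant_connection(pool, "my_project", "user_1"):
        pass  # queries issued here would be filtered by RLS
    return pool.conn.executed


for sql, args in asyncio.run(main()):
    print(sql, args)
```

Setting the variables inside the context manager, rather than in each query, means a caller cannot forget the tenant filter: there is no code path that yields a connection before both values are set.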