Knowledge Store

ContextBrain’s knowledge store uses PostgreSQL with pgvector for dense vector embeddings and full-text search for keyword matching.

Storage Architecture

[Diagram: brain-knowledge]

Mixin-Based Store

Rather than putting every Postgres operation into one monolithic class, ContextBrain composes its store from mixins. Developers can extend the database capabilities (for example, by adding a new AnalyticsMixin) without modifying the core connection pool logic.

class PostgresKnowledgeStore(
    BaseStore,      # Connection pool management
    SearchMixin,    # Vector + full-text search
    GraphMixin,     # Knowledge graph CRUD
    EpisodeMixin,   # Episodic memory management
    TaxonomyMixin,  # Taxonomy operations
):
    pass
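
To make the extension point concrete, here is a minimal sketch of how a new mixin could be composed onto a store. It is an illustration only: this BaseStore is a simplified stand-in that records statements instead of talking to Postgres, and AnalyticsMixin, record_query_event, execute, and the query_events table are hypothetical names, not part of ContextBrain.

```python
class BaseStore:
    """Simplified stand-in for the connection-pool owner (hypothetical)."""

    def __init__(self):
        # Records statements instead of sending them to Postgres.
        self.executed = []

    def execute(self, sql, *params):
        self.executed.append((sql, params))
        return len(self.executed)


class AnalyticsMixin:
    """Hypothetical mixin: depends only on self.execute() from the base."""

    def record_query_event(self, tenant_id, query):
        return self.execute(
            "INSERT INTO query_events (tenant_id, query) VALUES (%s, %s)",
            tenant_id,
            query,
        )


class AnalyticsStore(BaseStore, AnalyticsMixin):
    """Composed store; BaseStore itself is left unmodified."""


store = AnalyticsStore()
store.record_query_event("my_project", "How does PostgreSQL handle concurrency?")
```

The mixin holds no connection state of its own, so adding or removing it changes only the class definition that lists it.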

Upserting Knowledge

import grpc
from contextunity.core import ContextUnit, brain_pb2_grpc, contextunit_pb2

# Assumed endpoint for illustration; replace with your ContextBrain gRPC address
channel = grpc.insecure_channel("localhost:50051")
stub = brain_pb2_grpc.BrainServiceStub(channel)

# Store knowledge: domain data goes in payload
unit = ContextUnit(
    payload={
        "tenant_id": "my_project",
        "content": "PostgreSQL is a relational database...",
        "source_type": "document",
        "metadata": {"source": "docs", "page": 42},
    },
    provenance=["client:upsert"],
)
response_pb = stub.Upsert(unit.to_protobuf(contextunit_pb2))

Querying

When querying, ContextBrain expects a ContextUnit with the query in its payload. It compares dense vectors by cosine similarity via pgvector and returns a stream of the best-matching context chunks, which the Router can inject into the LLM prompt.

unit = ContextUnit(
    payload={
        "tenant_id": "my_project",
        "query": "How does PostgreSQL handle concurrency?",
        "top_k": 10,
    },
)
# QueryMemory returns a stream of ContextUnit
for result_pb in stub.QueryMemory(unit.to_protobuf(contextunit_pb2)):
    result = ContextUnit.from_protobuf(result_pb)
    print(result.payload)
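
The cosine-similarity ranking that happens server-side can be illustrated in plain Python. This is a conceptual sketch, not ContextBrain's implementation: in the database, pgvector does the comparison (its `<=>` operator returns cosine distance, which is 1 minus cosine similarity). The three-dimensional vectors and chunk texts below are made up for the example.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction; treat as unrelated
    return dot / (norm_a * norm_b)


def top_k(query_vec, chunks, k):
    """Rank (text, vector) pairs by similarity to the query, best first."""
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


# Toy embeddings for illustration only.
chunks = [
    ("MVCC keeps multiple row versions", [0.9, 0.1, 0.0]),
    ("Vacuum reclaims dead tuples", [0.5, 0.5, 0.0]),
    ("Installing on macOS", [0.0, 0.1, 0.9]),
]
ranked = top_k([1.0, 0.0, 0.0], chunks, k=2)
```

Because cosine similarity ignores vector length, only the direction of an embedding affects its rank.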

Embedding Providers

Provider   Model                     Dimensions   Speed
OpenAI     text-embedding-3-small    1536         Fast (API)
OpenAI     text-embedding-3-large    3072         Fast (API)
Local      SentenceTransformers      768          No API needed
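
Dimensions matter because a pgvector column declared with a fixed size only accepts vectors of that length, so embeddings from different models cannot share one column. The sketch below shows a client-side guard using the dimensions from the table above; validate_embedding and EXPECTED_DIMENSIONS are hypothetical helpers, not part of ContextBrain.

```python
# Dimensions copied from the provider table above.
EXPECTED_DIMENSIONS = {
    "text-embedding-3-small": 1536,
    "text-embedding-3-large": 3072,
    "SentenceTransformers": 768,
}


def validate_embedding(model, embedding):
    """Raise if the vector length does not match the model's dimension."""
    expected = EXPECTED_DIMENSIONS.get(model)
    if expected is None:
        raise KeyError(f"unknown embedding model: {model}")
    if len(embedding) != expected:
        raise ValueError(
            f"{model} expects {expected} dimensions, got {len(embedding)}"
        )
    return embedding


checked = validate_embedding("text-embedding-3-small", [0.0] * 1536)
```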

Multi-Tenant Isolation

Tenant isolation is enforced at the database level. As defined in the ContextUnity Security Scope, ContextBrain sets PostgreSQL session variables that Row-Level Security (RLS) policies read.

Even if a malicious query slips past application-level escaping (for example, through SQL injection), PostgreSQL itself filters out any rows that do not match the required tenant_id and user_id.

async with store.tenant_connection(tenant_id, user_id) as conn:
    # All queries automatically filtered by tenant and user
    results = await conn.search(query_embedding, limit=10)

The tenant_connection() context manager:

  • Sets app.current_tenant and app.current_user on every connection from the pool
  • Fails closed — empty tenant_id raises ValueError
  • Enforced at the database level via dual-dimensional RLS policies
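
The behaviour listed above can be sketched as follows. This is a minimal illustration under stated assumptions, not ContextBrain's source: FakeConnection and FakeStore are stand-ins that record SQL instead of running it, the policy name and knowledge table are hypothetical, and clearing the variables on exit is an assumed design choice. Only set_config() and current_setting() are real PostgreSQL functions.

```python
import asyncio
from contextlib import asynccontextmanager

# Hypothetical policy showing how RLS could read both session variables.
RLS_POLICY_SQL = """
CREATE POLICY tenant_user_isolation ON knowledge
USING (
    tenant_id = current_setting('app.current_tenant', true)
    AND user_id = current_setting('app.current_user', true)
);
"""

SET_VAR_SQL = "SELECT set_config($1, $2, false)"


class FakeConnection:
    """Records statements instead of sending them to PostgreSQL."""

    def __init__(self):
        self.statements = []

    async def execute(self, sql, *args):
        self.statements.append((sql, args))


class FakeStore:
    def __init__(self):
        self.conn = FakeConnection()  # stands in for a pooled connection

    @asynccontextmanager
    async def tenant_connection(self, tenant_id, user_id):
        if not tenant_id:
            # Fail closed: never hand out a connection without a tenant.
            raise ValueError("tenant_id must be non-empty")
        conn = self.conn
        await conn.execute(SET_VAR_SQL, "app.current_tenant", tenant_id)
        await conn.execute(SET_VAR_SQL, "app.current_user", user_id)
        try:
            yield conn
        finally:
            # Assumed cleanup so the next borrower starts with no identity.
            await conn.execute(SET_VAR_SQL, "app.current_tenant", "")
            await conn.execute(SET_VAR_SQL, "app.current_user", "")


async def demo():
    store = FakeStore()
    async with store.tenant_connection("my_project", "alice") as conn:
        inside = list(conn.statements)
    return inside, list(store.conn.statements)


inside, after = asyncio.run(demo())
```

Passing the variable names to set_config() as strings avoids SQL keyword parsing, which matters here because current_user is a reserved word in PostgreSQL.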