Knowledge storage and RAG service with PostgreSQL, vector search, and gRPC API

ContextUnity/contextbrain

ContextBrain

⚠️ Early Version: This is an early version of ContextBrain. Documentation is actively being developed, and the API may change.

What is ContextBrain?

ContextBrain is the Knowledge Storage and RAG Service of the ContextUnity ecosystem. It provides:

  • Vector storage with PostgreSQL + pgvector
  • Semantic search with hybrid retrieval (vector + full-text)
  • Knowledge Graph with ltree-based taxonomy
  • Episodic memory for conversation history
  • gRPC API for integration with other ContextUnity services

It acts as a centralized memory backend that ContextRouter and other services use for retrieval and knowledge management.
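To make the ltree-based taxonomy concrete: PostgreSQL's ltree type stores a category as a dot-separated label path, and hierarchy queries reduce to label-wise prefix checks. The sketch below mimics that idea in plain Python. The category paths and helper names are hypothetical and are not taken from ContextBrain's schema.

```python
def labels(path: str) -> list[str]:
    """Split an ltree-style path such as 'electronics.phones' into labels."""
    return path.split(".")


def is_descendant(path: str, ancestor: str) -> bool:
    """True if `path` equals `ancestor` or lies beneath it (like ltree's `<@`)."""
    p, a = labels(path), labels(ancestor)
    # Compare whole labels, not raw string prefixes.
    return p[: len(a)] == a


def ancestors(path: str) -> list[str]:
    """All proper ancestor paths, from the root down."""
    parts = labels(path)
    return [".".join(parts[:i]) for i in range(1, len(parts))]


# Hypothetical taxonomy entries, for illustration only.
taxonomy = [
    "electronics",
    "electronics.phones",
    "electronics.phones.android",
    "electronics.laptops",
    "home.kitchen",
]

phones = [p for p in taxonomy if is_descendant(p, "electronics.phones")]
```

Comparing label lists rather than raw strings keeps a path like electronics.phonesx from being mistaken for a child of electronics.phones.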

What is it for?

ContextBrain is designed for:

  • RAG backends — store and retrieve knowledge for LLM applications
  • Product catalogs — taxonomy, enrichment, and semantic search
  • Memory systems — episodic and entity-based memory for AI agents
  • News aggregation — fact storage and deduplication

Typical use cases:

  • Knowledge base backend for chatbots
  • Product enrichment and classification
  • Semantic search over documents
  • Multi-tenant knowledge storage

Key Features

  • 🗄️ Multi-Backend Storage — PostgreSQL with pgvector (primary), Vertex AI Search, DuckDB for testing
  • 🔍 Hybrid Search — combines vector similarity with full-text search and reranking
  • 🌳 Taxonomy & Ontology — ltree-based hierarchical classification with AI-powered categorization
  • 🧠 Memory Types — semantic (knowledge), episodic (conversations), entity (facts)
  • 📡 gRPC Service — production-ready service with streaming support
  • 🔒 Multi-Tenant — tenant isolation with ContextToken authorization
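
This README does not say how vector and full-text results are merged. One widely used option is reciprocal rank fusion (RRF), sketched here as an assumption and not as ContextBrain's actual method. The document IDs and the constant k=60 are illustrative.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge ranked ID lists; each list contributes 1 / (k + rank) per document."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID for deterministic output.
    return sorted(scores, key=lambda d: (-scores[d], d))


vector_hits = ["doc_a", "doc_b", "doc_c"]    # ordered by vector similarity
fulltext_hits = ["doc_b", "doc_d", "doc_a"]  # ordered by full-text rank

fused = reciprocal_rank_fusion([vector_hits, fulltext_hits])
```

Documents that appear in both lists accumulate score from each, so they rise above documents that only one retriever found. RRF needs only ranks, which avoids having to normalize similarity scores from two different scales.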

Architecture

ContextBrain/
├── service/                    # gRPC service (modular)
│   ├── server.py               # Server setup
│   ├── brain_service.py        # Main service class
│   ├── commerce_service.py     # Commerce operations
│   ├── embedders.py            # Embedding providers
│   └── handlers/               # Domain-specific handlers
│       ├── knowledge.py        # Knowledge management
│       ├── memory.py           # Episodic memory
│       ├── taxonomy.py         # Taxonomy operations
│       ├── commerce.py         # Commerce handlers
│       └── news.py             # News engine handlers
├── storage/
│   ├── postgres/               # PostgreSQL + pgvector (primary)
│   │   ├── store/              # Modular store (mixin pattern)
│   │   │   ├── base.py         # Base connection handling
│   │   │   ├── search.py       # Vector search operations
│   │   │   ├── graph.py        # Graph CRUD operations
│   │   │   ├── episodes.py     # Episodic memory
│   │   │   └── taxonomy.py     # Taxonomy operations
│   │   ├── news.py             # News post storage
│   │   └── schema.py           # Database schema
│   └── duckdb_store.py         # Testing backend
├── payloads.py                 # Pydantic validation models
├── ingestion/
│   └── rag/                    # RAG pipeline, processors
└── core/                       # Config, registry, interfaces
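
The store/ directory is described as following a mixin pattern. A minimal sketch of that pattern: each mixin contributes one group of operations and relies on shared state owned by a base class. All class and method names here are hypothetical, and the in-memory dict stands in for a real database connection.

```python
class BaseStore:
    """Shared state; in a real store this would hold the connection pool."""

    def __init__(self) -> None:
        self.rows: dict[str, dict] = {}


class SearchMixin:
    """Read-side operations; expects `self.rows` from BaseStore."""

    def search(self, tenant_id: str, term: str) -> list[str]:
        return sorted(
            key
            for key, row in self.rows.items()
            if row["tenant_id"] == tenant_id and term in row["content"]
        )


class UpsertMixin:
    """Write-side operations; expects `self.rows` from BaseStore."""

    def upsert(self, key: str, tenant_id: str, content: str) -> None:
        self.rows[key] = {"tenant_id": tenant_id, "content": content}


class KnowledgeStore(SearchMixin, UpsertMixin, BaseStore):
    """Composed store: each mixin adds behaviour, the base owns the state."""


store = KnowledgeStore()
store.upsert("k1", "my_app", "PostgreSQL is a relational database")
store.upsert("k2", "other_app", "PostgreSQL supports extensions")
hits = store.search("my_app", "PostgreSQL")
```

Splitting operations this way keeps each file (search, graph, episodes, taxonomy) small while callers still see a single store object.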

gRPC API

ContextBrain exposes its functionality via gRPC, a high-performance RPC framework. The protocol definitions (.proto files) live in ContextCore, the shared kernel of the ContextUnity ecosystem, so every service is built against the same typed message contracts.

BrainService provides these operations:

| Method | Description |
|---|---|
| QueryMemory | Hybrid search (vector + text) for knowledge retrieval |
| Upsert | Store knowledge with embeddings |
| AddEpisode | Add conversation turn to episodic memory |
| UpsertFact | Store entity facts (user preferences, etc.) |
| UpsertTaxonomy | Sync taxonomy entries |
| GetTaxonomy | Export taxonomy for a domain |
| GetProducts | Get products for enrichment |
| UpdateEnrichment | Update product enrichment data |
| CreateKGRelation | Create Knowledge Graph relations |
| UpsertNewsItem | Store news facts |
| GetNewsItems | Retrieve news by criteria |
| UpsertNewsPost | Store generated posts |

Quick Start

As Python Library

import asyncio

from contextbrain.storage.postgres import PostgresKnowledgeStore

async def main():
    store = PostgresKnowledgeStore(dsn="postgres://...")
    await store.connect()

    # Placeholder vector; in practice this comes from your embedder.
    # Its length must match the schema: 1536 dims (OpenAI) or 768 (local).
    embedding = [0.1] * 1536

    # Store knowledge
    await store.upsert_knowledge(
        tenant_id="my_app",
        content="PostgreSQL is a relational database...",
        source_type="document",
        embedding=embedding,
    )

    # Semantic search
    results = await store.search(
        tenant_id="my_app",
        query_embedding=embedding,
        limit=10,
    )
    print(results)

asyncio.run(main())
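
For intuition about what a search over query_embedding computes: vector search ranks stored embeddings by their distance to the query. pgvector supports several distance operators, and this README does not state which one ContextBrain uses, so the sketch below assumes cosine similarity and uses a brute-force scan over made-up two-dimensional vectors. The function names are illustrative.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query: list[float], items: dict[str, list[float]], k: int) -> list[str]:
    """Return the names of the k items most similar to the query."""
    ranked = sorted(
        items,
        key=lambda name: cosine_similarity(query, items[name]),
        reverse=True,
    )
    return ranked[:k]


# Toy 2-dimensional "embeddings"; real ones have hundreds of dimensions.
docs = {
    "postgres": [1.0, 0.0],
    "vectors": [0.7, 0.7],
    "cooking": [0.0, 1.0],
}

nearest = top_k([1.0, 0.1], docs, k=2)
```

A database index replaces the linear scan with an approximate nearest-neighbour lookup, but the ranking criterion is the same idea.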

As gRPC Service

import grpc
from contextcore import brain_pb2, brain_pb2_grpc

channel = grpc.insecure_channel("localhost:50051")
stub = brain_pb2_grpc.BrainServiceStub(channel)

# Query memory
response = stub.QueryMemory(brain_pb2.QueryMemoryRequest(
    tenant_id="my_app",
    query="How does PostgreSQL work?",
    top_k=5,
))
for result in response.results:
    print(result.content)

Installation

pip install contextbrain

# With PostgreSQL support (recommended).
# Quotes keep shells such as zsh from expanding the brackets.
pip install "contextbrain[storage]"

# With Vertex AI support:
pip install "contextbrain[vertex]"

Configuration

# Required
export BRAIN_DATABASE_URL="postgres://user:pass@localhost:5432/brain"

# Embeddings (choose one; uncomment the line you need)
export EMBEDDER_TYPE="openai"            # OpenAI text-embedding-3-small (1536 dims)
# export EMBEDDER_TYPE="local"           # Local SentenceTransformers (768 dims)
# If not set: auto-selects OpenAI if OPENAI_API_KEY exists, otherwise local

export OPENAI_API_KEY="sk-..."           # Required for OpenAI embeddings

# Optional: Custom OpenAI model
export OPENAI_EMBEDDING_MODEL="text-embedding-3-large"  # 3072 dims

# Optional: Vertex AI
export VERTEX_PROJECT_ID="my-project"
export VERTEX_LOCATION="us-central1"

Note: The database schema must match the embedding dimensions of the configured model (1536 for the default OpenAI model, 768 for local, 3072 for text-embedding-3-large). Run uv run alembic upgrade head after changing the embedding provider.
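
The selection rule from the configuration comments (an explicit EMBEDDER_TYPE wins, otherwise OpenAI when OPENAI_API_KEY is present, otherwise local) and the dimension constraint can be sketched as follows. The function names and the dimension table are illustrative, built only from values stated in this README, and are not ContextBrain's actual code.

```python
import os
from collections.abc import Mapping

# Dimensions as stated in this README for the default models.
EXPECTED_DIMS = {"openai": 1536, "local": 768}


def resolve_embedder_type(env: Mapping[str, str] = os.environ) -> str:
    """Explicit EMBEDDER_TYPE wins; else pick by presence of OPENAI_API_KEY."""
    explicit = env.get("EMBEDDER_TYPE")
    if explicit:
        return explicit
    return "openai" if env.get("OPENAI_API_KEY") else "local"


def check_embedding(embedding: list[float], embedder_type: str) -> None:
    """Raise early if a vector would not fit the schema's vector column."""
    expected = EXPECTED_DIMS[embedder_type]
    if len(embedding) != expected:
        raise ValueError(
            f"{embedder_type} embeddings need {expected} dims, got {len(embedding)}"
        )
```

Checking the length before writing turns a late database error into an immediate, readable one. A custom model such as text-embedding-3-large (3072 dims) would need its own entry in the table.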

Development

Prerequisites

  • Python 3.13+
  • PostgreSQL 16+ with vector and ltree extensions
  • uv package manager

Database Setup

# Create database
createdb brain

# Enable extensions
psql brain -c "CREATE EXTENSION IF NOT EXISTS vector;"
psql brain -c "CREATE EXTENSION IF NOT EXISTS ltree;"

# Initialize schema
uv run python scripts/init_db.py

Running the Service

# Start gRPC server on :50051
uv run python -m contextbrain

Running Tests

uv run pytest tests/ -v

Documentation

ContextUnity Ecosystem

ContextBrain is part of the ContextUnity platform:

| Service | Role | Documentation |
|---|---|---|
| ContextCore | Shared types and gRPC contracts | contextcore.dev |
| ContextRouter | AI agent orchestration | contextrouter.dev |
| ContextWorker | Background task execution | contextworker.dev |

License

This project is licensed under the terms specified in LICENSE.md.
