A production-grade semantic exploration platform with advanced caching, real-time monitoring, and enterprise security features.
- 📄 Multi-format Support - PDF, Microsoft Office, OpenDocument, HTML, XML, plain text
- 🔄 Async Job Processing - Background workers handle extraction, embedding, visualization via NATS JetStream
- 📊 Structured Datasets - Automatic chunking, deduplication, metadata extraction
- 🎯 Custom Transforms - Collection, Dataset, and Visualization pipeline stages
- 🔍 Semantic Search - Vector similarity with Qdrant, metadata filtering, side-by-side model comparison
- 📈 UMAP/HDBSCAN Clustering - Cluster and visualize embedding spaces
- 🧠 Multi-LLM Support - Compare results across Cohere, OpenAI, Anthropic, etc.
- 🔐 OIDC Authentication - OpenID Connect with Dex integration, automatic token refresh
- 🛡️ Row-Level Security (RLS) - Database-level access control via PostgreSQL policies
- 🔒 End-to-End Encryption - AES-256 encryption for sensitive data at rest
- 📝 Comprehensive Audit Logging - All operations logged to audit trail with immutable records via NATS
- 🗄️ PostgreSQL with Replication - Primary + read replicas for high availability
- 📦 S3-compatible Storage - AWS S3, MinIO, or any S3-compatible provider
- 🔴 Redis Cluster - Caching, rate limiting, session management with automatic failover
- 📍 Qdrant Vector DB - Production-grade vector search with quantization (product/scalar)
- 📊 Prometheus Metrics - Real-time metrics collection (error rates, latency, throughput, costs)
- 📈 Grafana Dashboards - Business metrics, performance tracking, cost monitoring, SLO dashboards
- 🔍 OpenTelemetry Tracing - Distributed tracing across all services via Quickwit
- ⚡ SLO Tracking - Automated tracking of availability, latency, and error rate SLOs
- ⚙️ Connection Pooling - Tuned for high concurrency with prepared statement caching
- 💾 Query Result Caching - Smart caching with TTL-based invalidation
- 🎯 Quantized Embeddings - Product quantization for 10x faster nearest-neighbor search
- 🔄 HTTP Caching - ETag-based cache validation, conditional requests
- 🎁 Request Deduplication - Prevents duplicate processing of identical requests via Redis (see the sketch after this list)
- 👤 Multi-session Support - Multiple concurrent sessions per user with limits
- 🔄 Automatic Token Refresh - Seamless token rotation without user interaction
- ⏱️ Configurable Timeouts - Session and token refresh thresholds
- 📊 Session Analytics - Track session duration, devices, locations
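As a rough illustration of the request-deduplication feature above, the sketch below uses a Redis `SET ... NX EX` guard to claim a request fingerprint before doing any work. This is a minimal sketch with hypothetical names (`try_claim_request`, the `dedup:` key prefix, the 60-second TTL), not the API's actual middleware:

```rust
/// Hypothetical sketch of Redis-backed request deduplication: the first
/// caller to claim a fingerprint proceeds; identical requests within the
/// TTL window are treated as duplicates. Assumes the `redis` crate.
fn try_claim_request(
    con: &mut redis::Connection,
    fingerprint: &str,
) -> redis::RedisResult<bool> {
    // SET key 1 NX EX 60: succeeds only if the key does not exist yet,
    // and expires after 60 seconds so retries eventually get through.
    let claimed: Option<String> = redis::cmd("SET")
        .arg(format!("dedup:{fingerprint}"))
        .arg("1")
        .arg("NX")
        .arg("EX")
        .arg(60)
        .query(con)?;
    Ok(claimed.is_some()) // Some("OK") if we claimed it, None if a duplicate
}
```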
- Docker & Docker Compose
- PostgreSQL 14+ (or use Docker)
- Redis 7+ in cluster mode (or use Docker)
- Qdrant 1.8+ (or use Docker)
- Rust 1.75+ (for local development)
- Node.js 18+ (for UI development)
```bash
# Clone repository
git clone <repo-url>
cd semantic-explorer
# Copy environment template
cp crates/api/.env.example crates/api/.env
# Edit crates/api/.env with your configuration
# Start infrastructure (PostgreSQL, Redis, Qdrant, NATS, etc.)
cd deployment/compose
docker-compose -f compose.dev.yaml up -d
# Run database migrations
cd ../../crates/api
sqlx migrate run --database-url "$DATABASE_URL"
# Start API server (Terminal 1)
cd ../../crates/api
cargo run
# Start UI (Terminal 2)
cd ../../semantic-explorer-ui
npm install
npm run dev
# Start worker services (Terminal 3, 4, 5)
# Terminal 3: Collections worker
cd ../../crates/worker-collections
cargo run
# Terminal 4: Datasets worker
cd ../../crates/worker-datasets
cargo run
# Terminal 5: Visualizations worker
cd ../../crates/worker-visualizations-py
source venv/bin/activate
python src/main.py
```
- API: http://localhost:8000 (API docs at /api/openapi.json; see the smoke test after this list)
- UI: http://localhost:5173
- Prometheus: http://localhost:9090
- Grafana: http://localhost:3000 (default: admin/admin)
- Qdrant: http://localhost:6334
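To sanity-check the stack, you can fetch the OpenAPI document listed above. A minimal sketch, assuming the `reqwest` crate with its `blocking` feature (not part of the repo's tooling):

```rust
// Smoke test: fetch the OpenAPI spec from the local API server.
fn main() -> Result<(), reqwest::Error> {
    let resp = reqwest::blocking::get("http://localhost:8000/api/openapi.json")?;
    println!("API responded with status {}", resp.status());
    Ok(())
}
```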
- API Crate - REST API, middlewares, auth
- Core Library - Shared utilities, config, encryption
- Collections Worker - Document extraction
- Datasets Worker - Embedding generation
- Visualizations Worker - UMAP clustering
- UI - Frontend implementation
- Deployment Guide - Production deployment steps
- Docker Compose - Infrastructure as code
- Helm Charts - Kubernetes deployment
```
semantic-explorer/
├── crates/
│   ├── api/                          # REST API server (Actix-web)
│   │   ├── src/
│   │   │   ├── api/                  # HTTP endpoints & handlers
│   │   │   ├── auth/                 # OIDC authentication
│   │   │   ├── chat/                 # LLM chat endpoints
│   │   │   ├── collections/          # Collection management
│   │   │   ├── datasets/             # Dataset operations
│   │   │   ├── embedding/            # Embedding generation
│   │   │   ├── embedders/            # LLM model drivers
│   │   │   ├── llms/                 # LLM integrations
│   │   │   ├── search/               # Semantic search
│   │   │   ├── storage/              # Database & S3 layers
│   │   │   ├── transforms/           # Pipeline transforms
│   │   │   ├── middleware/           # Auth, caching, rate limiting
│   │   │   ├── audit.rs              # Audit logging
│   │   │   └── main.rs               # Server entry point
│   │   └── Dockerfile                # Container image
│   │
│   ├── core/                         # Shared library
│   │   ├── config.rs                 # Configuration management
│   │   ├── encryption.rs             # AES-256 encryption
│   │   ├── http_client.rs            # HTTP utilities
│   │   ├── models.rs                 # Domain models
│   │   ├── nats.rs                   # NATS client
│   │   ├── storage.rs                # S3 client
│   │   ├── observability.rs          # OpenTelemetry setup
│   │   └── worker.rs                 # Worker patterns
│   │
│   ├── worker-collections/           # Document extraction worker
│   │   ├── extract/                  # Document parsing
│   │   └── chunk/                    # Text chunking
│   │
│   ├── worker-datasets/              # Embedding generation worker
│   │   └── embedder.rs               # Embedding logic
│   │
│   └── worker-visualizations-py/     # Python UMAP worker
│       ├── processor.py              # Clustering logic
│       ├── storage.py                # Result persistence
│       └── llm_namer.py              # LLM naming service
│
├── semantic-explorer-ui/             # Svelte frontend
│   └── src/
│       ├── lib/                      # Shared components
│       ├── App.svelte                # Root component
│       └── main.ts                   # Entry point
│
└── deployment/
    ├── compose/                      # Docker Compose configs
    ├── helm/                         # Kubernetes Helm charts
    └── DEPLOYMENT_GUIDE.md           # Deployment instructions
```
- Language: Rust 1.75+
- Web Framework: Actix-web (async HTTP)
- Database: PostgreSQL 14+ with RLS & replication
- Vector DB: Qdrant (quantized embeddings)
- Cache: Redis Cluster
- Message Queue: NATS JetStream
- Authentication: OIDC (Dex)
- Storage: S3-compatible (AWS S3, MinIO)
- Observability: OpenTelemetry, Prometheus
- Framework: Svelte 5
- Build Tool: Vite
- Language: TypeScript
- Styling: Tailwind CSS
- Containerization: Docker
- Orchestration: Docker Compose (dev) / Kubernetes + Helm (prod)
- Monitoring: Prometheus + Grafana
- Tracing: Quickwit
- CI/CD: GitHub Actions
Database & Storage:
```bash
DATABASE_URL=postgresql://user:pass@localhost:5432/db
DATABASE_REPLICA_URLS=postgresql://user:pass@replica:5432/db  # Optional read replicas
REDIS_CLUSTER_NODES=redis-1:6379,redis-2:6379,...
QDRANT_URL=http://localhost:6334
QDRANT_QUANTIZATION_TYPE=product        # product, scalar, or none
```
Security & Auth:
```bash
OIDC_CLIENT_ID=your-client-id
OIDC_CLIENT_SECRET=your-secret
OIDC_ISSUER_URL=https://dex.example.com
ENCRYPTION_KEY=your-32-char-encryption-key
ENABLE_RLS=true
```
Features:
```bash
SESSION_MAX_CONCURRENT=5                # Max sessions per user
SESSION_TIMEOUT_MINUTES=30              # Session expiration
SESSION_REFRESH_THRESHOLD_MINUTES=5     # Refresh before expiry
ENABLE_SESSION_TRACKING=true            # Track session events
ENABLE_QUERY_CACHING=true               # Cache semantic search results
ENABLE_HTTP_CACHING=true                # Cache HTTP responses
ENABLE_AUDIT_LOGGING=true               # Enable audit trail
AUDIT_RETENTION_DAYS=90                 # How long to keep audit logs
ENCRYPTION_KEY=your-key                 # AES-256 encryption key
ENABLE_RLS=true                         # Row-level security
MAX_FILE_SIZE_MB=100                    # Max file size for processing (default: 100 MB)
```
Observability:
```bash
PROMETHEUS_SCRAPE_PORT=9090             # Metrics export port
PROMETHEUS_SCRAPE_INTERVAL=15s          # Scrape interval
OPENTELEMETRY_ENABLED=true
OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4317
QUICKWIT_URL=http://localhost:7280
LOG_LEVEL=info                          # Logging level
RUST_LOG=semantic_explorer=debug        # Detailed logging
```
See .env.example for complete configuration options.
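As a minimal sketch of how a few of these variables might be read at startup (the project's actual loader lives in crates/core/config.rs; the `Settings` struct and defaults here are illustrative only):

```rust
use std::env;

/// Illustrative subset of the settings above; not the real config loader.
#[derive(Debug)]
struct Settings {
    database_url: String,
    qdrant_url: String,
    session_max_concurrent: u32,
}

fn load_settings() -> Result<Settings, String> {
    Ok(Settings {
        // Required: fail fast if the database URL is missing.
        database_url: env::var("DATABASE_URL").map_err(|e| format!("DATABASE_URL: {e}"))?,
        // Optional: fall back to the local default from the example above.
        qdrant_url: env::var("QDRANT_URL")
            .unwrap_or_else(|_| "http://localhost:6334".to_string()),
        // Optional numeric setting with the documented default of 5.
        session_max_concurrent: env::var("SESSION_MAX_CONCURRENT")
            .ok()
            .and_then(|v| v.parse().ok())
            .unwrap_or(5),
    })
}
```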
The API exports Prometheus metrics at /metrics on the configured PROMETHEUS_SCRAPE_PORT (a registration sketch follows the list):
- Request Metrics: Request counts, duration, latency percentiles
- Error Metrics: Error rates by endpoint and status code
- Database Metrics: Query performance, connection pool usage
- Cache Metrics: Cache hit/miss rates
- Business Metrics: Documents processed, embeddings generated
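A sketch of how such metrics are typically registered and exported with the `prometheus` crate (metric and label names here are hypothetical, not the API's actual ones):

```rust
use prometheus::{Encoder, IntCounterVec, Opts, Registry, TextEncoder};

fn main() {
    let registry = Registry::new();

    // Hypothetical request counter in the spirit of the metrics above.
    let http_requests = IntCounterVec::new(
        Opts::new("http_requests_total", "Total HTTP requests"),
        &["endpoint", "status"],
    )
    .unwrap();
    registry.register(Box::new(http_requests.clone())).unwrap();

    // A handler would bump the counter per request...
    http_requests.with_label_values(&["/search", "200"]).inc();

    // ...and the /metrics endpoint would encode the registry like this.
    let mut buf = Vec::new();
    TextEncoder::new().encode(&registry.gather(), &mut buf).unwrap();
    println!("{}", String::from_utf8(buf).unwrap());
}
```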
The following dashboards are pre-configured:
- Overview - System health, uptime, error rates
- Business Metrics - User engagement, data processed, transforms
- Performance - Latency percentiles, cache hit rates
- Costs - API costs by model, storage usage
- Database - Replication lag, query performance, RLS impact
- Transforms - Queue depth, processing time, success rates
Access Grafana at http://localhost:3000 (default: admin/admin)
Enable OpenTelemetry for end-to-end tracing:
Traces are collected automatically and shipped to Quickwit; view them in the Quickwit UI at http://localhost:7280. Query trace data with Quickwit's query language for performance analysis.
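On the Rust side, spans are typically opened with the `tracing` crate's `#[instrument]` attribute; a hedged sketch (the handler name and fields are hypothetical):

```rust
use tracing::{info, instrument};

// #[instrument] opens a span per call, so this search appears as one unit
// in the distributed trace alongside downstream DB and Qdrant spans.
#[instrument(skip(query), fields(query_len = query.len()))]
async fn semantic_search(query: String) -> Vec<String> {
    info!("running semantic search");
    // ... embed the query, hit Qdrant, rank results ...
    Vec::new()
}
```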
- OIDC Integration - Secure authentication via Dex or any OIDC provider
- JWT Tokens - Secure token-based API access
- Token Refresh - Automatic refresh without user interaction
- Multi-session Support - Multiple concurrent user sessions
- Row-Level Security - PostgreSQL RLS policies enforce user isolation
- End-to-End Encryption - AES-256 encryption for sensitive fields (see the sketch after this list)
- Encrypted Storage - S3 encryption at rest
- Audit Logging - Immutable audit trail of all operations
- Rate Limiting - Token-bucket algorithm via Redis
- CORS Configuration - Configurable cross-origin policies
- HTTPS/TLS - Full TLS support in production
- Secrets Management - Environment-based secret injection
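As a sketch of the field-level encryption idea above (the real logic lives in crates/core/encryption.rs; this version assumes the `aes-gcm` crate and a caller-provided 32-byte key):

```rust
use aes_gcm::aead::{Aead, AeadCore, KeyInit, OsRng};
use aes_gcm::{Aes256Gcm, Key};

/// Illustrative AES-256-GCM field encryption: a random 96-bit nonce is
/// generated per value and stored in front of the ciphertext.
fn encrypt_field(key_bytes: &[u8; 32], plaintext: &[u8]) -> Result<Vec<u8>, aes_gcm::Error> {
    let key = Key::<Aes256Gcm>::from_slice(key_bytes);
    let cipher = Aes256Gcm::new(key);
    let nonce = Aes256Gcm::generate_nonce(&mut OsRng); // 12 random bytes
    let mut out = nonce.to_vec();
    out.extend(cipher.encrypt(&nonce, plaintext)?); // nonce || ciphertext || tag
    Ok(out)
}
```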
```bash
cd deployment/compose
docker-compose -f compose.dev.yaml up
```
Includes: PostgreSQL, Redis, Qdrant, NATS, Prometheus, Grafana, Quickwit, Dex
```bash
helm install semantic-explorer deployment/helm/semantic-explorer \
  --namespace semantic-explorer \
  --values values.yaml
```
Features: Auto-scaling, health checks, persistent volumes, network policies, RBAC
See DEPLOYMENT_GUIDE.md for detailed steps.
```bash
# Unit tests
cargo test --lib

# Integration tests (single-threaded)
cargo test --test '*' -- --test-threads=1

# UI tests
npm run test --prefix semantic-explorer-ui
```
- Create a feature branch: `git checkout -b feature/my-feature`
- Make changes and ensure all tests pass: `cargo test && npm test`
- Format code: `cargo fmt`
- Run the linter: `cargo clippy`
- Submit a pull request with a description
See LICENSE file for details.
- Issues: GitHub Issues for bug reports and feature requests
- Discussions: GitHub Discussions for questions and ideas