Dossier Live RAG System
A production-ready, open-source Live RAG (Retrieval-Augmented Generation) system designed specifically for Frappe documents. Dossier provides real-time document ingestion, intelligent chunking, semantic search, and natural language Q&A capabilities through a modern chat interface.
Quick Start
Get Dossier running in minutes:
```bash
# Clone the repository
git clone https://github.com/your-org/dossier.git
cd dossier

# Copy and configure environment
cp .env.example .env
# Edit .env with your Frappe instance details

# Start the complete system
make quick-start

# Access the chat interface
open http://localhost:3000
```

Architecture Overview
Dossier is built on a microservices architecture with clear separation of concerns:
Core Services
- 🔗 Webhook Handler (Node.js) - Receives and validates Frappe webhooks
- 📄 Ingestion Service (Python) - Processes documents and manages workflows
- 🧠 Embedding Service (Python) - Generates vector embeddings using BGE-small (see the sketch after this list)
- 🔍 Query Service (Python) - Handles semantic search and retrieval
- 🤖 LLM Service (Python) - Generates natural language responses using Ollama
- 🌐 API Gateway (Python) - Authentication, rate limiting, and request routing
- ⚛️ Frontend (React) - Modern chat interface with real-time streaming
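As a concrete illustration of the embedding service, here is a minimal sketch assuming the sentence-transformers library and the BAAI/bge-small-en-v1.5 checkpoint (the exact BGE-small variant is an assumption, not confirmed by this page):

```python
# Minimal sketch of the embedding step, assuming sentence-transformers
# and the BAAI/bge-small-en-v1.5 checkpoint (variant is an assumption).
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("BAAI/bge-small-en-v1.5")

def embed_texts(texts: list[str]) -> list[list[float]]:
    # Batch-encode for throughput; normalizing means cosine similarity
    # reduces to a dot product in the vector store.
    vectors = model.encode(texts, batch_size=32, normalize_embeddings=True)
    return vectors.tolist()
```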
Infrastructure Components
- 🐘 PostgreSQL - Configuration and metadata storage
- 🔴 Redis - Message queuing and caching
- 🎯 Qdrant - Vector database for semantic search
- 🦙 Ollama - Local LLM inference engine
Key Features
🚀 Live Document Synchronization
- Real-time webhook processing with HMAC signature validation
- Automatic document ingestion with exponential backoff retry
- Dead letter queue for failed processing and manual review
- Support for multiple Frappe doctypes with custom field mapping
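To illustrate the HMAC validation step, a minimal Python sketch (the actual handler is Node.js; the header name, secret source, and digest encoding are assumptions and must match what your Frappe instance sends):

```python
# Minimal sketch of HMAC signature validation for incoming webhooks.
# Header name, env var, and hex encoding are assumptions.
import hashlib
import hmac
import os

WEBHOOK_SECRET = os.environ["FRAPPE_WEBHOOK_SECRET"]  # assumed env var

def is_valid_signature(raw_body: bytes, signature_header: str) -> bool:
    # Recompute the digest over the raw request body and compare in
    # constant time to defeat timing attacks.
    expected = hmac.new(WEBHOOK_SECRET.encode(), raw_body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signature_header)
```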
🧩 Intelligent Text Processing
- Semantic chunking with configurable size and overlap
- Metadata preservation during document processing
- Batch processing for optimal performance
- Graceful handling of various document formats
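A minimal sketch of size-and-overlap chunking; the defaults shown are illustrative assumptions, and the production service layers semantic boundary detection on top:

```python
# Minimal sketch of fixed-size chunking with overlap. The defaults
# (512 chars, 64 overlap) are illustrative, not the service's config.
def chunk_text(text: str, size: int = 512, overlap: int = 64) -> list[str]:
    if overlap >= size:
        raise ValueError("overlap must be smaller than chunk size")
    step = size - overlap
    # Each chunk repeats the last `overlap` characters of its predecessor
    # so that sentences spanning a boundary remain retrievable.
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]
```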
🔍 Advanced Search & Retrieval
- Vector similarity search with sub-2-second response times
- Metadata filtering and contextual relevance scoring
- Top-k retrieval with configurable parameters
- Source highlighting and citation tracking
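A minimal sketch of the top-k retrieval call against Qdrant, assuming the qdrant-client library; the collection name and payload field are hypothetical:

```python
# Minimal sketch of top-k vector retrieval with a metadata filter,
# using qdrant-client. Collection and payload field names are assumptions.
from qdrant_client import QdrantClient
from qdrant_client.models import FieldCondition, Filter, MatchValue

client = QdrantClient(url="http://localhost:6333")

def retrieve(query_vector: list[float], doctype: str, k: int = 5):
    # Restrict the similarity search to chunks from one Frappe doctype,
    # then return the k nearest neighbours with their payloads.
    return client.search(
        collection_name="dossier_chunks",
        query_vector=query_vector,
        query_filter=Filter(
            must=[FieldCondition(key="doctype", match=MatchValue(value=doctype))]
        ),
        limit=k,
    )
```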
💬 Natural Language Interface
- Streaming responses with real-time user feedback
- Context injection from retrieved document chunks
- Conversation memory and follow-up question handling
- Fallback responses for edge cases and errors
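A minimal sketch of streaming tokens from Ollama's /api/generate endpoint with retrieved chunks injected as context; the model name and prompt template are assumptions:

```python
# Minimal sketch of streaming generation against Ollama's HTTP API.
# The model name and prompt template are assumptions.
import json
import requests

def stream_answer(question: str, chunks: list[str]):
    prompt = "Context:\n" + "\n---\n".join(chunks) + f"\n\nQuestion: {question}\nAnswer:"
    with requests.post(
        "http://localhost:11434/api/generate",
        json={"model": "llama3", "prompt": prompt, "stream": True},
        stream=True,
    ) as resp:
        # Ollama streams newline-delimited JSON objects, each carrying
        # a "response" fragment until "done" is true.
        for line in resp.iter_lines():
            if line:
                yield json.loads(line).get("response", "")
```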
🛡️ Production-Grade Security
- JWT authentication with configurable token expiration
- Rate limiting to prevent API abuse (100 req/min default)
- CORS configuration for secure frontend integration
- Input validation and sanitization across all endpoints
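A minimal sketch of the token issue/verify flow, assuming PyJWT; the secret source, signing algorithm, and expiry default are assumptions rather than the gateway's actual configuration:

```python
# Minimal sketch of JWT issuance and verification using PyJWT.
# Secret source, algorithm, and TTL default are assumptions.
import datetime
import os

import jwt  # PyJWT

SECRET = os.environ["JWT_SECRET"]  # assumed env var

def issue_token(user_id: str, ttl_minutes: int = 60) -> str:
    now = datetime.datetime.now(datetime.timezone.utc)
    payload = {"sub": user_id, "iat": now,
               "exp": now + datetime.timedelta(minutes=ttl_minutes)}
    return jwt.encode(payload, SECRET, algorithm="HS256")

def verify_token(token: str) -> dict:
    # Raises jwt.ExpiredSignatureError / jwt.InvalidTokenError on failure.
    return jwt.decode(token, SECRET, algorithms=["HS256"])
```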
📊 Monitoring & Observability
- Health checks on all service endpoints (/health)
- Prometheus metrics collection (/metrics)
- Structured JSON logging with correlation IDs
- Distributed tracing for request flow tracking
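A minimal sketch of the /health and /metrics endpoints, assuming FastAPI and prometheus-client (the services' actual web framework is not specified here):

```python
# Minimal sketch of health and metrics endpoints. FastAPI and
# prometheus-client are assumptions about the stack.
from fastapi import FastAPI, Response
from prometheus_client import CONTENT_TYPE_LATEST, Counter, generate_latest

app = FastAPI()
REQUESTS = Counter("dossier_requests_total", "Total requests handled")

@app.get("/health")
def health() -> dict:
    # Liveness probe: cheap, dependency-free, always answers quickly.
    REQUESTS.inc()
    return {"status": "ok"}

@app.get("/metrics")
def metrics() -> Response:
    # Expose all registered metrics in Prometheus text format.
    return Response(content=generate_latest(), media_type=CONTENT_TYPE_LATEST)
```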
Performance Characteristics
| Metric | Performance |
|---|---|
| Query Response Time | < 2 seconds |
| LLM Response Time | < 30 seconds |
| Embedding Generation | 20+ texts/second |
| Concurrent Users | 50+ supported |
| Memory Usage | < 16GB total system |
| Storage Efficiency | ~1GB per 10K documents |
System Requirements
Minimum Requirements
- CPU: 4 cores
- RAM: 8GB
- Storage: 50GB free space
- Network: Stable internet connection
Recommended for Production
- CPU: 8+ cores
- RAM: 16GB+
- Storage: 100GB+ SSD
- Network: High-speed connection
Community & Support
- 📖 Documentation: Complete guides and API references
- 🐛 Issues: GitHub Issues for bug reports
- 💬 Discussions: GitHub Discussions for questions
- 🤝 Contributing: Contribution guidelines for developers