Fully local LLM chat and PDF RAG stack on WSL/Docker. Open WebUI serves as the chat front end for Ollama (Gemma, Qwen, Llama, DeepSeek, plus nomic-embed for embeddings); a Node RAG server performs structure-aware chunking, hybrid dense + BM25 retrieval over Chroma fused with reciprocal rank fusion (RRF), cross-encoder reranking, multi-query expansion, and grounded, cited answers. Exposes an OpenAI-compatible API and MCP.
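To illustrate how the dense and BM25 result lists can be merged, here is a minimal sketch of reciprocal rank fusion. It is written in Python for brevity and is not the Node server's actual code. The function name `rrf_fuse`, the sample document IDs, and the constant `k = 60` (a commonly used default) are assumptions for illustration only.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Fuse several ranked lists of document IDs with reciprocal rank fusion.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    where rank is 1-based. k=60 is an assumed default, not a value taken
    from this project.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by document ID for stability.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


if __name__ == "__main__":
    dense_hits = ["a", "b", "c"]  # hypothetical dense (embedding) ranking
    bm25_hits = ["b", "c", "d"]   # hypothetical BM25 ranking
    print(rrf_fuse([dense_hits, bm25_hits]))
```

Documents that rank well in both lists rise to the top, and no score normalization between the dense and BM25 retrievers is needed because only ranks are used.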
This project runs a local LLM and PDF RAG stack on WSL with Docker. After startup, the following endpoints are available from Windows:

- Open WebUI chat interface: http://localhost:8080
- Ollama API: http://localhost:11434
- RAG server health: http://localhost:3000/health
- Chroma heartbeat: http://localhost:8000/api/v1/heartbeat
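A quick way to confirm the stack is up is to probe each endpoint listed above. The following is a minimal sketch using only the Python standard library. The helper name `probe` and the 3-second timeout are assumptions, and the check only tests that each URL answers with a 2xx status, not that the service behind it is fully functional.

```python
import urllib.request

# Endpoints as listed in this README.
ENDPOINTS = {
    "Open WebUI": "http://localhost:8080",
    "Ollama API": "http://localhost:11434",
    "RAG server health": "http://localhost:3000/health",
    "Chroma heartbeat": "http://localhost:8000/api/v1/heartbeat",
}


def probe(url, timeout=3.0):
    """Return True if the URL answers with a 2xx status, False otherwise."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (OSError, ValueError):
        # URLError, HTTPError, timeouts, and refused connections are OSError
        # subclasses; ValueError covers malformed URLs.
        return False


if __name__ == "__main__":
    for name, url in ENDPOINTS.items():
        state = "up" if probe(url) else "DOWN"
        print(f"{name:<20} {url:<45} {state}")
```

Run it from Windows or from inside WSL after startup. Any line marked `DOWN` points to the container or service to inspect first.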