A fully local LLM chat and PDF RAG stack on WSL/Docker. Open WebUI provides the chat front end for Ollama (Gemma/Qwen/Llama/DeepSeek + nomic-embed); a Node RAG server performs structure-aware chunking, hybrid dense + BM25 retrieval over Chroma fused with RRF, cross-encoder reranking, multi-query expansion, and grounded, cited answers. Includes an OpenAI-compatible API and MCP support.
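
As a rough illustration of the fusion step named above, reciprocal rank fusion (RRF) merges several ranked lists by summing 1 / (k + rank) for each document across the lists. The sketch below is not taken from the project: the function name, the document IDs, and the constant k = 60 (a commonly used default) are assumptions for illustration only.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked lists of document IDs into one ranking using RRF.

    rankings: list of lists, each ordered best-first (e.g. dense, BM25).
    k: smoothing constant; 60 is a common default, assumed here.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        # Ranks are 1-based: the top hit contributes 1 / (k + 1).
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical result lists from a dense retriever and a BM25 retriever.
dense_hits = ["a", "b", "c"]
bm25_hits = ["b", "d", "a"]
fused = reciprocal_rank_fusion([dense_hits, bm25_hits])
```

Because RRF uses only rank positions, it needs no score normalization between the dense and BM25 retrievers, which is one reason it is a popular choice for hybrid search.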

Quick Install

Copy the config for your editor. Some servers may need additional setup — check the README.

Add to claude_desktop_config.json:

{
  "mcpServers": {
    "local-llm": {
      "command": "npx",
      "args": [
        "-y",
        "goodacre-manchester/local-llm"
      ]
    }
  }
}

README Excerpt

This project runs a local LLM and PDF RAG stack on WSL with Docker. After startup, the following endpoints are available from Windows:

- Open WebUI chat interface: http://localhost:8080
- Ollama API: http://localhost:11434
- RAG server health: http://localhost:3000/health
- Chroma heartbeat: http://localhost:8000/api/v1/heartbeat
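
One way to confirm the stack is up is to probe each of the endpoints listed above. The following is a minimal sketch using only the Python standard library. The helper names (`is_up`, `check_all`) are hypothetical, and it assumes that any 2xx response means the service is healthy, which the README excerpt does not state.

```python
import http.client
import urllib.error
import urllib.request

# URLs copied from the README excerpt.
ENDPOINTS = {
    "Open WebUI": "http://localhost:8080",
    "Ollama API": "http://localhost:11434",
    "RAG server health": "http://localhost:3000/health",
    "Chroma heartbeat": "http://localhost:8000/api/v1/heartbeat",
}


def is_up(url, timeout=2.0):
    """Return True if a GET to url answers with a 2xx status (assumed healthy)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # Connection refused, timeout, non-2xx HTTP error, or malformed reply.
        return False


def check_all(endpoints=ENDPOINTS, probe=is_up):
    """Probe every endpoint and return a {name: bool} status map."""
    return {name: probe(url) for name, url in endpoints.items()}
```

Calling `check_all()` after startup returns a dictionary mapping each service name to `True` or `False`. The `probe` parameter allows a different check to be swapped in, for example one that inspects the response body.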

Topics

chroma, docker, hybrid-search, llm, local-llm, mcp, ollama, open-webui, pdf-rag, rag, reranker, wsl
