rai

classeve-public/rai
★ 0 stars · Rust · AI/LLM · Updated 3d ago
CPU-only LLM inference engine in pure Rust — 4-bit quantized models, hand-written AVX2 kernels, speculative decoding, and a local HTTP/MCP server. No GPU, no Python runtime.

Quick Install

Copy the config for your editor. Some servers may need additional setup — check the README.

Add to claude_desktop_config.json:

{
  "mcpServers": {
    "rai": {
      "command": "cargo",
      "args": [
        "run",
        "--",
        "rai"
      ]
    }
  }
}

README Excerpt

**A CPU-only LLM inference engine in pure Rust.** RAI runs 4-bit quantized language models with hand-written AVX2 kernels — no GPU, no Python runtime, no PyTorch, no GGML, no BLAS. Load a `.raimodel` file and generate text on any modern x86-64 laptop. Built by [ClassEve](https://classeve.com). Licensed under Apache-2.0.
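To make "4-bit quantized" concrete, here is a minimal sketch of symmetric block quantization in scalar Rust. Everything in it is an assumption for illustration: the block size of 32, the `Q4Block` layout, and the function names are hypothetical. The excerpt does not describe the `.raimodel` format or RAI's kernels, and this is not their implementation.

```rust
/// Hypothetical block size: 32 weights share one scale (an assumption, not RAI's layout).
const BLOCK: usize = 32;

/// One quantized block: an f32 scale plus 32 weights packed two per byte.
struct Q4Block {
    scale: f32,
    packed: [u8; BLOCK / 2],
}

/// Quantize 32 floats to signed 4-bit integers in [-7, 7], stored with an offset of 8.
fn quantize_block(w: &[f32; BLOCK]) -> Q4Block {
    // The largest magnitude in the block determines the scale.
    let max_abs = w.iter().fold(0.0f32, |m, x| m.max(x.abs()));
    // Avoid dividing by zero for an all-zero block.
    let scale = if max_abs == 0.0 { 1.0 } else { max_abs / 7.0 };

    let q = |x: f32| -> u8 {
        let v = (x / scale).round().clamp(-7.0, 7.0) as i8;
        (v + 8) as u8 // stored value is in 1..=15
    };

    let mut packed = [0u8; BLOCK / 2];
    for (i, pair) in w.chunks_exact(2).enumerate() {
        // Low nibble holds the even-indexed weight, high nibble the odd-indexed one.
        packed[i] = q(pair[0]) | (q(pair[1]) << 4);
    }
    Q4Block { scale, packed }
}

/// Recover approximate floats from a quantized block.
fn dequantize_block(b: &Q4Block) -> [f32; BLOCK] {
    let mut out = [0.0f32; BLOCK];
    for (i, byte) in b.packed.iter().enumerate() {
        let lo = (byte & 0x0F) as i32 - 8;
        let hi = (byte >> 4) as i32 - 8;
        out[2 * i] = lo as f32 * b.scale;
        out[2 * i + 1] = hi as f32 * b.scale;
    }
    out
}
```

Under these assumptions a block takes 20 bytes (16 packed bytes plus a 4-byte scale) instead of 128 bytes as `f32`, and each recovered weight differs from the original by at most half the scale. An AVX2 kernel would typically unpack and multiply many nibbles per instruction, but the arithmetic is the same as in this scalar version.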

Topics

avx2, cpu-inference, llm-inference, mcp-server, quantization, rust