eval-studio

narmaku/eval-studio
★ 0 stars Python AI/LLM Updated 5d ago
The IDE for AI evaluation — one interactive workspace where the UI adapts to what you're testing: Q&A, RAG, agents, MCP servers, or model comparison.

Quick Install

Copy the config for your editor. Some servers need additional setup, so check the README before relying on it.

Add to claude_desktop_config.json:

{
  "mcpServers": {
    "eval-studio": {
      "command": "uvx",
      "args": [
        "eval-studio"
      ]
    }
  }
}
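
If claude_desktop_config.json already lists other servers, pasting the snippet over the file would drop them. The sketch below shows one way to merge the entry instead. It is only an illustration: the helper names add_server and update_config_file are hypothetical, and the location of the config file depends on your operating system and is not given here, so the path must be supplied by the caller.

```python
import json
from pathlib import Path

# The server entry from the config above, as a Python dict.
EVAL_STUDIO_ENTRY = {"command": "uvx", "args": ["eval-studio"]}


def add_server(config: dict, name: str, entry: dict) -> dict:
    """Return a copy of config with entry registered under mcpServers[name].

    Existing servers and unrelated top-level keys are preserved.
    """
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    servers[name] = entry
    merged["mcpServers"] = servers
    return merged


def update_config_file(path: Path) -> None:
    """Read the JSON config at path (if it exists), add eval-studio, write it back.

    Assumption: the file, when present, contains a valid JSON object.
    """
    existing = json.loads(path.read_text()) if path.exists() else {}
    updated = add_server(existing, "eval-studio", EVAL_STUDIO_ENTRY)
    path.write_text(json.dumps(updated, indent=2))
```

Calling update_config_file(Path("claude_desktop_config.json")) on a copy of the file first is a cheap way to check the result before touching the real config.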

Or install with pip:

pip install eval-studio

README Excerpt

The workspace for building, running, and improving AI evaluations, designed for engineers and subject-matter experts alike. eval-studio goes beyond running evaluations: it is a workspace for building what an evaluation needs (datasets, scoring metrics, evaluation rubrics, and telemetry integrations) and then using those assets with any evaluation framework onboarded into the platform.