przm-bench

OneNomad-LLC/przm-bench
★ 0 stars · TypeScript · 🤖 AI/LLM · Updated 2d ago
Onenomad Bench — vendor-neutral, signed-receipt benchmark for AI memory MCP servers. Continuous tracking of Mem0, Letta, Zep, MemPalace + Engram. Apache 2.0.

Quick Install

Copy the config for your editor. Some servers may need additional setup — check the README.

Add to claude_desktop_config.json:

{
  "mcpServers": {
    "przm-bench": {
      "command": "npx",
      "args": [
        "-y",
        "OneNomad-LLC/przm-bench"
      ]
    }
  }
}
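If claude_desktop_config.json already lists other servers, the entry above should be merged in rather than pasted over the file. The following is a minimal sketch of such a merge, assuming the config is a JSON object with an optional `mcpServers` map. The `addServer` helper and the `DesktopConfig` and `McpServerEntry` types are hypothetical names used for illustration and are not part of przm-bench or any editor's API.

```typescript
// Hypothetical types describing only the part of the config shown above.
interface McpServerEntry {
  command: string;
  args: string[];
}

interface DesktopConfig {
  mcpServers?: Record<string, McpServerEntry>;
  [key: string]: unknown; // any other settings are passed through untouched
}

// Return a new config with `entry` registered under `name`.
// Existing servers and unrelated keys are preserved; the input is not mutated.
function addServer(
  config: DesktopConfig,
  name: string,
  entry: McpServerEntry
): DesktopConfig {
  return {
    ...config,
    mcpServers: { ...(config.mcpServers ?? {}), [name]: entry },
  };
}

// Example: a config that already has one (made-up) server.
const existing: DesktopConfig = {
  mcpServers: { other: { command: "node", args: ["server.js"] } },
};

const updated = addServer(existing, "przm-bench", {
  command: "npx",
  args: ["-y", "OneNomad-LLC/przm-bench"],
});

console.log(JSON.stringify(updated, null, 2));
```

Reading the file from disk and writing it back is left out on purpose, since the file's location depends on the operating system and editor.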

README Excerpt

Reference implementation of the **[przm](https://przm.sh) benchmark suite**. Vendor-neutral, Ed25519-signed, deterministic. Two axes in v0.1:

- **Multi-agent convergence** (`v0.1-preview`): four signed receipts on the leaderboard. Measures how often multi-agent systems collapse to a confidently-stated *wrong* answer when one agent is seeded with a confederate-style false position in round 0. Scored across 5 categories (mathematical fact, scientific consensus, temporal ordering, factual recall, e
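To make the "Ed25519-signed" and "deterministic" claims concrete, here is a minimal sketch of signing and verifying a benchmark receipt with Node's built-in `node:crypto` module. The excerpt does not show przm's actual receipt schema or serialization, so the `Receipt` shape, its field values, and the `canonicalize`, `signReceipt`, and `verifyReceipt` helpers are all assumptions made for illustration.

```typescript
import { generateKeyPairSync, sign, verify, KeyObject } from "node:crypto";

// Hypothetical receipt shape; the real przm receipt format is not shown in the excerpt.
interface Receipt {
  system: string;
  suite: string;
  score: number;
}

// Serialize with keys in sorted order so the same receipt always yields the same bytes.
function canonicalize(receipt: Receipt): Buffer {
  const entries = Object.entries(receipt).sort(([a], [b]) =>
    a < b ? -1 : a > b ? 1 : 0
  );
  return Buffer.from(JSON.stringify(entries), "utf8");
}

// Ed25519 takes no separate digest algorithm, so the first argument is null.
function signReceipt(receipt: Receipt, privateKey: KeyObject): Buffer {
  return sign(null, canonicalize(receipt), privateKey);
}

function verifyReceipt(
  receipt: Receipt,
  signature: Buffer,
  publicKey: KeyObject
): boolean {
  return verify(null, canonicalize(receipt), publicKey, signature);
}

// Demo with a throwaway key pair and made-up values.
const { publicKey, privateKey } = generateKeyPairSync("ed25519");
const receipt: Receipt = { system: "example-system", suite: "v0.1-preview", score: 0.5 };
const signature = signReceipt(receipt, privateKey);

console.log(verifyReceipt(receipt, signature, publicKey)); // true
console.log(verifyReceipt({ ...receipt, score: 0.9 }, signature, publicKey)); // false
```

Sorting the keys before serializing is one simple way to get stable bytes for a flat object. A published receipt format would need a fully specified canonical encoding, which this sketch does not attempt.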

Tools (2)

autogen
baseline