Tiny RL environments + graders + reward-hacking QA for LLMs: a verifiable code track (RLVR) and an open-ended truthfulness track, plus an MCP server. Pure-stdlib core.
Quick Install
Copy the config for your editor. Some servers may need additional setup; check the README for details.
Claude Desktop: add to claude_desktop_config.json:
{
  "mcpServers": {
    "llm-rl-playground": {
      "command": "uvx",
      "args": [
        "llm-rl-playground"
      ]
    }
  }
}
Claude Code: run in terminal:
claude mcp add llm-rl-playground uvx llm-rl-playground
Cursor: add to .cursor/mcp.json:
{
  "mcpServers": {
    "llm-rl-playground": {
      "command": "uvx",
      "args": [
        "llm-rl-playground"
      ]
    }
  }
}
Or install with pip: pip install llm-rl-playground
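If the config file already lists other MCP servers, pasting the JSON above verbatim would replace them. The sketch below shows one way to merge the entry instead. The helper name add_mcp_server and the example path are illustrative assumptions and are not part of llm-rl-playground; only the server name, command, and args come from the config above.

```python
import json
from pathlib import Path

# Server name and launch entry copied from the config shown above.
SERVER_NAME = "llm-rl-playground"
SERVER_ENTRY = {"command": "uvx", "args": ["llm-rl-playground"]}


def add_mcp_server(config_path: Path) -> dict:
    """Merge the server entry into config_path, keeping existing keys.

    Hypothetical helper for illustration; not an API of any editor or
    of llm-rl-playground itself.
    """
    config = {}
    if config_path.exists():
        text = config_path.read_text(encoding="utf-8")
        if text.strip():
            # Start from whatever the file already holds.
            config = json.loads(text)

    # Create "mcpServers" if missing, then add or overwrite only our entry.
    config.setdefault("mcpServers", {})[SERVER_NAME] = dict(SERVER_ENTRY)

    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config


# Example call (assumed project-local Cursor config; adjust for your editor):
# add_mcp_server(Path(".cursor") / "mcp.json")
```

The function only touches the llm-rl-playground key under mcpServers, so other servers and unrelated top-level settings in the file are written back unchanged. Back up the config first if it contains comments or non-standard JSON, since json.loads will reject those.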