The 20 criteria, and why they matter
This scorecard is distilled from building and reviewing production MCP servers and from the patterns that recur across the 12,000+ servers indexed on protodex.io. It maps to the same eight areas covered in the MCP Server Builder's Quick Reference: tool design, errors, auth, security, transport, deployment, testing, and versioning.
Why "works in the demo" is not "production-ready"
An MCP server has a brutal property: the model only ever sees your tool names, descriptions, and input schemas. It never reads your code. So a server that runs perfectly when you call it by hand can still fail constantly in the hands of Claude — because the model guessed the wrong tool, passed a malformed argument, or got back an unstructured blob it couldn't parse. Production-readiness for MCP is mostly about the interface the model reads and the failure modes under real load, not whether the happy path runs.
The four tiers
- 85%+ — Production-ready. Ship it. Keep an eye on logs and versioning.
- 60–84% — Close. A handful of gaps stand between you and production. Usually auth, tests, or schema tightening.
- 35–59% — Risky. It demos, but it will misbehave with real users. Don't put it in front of customers yet.
- Under 35% — Not production-ready. This is a prototype. The interface and the failure handling both need work before anyone depends on it.
The single highest-leverage fix
If you only fix one thing: tool descriptions and input schemas. Write each tool's description for the model, not for a human reading docs — say exactly when to use it, what each argument means, and what it returns. Make every input a typed, validated schema with enums and constraints. This one change eliminates the majority of "Claude called it wrong" bugs and is the cheapest reliability win available.
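As a rough illustration of that advice, the sketch below shows one tool definition with a model-facing description and a constrained input schema, plus a deliberately tiny argument check. The `name`, `description`, and `inputSchema` fields follow the MCP tool definition shape. The `search_orders` tool, its arguments, its limits, and the `validate_arguments` helper are all hypothetical, and a real server would more likely rely on a full JSON Schema validator than on this hand-rolled check.

```python
# Hypothetical tool definition: the tool name, arguments, and limits are
# illustrative assumptions, not taken from any real server.
SEARCH_ORDERS_TOOL = {
    "name": "search_orders",
    "description": (
        "Look up existing customer orders by status. Use this when the user "
        "asks about orders that already exist; do not use it to create or "
        "modify orders. Returns a JSON list of orders with id, status, and total."
    ),
    "inputSchema": {
        "type": "object",
        "properties": {
            "status": {
                "type": "string",
                "enum": ["pending", "shipped", "cancelled"],
                "description": "Order status to filter by.",
            },
            "limit": {
                "type": "integer",
                "minimum": 1,
                "maximum": 50,
                "description": "Maximum number of orders to return.",
            },
        },
        "required": ["status"],
        "additionalProperties": False,
    },
}


def validate_arguments(schema, args):
    """Return a list of problems; an empty list means the arguments are valid.

    Handles only the keywords used above (type, enum, minimum, maximum,
    required, additionalProperties).
    """
    problems = []
    props = schema.get("properties", {})
    for name in schema.get("required", []):
        if name not in args:
            problems.append(f"missing required argument '{name}'")
    for name, value in args.items():
        if name not in props:
            if schema.get("additionalProperties") is False:
                problems.append(f"unknown argument '{name}'")
            continue
        rule = props[name]
        expected = rule.get("type")
        if expected == "string" and not isinstance(value, str):
            problems.append(f"'{name}' must be a string")
            continue
        # bool is a subclass of int in Python, so reject it explicitly.
        if expected == "integer" and (
            isinstance(value, bool) or not isinstance(value, int)
        ):
            problems.append(f"'{name}' must be an integer")
            continue
        if "enum" in rule and value not in rule["enum"]:
            problems.append(f"'{name}' must be one of {rule['enum']}")
        if "minimum" in rule and value < rule["minimum"]:
            problems.append(f"'{name}' must be >= {rule['minimum']}")
        if "maximum" in rule and value > rule["maximum"]:
            problems.append(f"'{name}' must be <= {rule['maximum']}")
    return problems
```

Calling `validate_arguments(SEARCH_ORDERS_TOOL["inputSchema"], {"status": "lost"})` returns a single problem naming the allowed values, which is the kind of message a model can act on when it retries.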
Frequently asked questions
What makes an MCP server production-ready?
A production-ready MCP server has clear tool descriptions and typed input schemas, returns errors as tool errors instead of crashing, enforces auth and rate limiting, uses the right transport (stdio or HTTP) and is deployed somewhere reachable, has automated tests and logging, and ships versioned releases with a breaking-change policy. This scorecard checks all 20 criteria.
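To make "errors returned as tool errors instead of crashes" concrete, here is a minimal sketch of a wrapper that turns an exception into a structured result the model can read. The `content` and `isError` fields mirror the MCP tool result shape. The `call_tool_safely` wrapper and the `get_invoice` handler are hypothetical names, and a real server would probably sanitize the error text before returning it.

```python
def call_tool_safely(handler, arguments):
    """Run a tool handler and always return a structured tool result.

    A failure becomes a result with isError=True rather than an unhandled
    exception that takes down the server process.
    """
    try:
        text = handler(**arguments)
        return {"content": [{"type": "text", "text": text}], "isError": False}
    except Exception as exc:  # broad on purpose: nothing should escape
        message = f"{type(exc).__name__}: {exc}"
        return {"content": [{"type": "text", "text": message}], "isError": True}


# Hypothetical handler used only to exercise the wrapper.
def get_invoice(invoice_id):
    if not invoice_id.startswith("inv_"):
        raise ValueError("invoice_id must start with 'inv_'")
    return f"Invoice {invoice_id}: paid"


ok_result = call_tool_safely(get_invoice, {"invoice_id": "inv_123"})
error_result = call_tool_safely(get_invoice, {"invoice_id": "123"})
```

Here `ok_result["isError"]` is `False` and `error_result["isError"]` is `True`, with the reason in the text content, so the model can correct the argument and try again.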
How is the score calculated?
Each criterion scores 2 for Yes, 1 for Partial, 0 for No — 40 points max. The percentage maps to the four tiers above. The gap list shows every criterion you didn't mark Yes, with the concrete fix.
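The arithmetic can be sketched in a few lines. This is an illustration of the scoring rule as described here, not the scorecard's actual source; the function names are hypothetical, and the tier boundaries assume each range is inclusive at its lower bound.

```python
POINTS = {"yes": 2, "partial": 1, "no": 0}


def score(answers):
    """Return (points, max_points, percent) for a list of answers.

    Each answer is "yes", "partial", or "no". With 20 criteria the
    maximum is 40 points.
    """
    if not answers:
        raise ValueError("at least one answer is required")
    points = sum(POINTS[a] for a in answers)
    max_points = 2 * len(answers)
    percent = 100 * points / max_points
    return points, max_points, percent


def tier(percent):
    """Map a percentage onto the four tiers."""
    if percent >= 85:
        return "Production-ready"
    if percent >= 60:
        return "Close"
    if percent >= 35:
        return "Risky"
    return "Not production-ready"


# Example: 12 Yes, 5 Partial, 3 No across the 20 criteria.
answers = ["yes"] * 12 + ["partial"] * 5 + ["no"] * 3
points, max_points, percent = score(answers)
result = tier(percent)
```

In this example the server earns 29 of 40 points, or 72.5%, which lands in the "Close" tier. Because each point is worth 2.5% on a 40-point scale, no score can fall between 84% and 85%.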
Is anything I enter sent anywhere?
No. The scorecard runs entirely in your browser. Nothing is uploaded, stored, or transmitted. Refreshing the page clears it.
Can someone just build it for me?
Yes. The maintainer of protodex.io builds production MCP servers from your API or OpenAPI spec for a flat $900, shipped in about 5 days, with auth, error handling, tests, security scanning, and install docs included.