I analyzed 932 resolved Polymarket markets — can you actually beat them?
Prediction markets feel beatable. The prices are just crowd guesses, right? I pulled 18.7M price snapshots across 18,600+ Polymarket markets and tested three concrete questions on resolved markets with known outcomes — no look-ahead, no cherry-picking. Here's what the data actually says.
1. Are the prices calibrated?
If prices are calibrated, markets trading at 30¢ should resolve YES about 30% of the time. I bucketed every resolved market by its price ~6 hours before expiry and compared each bucket's average price to its actual YES rate.
The result: the points sit on the diagonal. Across the full sample, mean(actual − price) was about −1.5 percentage points with a z-score around −1.4, which is *not* statistically significant at the usual 5% level (|z| < 1.96). In plain terms: the market is calibrated to within the transaction fee. Any residual gap is roughly what it costs to trade it.
That's the signature of an efficient market. The lesson: a naive "this looks mispriced" forecasting edge mostly isn't there.
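For anyone who wants to reproduce the calibration check on their own data, here is a minimal sketch. It is not the notebook's code: the function names and the toy inputs are hypothetical, prices are assumed to be in the 0 to 1 range, and the z-score treats markets as independent, which is an assumption (related markets can move together).

```python
import math
from statistics import mean, stdev


def calibration_table(prices, outcomes, n_buckets=10):
    """Bucket markets by price; return (avg price, YES rate, count) per bucket."""
    buckets = {}
    for p, y in zip(prices, outcomes):
        idx = min(int(p * n_buckets), n_buckets - 1)  # a price of 1.0 goes in the top bucket
        buckets.setdefault(idx, []).append((p, y))
    return [
        (mean(p for p, _ in pts), mean(y for _, y in pts), len(pts))
        for _, pts in sorted(buckets.items())
    ]


def calibration_gap(prices, outcomes):
    """Return (mean of outcome - price, z-score of that mean)."""
    gaps = [y - p for p, y in zip(prices, outcomes)]
    avg = mean(gaps)
    # Standard error of the mean; assumes independent markets and needs
    # at least two markets whose gaps are not all identical.
    std_err = stdev(gaps) / math.sqrt(len(gaps))
    return avg, avg / std_err


# Toy data only (NOT from the dataset): price ~6h before expiry, 1 = resolved YES.
toy_prices = [0.5, 0.5, 0.5, 0.5]
toy_outcomes = [0, 0, 0, 1]
gap, z = calibration_gap(toy_prices, toy_outcomes)
print(f"mean gap = {gap:+.3f}, z = {z:+.2f}")
```

A well-calibrated market shows each bucket's YES rate close to its average price and a z-score well inside ±1.96.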
2. Does "buy the dip" (crash-recovery) work?
The popular thesis: when a market's YES price crashes from high to low, it has overshot, so buy the dip. I backtested it honestly, including every market that crashed and stayed dead (the trap most backtests quietly drop).
- ~800 crash entries
- win rate 8%, average entry 8.4¢
- EV per $1 staked: negative (the 8% win rate sits below the break-even hurdle of ~9.4%)
The dip is information, not a discount. Markets crash because the outcome genuinely became less likely. Fading that loses money after costs.
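The arithmetic behind those bullets can be sketched as follows. This is an approximation built from the averages quoted above, not the backtest itself: the function names are hypothetical, and the 1¢ per-share cost is an assumption inferred from the gap between the 8.4¢ average entry and the ~9.4% hurdle.

```python
def ev_per_dollar(win_rate, entry_price, cost_per_share=0.0):
    """Expected profit per $1 staked on YES shares held to resolution.

    Each share costs entry_price + cost_per_share and pays $1 on YES, $0 on NO.
    """
    all_in = entry_price + cost_per_share
    shares_per_dollar = 1.0 / all_in
    return win_rate * shares_per_dollar - 1.0


def break_even_win_rate(entry_price, cost_per_share=0.0):
    """Win rate at which the expected profit is exactly zero."""
    return entry_price + cost_per_share


# Averages from the crash backtest; the cost figure is an ASSUMPTION.
win_rate = 0.08
avg_entry = 0.084
assumed_cost = 0.01  # hypothetical: hurdle (~9.4%) minus average entry (8.4c)

print(f"EV before costs: {ev_per_dollar(win_rate, avg_entry):+.3f} per $1")
print(f"EV after costs:  {ev_per_dollar(win_rate, avg_entry, assumed_cost):+.3f} per $1")
print(f"break-even win rate: {break_even_win_rate(avg_entry, assumed_cost):.1%}")
```

Even before costs, an 8% win rate at an 8.4¢ entry is slightly negative; any trading cost pushes the hurdle higher and the EV lower.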
3. The trick that *does* add value: read the implied distribution
Here's the part worth keeping. Polymarket lists a *ladder* of crypto strikes for the same date — "BTC above $62k", "above $66k", "above $72k"… The YES price of "above K" is the market-implied survival function P(BTC > K). It must decrease as K rises.
Reconstruct it and you get the market's full belief about where BTC lands — for free. Example from the data (priced 24h before a June expiry):
| Strike | P(BTC > strike) |
| --- | --- |
| $62,000 | 97% |
| $66,000 | 63% |
| $72,000 | 0.4% |
| $76,000 | 0.15% |
That implies a median around $67,200, and the ladder is internally consistent (no monotonicity violations, so no ladder arbitrage that day). When a ladder *isn't* monotonic, meaning a higher strike is priced as more likely than a lower one, buying YES on the lower strike and NO on the higher one locks in a profit before fees and spread, and you can scan for that automatically.
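Here is a minimal sketch of both checks on the ladder from the table. The function names are hypothetical, and the median uses linear interpolation between the two strikes that bracket 50%, which is an assumption about the method rather than a statement of how the toolkit does it. On this ladder it lands at roughly $67,200.

```python
def monotonicity_violations(ladder):
    """Return adjacent strike pairs where the higher strike is priced above the lower.

    ladder: iterable of (strike, yes_price) with prices in the 0-1 range.
    Checking adjacent pairs is enough: any violation between distant strikes
    implies at least one violation between neighbours.
    """
    pts = sorted(ladder)
    return [(lo, hi) for lo, hi in zip(pts, pts[1:]) if hi[1] > lo[1]]


def implied_median(ladder):
    """Strike where the implied survival function crosses 50%.

    ASSUMPTION: linear interpolation between the two bracketing strikes.
    Returns None if no adjacent pair brackets 50%.
    """
    pts = sorted(ladder)
    for (k0, p0), (k1, p1) in zip(pts, pts[1:]):
        if p0 >= 0.5 >= p1 and p0 > p1:
            return k0 + (p0 - 0.5) / (p0 - p1) * (k1 - k0)
    return None


# Ladder from the table above: (strike, P(BTC > strike)).
ladder = [(62_000, 0.97), (66_000, 0.63), (72_000, 0.004), (76_000, 0.0015)]

print("violations:", monotonicity_violations(ladder))
print(f"implied median: ${implied_median(ladder):,.0f}")
```

A non-empty violations list is only a candidate: whether it is tradable depends on the actual bid and ask on each leg, not on mid or last prices.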
The takeaway
As a price-taker at retail size, you won't out-forecast these markets — they're efficient to within fees, six different ways I tested. The value isn't in *predicting* the market; it's in *reading* it: the implied distribution, calibration curves, and clean historical data are genuinely useful for hedging, sizing, and research.
If you want to run these analyses yourself: the full dataset (18.7M snapshots) is here, and the Quant Toolkit bundles it with the notebook that produced every number above — calibration test, crash backtest, and the implied-distribution extractor — so you can point it at any market and get an answer.
*Data from the pipeline behind protodex.io.*