Binary Events: What Happens When You Split One Market Into Twenty

We look at how Polymarket handles complex questions by breaking them into multiple yes/no contracts. Examining metadata from the Gamma API, functionSPACE argues that this "fragmented" approach creates a "resolution gap": liquidity fails to spread evenly across all outcomes.

Prediction markets are increasingly being used to price things that aren't binary. "Where will BTC close this month?" isn't a yes/no question. Neither is "What will the Fed rate be?" or "How hot will September be?" These are questions about continuous quantities (numbers on a range) and the answers live somewhere on a spectrum, not at yes or no.

Polymarket is built on binary contracts. To represent a range of outcomes, they must split one question into many separate yes/no markets, each covering a slice of the range, each with its own order book, each trading independently. They do this at scale, and it's the natural approach given the primitive they're built on.

A note on what this is: this is a metadata analysis - event- and market-level volume, liquidity, and prices from the Gamma API. No trade-level data, no order book snapshots. We're describing the structural shape of how multi-market events work on Polymarket, not how individual traders behave within them. Saguillo et al. (2025) recently found ~$40M in arbitrage profit from Polymarket's fragmented binary structure; we look at the same architecture from a different angle. Notebooks and the data pipeline are linked in the appendix.

We wanted to understand what that splitting actually costs.

When Polymarket takes one question and turns it into 10 or 20 or 50 binary contracts, what happens to the liquidity? How many of those contracts actually get used? And what does the structure mean for the precision of information the market can capture? Splitting is a natural way to represent multi-outcome questions with binary contracts, but it raises structural questions we wanted to explore: how does volume actually distribute across these markets? How many of them function? And what could the $0.01 tick size mean at different price levels?

What we hypothesised

We expected to find that splitting one question into many binary contracts creates structural costs:

  1. Volume concentrates in a few markets regardless of how many are created, because traders and liquidity providers naturally gravitate toward the most "interesting" outcomes.

  2. Tail markets become ghost markets with little or no trading activity, especially as N grows, because the $0.01 tick size makes low-probability contracts too coarse to trade on.

  3. The fixed tick size creates disproportionate friction for low-priced contracts, making the information space at the tails of the distribution structurally harder to price than at the centre.

  4. Adding more markets doesn't solve these problems because each new market is another independent order book that needs its own liquidity. The capital doesn't spread, it concentrates.

We pulled a large set of Polymarket events via the Gamma API. 36,777 events containing 190,783 individual binary contracts across 6 categories (Politics, Crypto, Sports, Finance, Culture, Weather). Full history from July 2022 to March 2026. We excluded ~20,000 "Up or Down" micro-trading events (5-minute binary bets on crypto price direction) which are always single-market and would have skewed the dataset without adding analytical value.

The dataset

Of our 36,777 events, just over half (18,863) use multi-market structures, i.e. two or more binary contracts grouped under one event. These multi-market events contain 172,869 individual markets - 90% of the total. The remaining 17,914 single-market events are straightforward yes/no questions.

Within the multi-market events, 64% use what Polymarket calls "negRisk" - mutually exclusive outcomes where exactly one market resolves YES. Think election candidates (only one wins) or temperature brackets (it can only fall in one range). The other 36% are independent markets that can each resolve separately, like "Will BTC reach $X?" threshold sets, where multiple thresholds could be hit.
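As a sketch, classifying an event into these three structures from its Gamma metadata might look like this. The `negRisk` and `markets` field names are assumptions about the API response shape, not verified against the current docs:

```python
def classify_event(event):
    """Classify a Gamma event dict into the three structures described above.
    Field names (`negRisk`, `markets`) are assumed, not verified."""
    n = len(event.get("markets", []))
    if n <= 1:
        return "single-market"
    # negRisk: mutually exclusive outcomes, exactly one resolves YES
    return "negRisk" if event.get("negRisk") else "independent"

# Toy events mirroring the three cases in the text
events = [
    {"negRisk": True, "markets": [{}] * 17},   # election-style: one winner
    {"negRisk": False, "markets": [{}] * 5},   # threshold set: several can hit
    {"negRisk": False, "markets": [{}]},       # plain yes/no question
]
labels = [classify_event(e) for e in events]
```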

Sports dominates by event count (8,284 multi-market events), but Politics generates the most structural complexity with a mean of 12.5 markets per event. Weather is interesting - 2,467 negRisk events, almost all numeric bracket structures for temperature and storm data.

How many markets per event?

For every multi-market event, we counted N, the number of constituent binary contracts. This is the "resolution" of the event: how pixelated is it?

The distribution is heavily right-skewed. There's a dominant spike at N=3 (~5,400 events) - mostly sports matchups and simple three-way questions. A secondary peak around N=10-11 (~2,600 events) reflects standardised bracket templates that Polymarket uses for crypto price thresholds, interest rate decisions, and temperature ranges. The tail stretches to N=159 (a particularly sprawling sports competition).

No, that is not a bug: crypto is almost exclusively in that 11-12 range!

Core finding: The median is 7 markets per event. At the 90th percentile, you're at 19.

What's notable is how each category has its own "resolution standard." Sports events cluster at very low N (median 3) because they're "who wins?" among a small field. Crypto is remarkably compressed, almost every crypto event sits at exactly N=11 because of a standardised price threshold template. Culture events have the widest variance and highest median (13), reflecting open-ended "who will...?" questions where Polymarket keeps adding potential outcomes.

What's the trend?

N is growing over time. The median went from ~5 in early 2023 to ~10 in early 2026. The platform is gradually adding more contracts per event - effectively trying to increase the resolution of its multi-market events. Part of this is real (new bracket-style events in crypto and finance), part is category mix (more high-N categories appearing over time).

Whether adding more contracts translates to more useful information is what the rest of this analysis examines.

Where does the volume actually go?

For each of the 16,856 events with volume above $1K, we ranked markets by volume and computed the cumulative share at each rank position. The question is simple: if an event has N markets, how many of them do you need to account for most of the trading activity?

The answer: 5.

The median event reaches 90% of its total volume by market rank 5. The interquartile range shows that even the less concentrated quartile of events reaches 90% by rank 8-9.

The concentration ratios tell the same story in a single number: the top market alone captures half of all event volume (median 49.5%), and the top 3 capture 93.8%. By rank 5, you've typically accounted for everything that matters.
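The rank-and-accumulate step is simple to sketch. The per-market volumes below are hypothetical, chosen to resemble the concentration pattern described above:

```python
def rank_to_reach(volumes, threshold=0.90):
    """Sort per-market volumes descending and return the rank at which
    the cumulative share first reaches `threshold` of event volume."""
    ranked = sorted(volumes, reverse=True)
    total = sum(ranked)
    running = 0.0
    for rank, v in enumerate(ranked, start=1):
        running += v
        if running / total >= threshold:
            return rank
    return len(ranked)

# A hypothetical 10-market event with concentrated volume
vols = [500_000, 300_000, 140_000, 30_000, 15_000,
        8_000, 4_000, 2_000, 800, 200]
rank90 = rank_to_reach(vols)        # 90% of volume reached by rank 3
top1_share = max(vols) / sum(vols)  # top market holds half the volume
```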

This confirms and extends a finding from Saguillo et al. (2025), who reported 90%+ of liquidity in the top 4 conditions per negRisk market. We see a very similar pattern across a broader dataset - 16,856 events including both negRisk and non-negRisk, across all categories.

What this tells us is that most of the "resolution" in a multi-market event is notional. An event might have 30 contracts, but trading activity concentrates in a handful. The rest exist as order books, but the volume there is minimal.


To understand whether adding more markets helps spread volume more evenly, we plotted top-3 concentration against N. The red line on the chart shows what you'd expect if volume distributed uniformly: at N=10, each market gets 10%, so the top 3 would hold 30%; at N=30, they'd hold 10%; and so on.

The data sits far above that line at every value of N. At N=10, the top 3 actually hold 73% (not 30%). At N=20, they hold 66% (not 15%). At N=30, they hold 63% (not 10%). The gap between observed and uniform - the "wasted resolution" - widens as N increases.
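The uniform baseline is pure arithmetic; plugging in the observed medians quoted above makes the widening gap explicit:

```python
def uniform_top3_share(n):
    """Top-3 volume share if an event's volume spread evenly over n markets."""
    return min(3, n) / n

# Observed median top-3 shares from the dataset vs the uniform baseline
observed = {10: 0.73, 20: 0.66, 30: 0.63}
gaps = {n: share - uniform_top3_share(n) for n, share in observed.items()}
# The "wasted resolution" gap widens: 43, 51, 53 points at N=10, 20, 30
```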

In practical terms, adding more markets to an event doesn't distribute volume more evenly. It creates a larger distance between the resolution the event appears to have and the resolution that's actually being used.

An important caveat here: this analysis treats all multi-market events together, but concentration has different implications depending on what kind of event you're looking at. In an election with 17 candidates, Trump and Harris dominating the volume isn't a structural problem - it's just how elections work. Having 15 low-volume long-shot candidates doesn't mean the market is broken; those candidates are genuinely unlikely.

Where the concentration becomes more structurally interesting is in numerical bracket events - BTC price ranges, interest rate decisions, temperature brackets. There, the tails represent real probability mass that should carry information, but the structure may make those tail markets hard to use. We haven't separated categorical from numerical events in this analysis (a part 2 will look at this), so the findings above mix both types. The concentration patterns hold across both, but the interpretation differs.

How many markets actually function?

The concentration analysis tells us volume is uneven. But how many markets are actually being used at all?

We defined a "ghost market" as any market within an event with less than $1K in lifetime volume - effectively untouched over its entire existence.

The ghost rate depends heavily on event size.

Small events (N=2-10, at 11,523 events the majority) have essentially no ghosts. Every market is functional. If you create 3 markets for an NBA game or 7 brackets for a rate decision, all of them tend to get some trading activity.

The picture changes above N=10. At N=11-20, the median ghost rate is 36% - about a third of markets never see meaningful activity. At N=21-50, it's 60%. At N=51+, it's 72%.

The right panel of the chart shows something perhaps more revealing: the functional market count plateaus. Events with N=15 typically have about 10 functional markets. Events with N=25 have about 11. Events with N=35 have about 12. The number of actually-functioning markets grows much more slowly than the total, and it levels off around 10-15 regardless of how many markets are created.

In aggregate across all active events: of 151,383 total markets, 24% have literally zero volume and only 64% exceed $1K. More than a third of the multi-market infrastructure on Polymarket has never been meaningfully used.
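The ghost-market definition reduces to a threshold count. A minimal sketch, with hypothetical lifetime volumes for an N=6 event:

```python
GHOST_THRESHOLD = 1_000  # $1K lifetime volume, the baseline used above

def ghost_rate(lifetime_volumes, threshold=GHOST_THRESHOLD):
    """Fraction of an event's markets below the activity threshold."""
    ghosts = sum(1 for v in lifetime_volumes if v < threshold)
    return ghosts / len(lifetime_volumes)

# Three functional markets, three ghosts -> ghost rate 0.5
rate = ghost_rate([250_000, 40_000, 9_000, 800, 450, 0])
```

The `threshold` parameter makes the sensitivity question explicit: raised to $10K, the $9K market above would flip to a ghost.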

One open question we need to explore: how sensitive is the ghost rate to the volume threshold? At $10K or $100K, the ghost rate would be significantly higher. We used $1K as a conservative baseline, representing "at least some trading occurred."

The rounding tax: a theory

Every Polymarket contract trades in $0.01 increments. This is a fixed architectural choice. The question is, what does that fixed tick size mean for contracts at different price levels?

Simply put, if a contract is priced at $0.50, moving from $0.49 to $0.50 is a 2% change. You can express meaningfully different beliefs one tick apart. That's coarse compared to traditional markets, but it works.

If a contract is priced at $0.02 (a 2% probability event), moving from $0.02 to $0.03 is a 50% jump. There's no way to say "I think this is 2.5% likely." You're forced to choose between 2% and 3%. At $0.01, the only possible prices are $0.01 and $0.02 - a 100% gap.

We call the ratio ($0.01 / contract price) the "rounding tax." It's not something we discovered in the data - it's arithmetic. What the data gives us is the distribution of prices across Polymarket's live markets, which tells us how many contracts actually sit at these different friction levels.
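Since the rounding tax is just arithmetic, a two-line sketch reproduces the examples above:

```python
TICK = 0.01  # Polymarket's fixed price increment

def rounding_tax(price, tick=TICK):
    """Smallest expressible relative price move at a given contract price."""
    return tick / price

# At $0.50 one tick is a 2% move; at $0.02 it's a 50% jump; at $0.01, 100%
taxes = {p: rounding_tax(p) for p in (0.50, 0.10, 0.02, 0.01)}
```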

A snapshot, not a historical measure.

We computed this across 12,765 markets with live tradeable prices in open events at the time of our data pull (March 31, 2026). Closed markets settle at exactly $0 or $1 and are excluded. This is a point-in-time snapshot - running this again next week would produce different results because prices move. If the distribution turned out to be stable over time, that would itself be an interesting finding, but we haven't tested that.

At the time of this snapshot, the median rounding tax across all open multi-market markets was 14.3%. For markets priced below $0.10, it was 111%.

For context: using recent annualised prices, oil futures have a tick size of ~0.015% of the asset price. The S&P E-mini is ~0.005%. These comparisons aren't apples-to-apples (probability contracts are structurally different from commodity futures), but they illustrate how much coarser the resolution is for low-priced prediction market contracts.

The theory this suggests is that the fixed $0.01 tick creates an information gradient where higher-priced contracts (near $0.50) can express relatively fine-grained beliefs, while low-priced contracts (the tails) are structurally limited to very coarse resolution. Whether this actually suppresses trading activity or information quality in tail markets is an empirical question we can't answer with metadata alone.

How this could be tested: Trade-level data would let us measure how frequently tail markets actually tick (change price) compared to higher-priced markets in the same event, and whether price staleness in tail markets correlates with real-world information that should be moving them. That would turn this structural observation into an empirical finding.

Open structural questions

Two properties of this architecture are worth noting, though we don't have the data to test them empirically in this analysis.

The resolution gap. The $0.01 tick size on a low-priced contract creates a minimum probability change that may be quite large in real-world terms. A BTC price bucket priced at $0.02 can only move in 50% increments ($0.01 to $0.02). How much real-world price movement is required to justify that jump? And does the market stay stale while real information accumulates below the threshold? Testing this would require trade-level data: measuring the frequency of price ticks in tail vs high-priced markets, and comparing against external reference prices (e.g. BTC spot) over the same period.

The precision penalty. Binary contracts pay $1 or $0. Two traders with very different levels of confidence in the same outcome - one precise, one uncertain - buy the same contract at the same price and earn the same payoff. The structure cannot distinguish between them. Whether this matters in practice (do traders actually hold meaningfully different distributions within a single bucket?) is an empirical question. One approach would be to examine traders who buy multiple adjacent buckets in the same event - a crude attempt to express a distribution using binary pieces. Comparing their PnL profiles to single-bucket traders' might reveal whether the market is losing information at the discretisation boundary. This requires trade-level data linked across markets within an event.

What we're building differently

At functionSPACE, we're building a prediction market primitive where traders express beliefs as continuous probability distributions rather than binary yes/no positions. Instead of splitting a question into N separate order books, there is one shared surface where the range of outcomes is tradeable.

This matters because as prediction markets mature beyond high-attention binary questions into pricing continuous quantities (economic indicators, asset prices, climate data), the fidelity of the instrument needs to match the complexity of the question. The concentration and ghost market findings above suggest that the binary approach runs into structural limits when N grows: liquidity doesn't spread, tails go dark, and precision is lost.

A distribution-native primitive avoids this by construction: there are no separate buckets to fragment across, no ghost markets to accumulate, and the structure can reward a trader who is precisely right differently from one who is vaguely right.

The analysis in this piece doesn't prove that continuous distributions are better - that's a separate empirical question we're working toward. But it does characterise the structural ceiling that the binary approach hits, and why we think the primitive itself needs to change as the use cases get more demanding.

TLDR of findings

1. Volume concentrates in ~5 markets per event regardless of N - The top 3 markets capture 94% of event volume at the median. The cumulative curve flatlines by rank 5. This holds across 16,856 events.

2. Adding more markets creates more ghost markets, not more resolution - Below N=10, every market functions. Above N=10, ghost rates scale steeply, reaching 72% at N=51+. Functional market count plateaus at 10-15 regardless of total N.

3. N is growing but concentration isn't improving - The platform is adding more contracts per event over time, but the gap between claimed resolution and used resolution widens with N.

4. The categorical vs numerical distinction matters and there are valuable threads to pull - Concentration in election markets is expected behaviour. Concentration in price bracket markets has different structural implications. Separating these would sharpen the analysis and is the most important follow-up.

These findings describe structural properties of multi-market architectures that use independent binary contracts. Polymarket has obviously built an impressive platform that serves its purpose well - the high-volume, high-attention markets at the centre of each event work. The structural questions arise at the edges: in the tails, in the brackets, in the parts of the probability space where the most nuanced information should live but where the architecture makes it hardest to express.

Limitations
  • Metadata only, no trade-level data. We used event and market-level volume, liquidity, and prices from the Gamma API. We cannot measure trader behaviour, order book dynamics, or how liquidity changes over a market's lifecycle. The ghost market and concentration findings are based on lifetime volume totals, not activity patterns.

  • Point-in-time liquidity. The `liquidityClob` field is a snapshot of the current order book, not historical. For closed markets this is near-zero (drained after settlement). We used volume as the primary metric for distribution analysis, and restricted liquidity-dependent analysis to open events only.

  • Categorical and numerical events mixed. We analysed all multi-market events together. Concentration in election markets (categorical) is structurally expected and has different implications than concentration in price bracket markets (numerical). Separating these would sharpen the analysis materially. This is the most important follow-up.

  • 6 categories, not all. We fetched Politics, Crypto, Sports, Finance, Culture, and Weather. Events tagged only as Science, Tech, World, Business, or Entertainment (and not cross-tagged with our 6) are excluded. Spot checks confirmed minimal overlap missed except for Weather, which we added.

  • Micro-event exclusion. ~20,000 "Up or Down" (5-minute crypto binary bets) and ~60 "In-Game Trading" events were excluded during fetch. These are always single-market and confirmed to never appear as multi-market events.

  • Rounding tax uses open-event prices only. Closed markets settle at $0 or $1, which are not tradeable prices. The 12,765 markets in the rounding tax analysis are from open events with active prices.

  • Worked examples are illustrative. The resolution gap and precision penalty sections use real Polymarket parameters but are arithmetic illustrations, not empirical tests. We noted how each could be evidenced with trade-level data in future work.

——————

Appendix

Notebooks and data pipeline

https://github.com/igorfunctionspace/polymarket-discretisation

How Polymarket represents complex questions

On Polymarket, questions that have more than two possible outcomes are broken into multiple binary (yes/no) contracts:

https://docs.polymarket.com/concepts/markets-events

  • Event: A question or topic on Polymarket. "Presidential Election 2024" or "BTC price in March" is an event.

  • Market: A single binary contract within an event. Each market has its own YES and NO tokens, its own order book, and trades independently. "Will Trump win?" is one market inside the election event.

  • Single-market event: An event with just one market. A simple yes/no question like "Will TikTok be banned?"

  • Multi-market event: An event split into 2 or more markets. "Who wins the election?" becomes 17 separate binary contracts, one per candidate, each trading on its own order book.

  • negRisk: Exactly one market resolves YES. Election candidates and price brackets are typically negRisk. Threshold markets ("Will BTC reach $X?") are not, because multiple thresholds can be hit.

——————

Related work

Saguillo et al. (2025) - "Unravelling the Probabilistic Forest: Arbitrage in Prediction Markets." Analysed 10,237 markets (17,218 conditions) on Polymarket over Apr 2024 - Apr 2025. Found ~$40M in realised arbitrage profit from binary market fragmentation - the structural price incoherence that arises when related outcomes trade on separate order books. Their key finding that 90%+ of liquidity concentrates in the top 4 conditions per negRisk market is the departure point for our analysis. We confirm their concentration result across a broader dataset (16,856 events vs their 1,578 negRisk markets) and look at what happens to the rest of the markets: the ghost market rate, the rounding tax on low-priced contracts, and the structural limits on precision.

Capponi et al. (2025) - "Semantic Trading Clusters and Relationship Prediction in Prediction Markets." Found leader-follower trading patterns across semantically correlated but structurally isolated Polymarket markets, yielding ~20% ROI. Their finding that related markets don't stay in sync is consistent with the structural fragmentation described here -each market is its own island, and information doesn't flow between them as efficiently as it would in a unified instrument.

Igor leads Research at @functionspaceHQ, an open-source project exploring market-led resolution and composable belief instruments for prediction markets.


© 2026 functionSPACE
