Binary Events: Does Liquidity Trade The Tails?

Which of Polymarket's multi-market pathologies come from discretising a continuous quantity versus the binary architecture itself? By splitting 18,863 events into continuous (price, margin, temperature buckets) and categorical (candidates, teams) slices and re-running the v1 analysis, functionSPACE shows concentration is architecture-wide, ghost markets are largely a categorical phenomenon, and a continuous-distribution primitive is a sharper fix than v1 suggested.

What we found last time

In our v1 analysis we looked at what it costs to split a single question into multiple binary contracts on Polymarket. The dataset was the full census of multi-market events - 18,863 events containing 172,869 binary contracts across 6 categories. Two pathologies came out cleanly. First, volume concentrates in the top 5 markets per event regardless of how many markets the event contains. Second, the ghost market rate (a ghost market being one with under $1K lifetime volume) scales with N, the number of markets in the event: 36% of markets are dead at N=11-20, rising to 72% at N=51+.

That cut was multi-market vs single-market, i.e. what does splitting cost compared to running the question as one market? But v1 treated all multi-market events the same. "Who wins the NBA championship?" and "What price will BTC close at this month?" sat in the same dataset. The first is a list of named teams - a genuinely unordered set. The second is a continuous quantity bucketed into arbitrary price ranges. The concentration findings held across both, but the argument that a continuous-distribution primitive could fix these problems only applies to the second type.

v2 looks at continuous vs the rest

The question for v2 was simple - when we separate events that discretise a continuous quantity (price ranges, margin-of-victory percentages, temperature brackets) from events that list named entities (election candidates, sports teams, award nominees), do the pathology findings survive?

The answer turned out to be more interesting than "yes" or "no". Some findings held across both types. One finding - the ghost market scaling - turned out to be largely a categorical phenomenon, obvious in hindsight but enlightening nevertheless.

How we classified events

We built a heuristic classifier that labels each of the 18,863 multi-market events based on the structure of its sibling market questions. The method extracts the varying fragment across siblings (strip the common prefix and suffix from each question title), then tags each fragment as numeric or categorical using pattern-matching signals.
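The fragment-extraction step described above can be sketched in Python as follows. This is a minimal illustration, not the classifier used for the analysis: the token-wise stripping, the `NUMERIC_TOKEN` pattern, the connector list, and the three-way `label_event` rule are all assumptions made for this example. The actual decision rules (including the discrete numeric and ambiguous labels, which are not reproduced here) are the ones documented in the sprint notes.

```python
import re

# Hypothetical numeric signal: optional comparator and currency sign, digits,
# optional decimal part, optional unit suffix (k/m/b, %, °C/°F), optional "+".
NUMERIC_TOKEN = re.compile(
    r"^[<>]?\$?\d[\d,]*(?:\.\d+)?(?:[kmb%]|°[cf])?\+?$", re.IGNORECASE
)
CONNECTORS = {"to", "and", "or"}  # words allowed inside a numeric range (assumed)


def shared_token_count(token_lists):
    """Count leading tokens that are identical across every list."""
    count = 0
    for column in zip(*token_lists):
        if len(set(column)) != 1:
            break
        count += 1
    return count


def varying_fragments(questions):
    """Strip the common prefix and suffix (token-wise) from sibling questions."""
    tokens = [q.split() for q in questions]
    prefix = shared_token_count(tokens)
    suffix = shared_token_count([t[::-1] for t in tokens])
    # Keep prefix and suffix from overlapping on the shortest question.
    suffix = min(suffix, min(len(t) for t in tokens) - prefix)
    return [" ".join(t[prefix:len(t) - suffix]) for t in tokens]


def tag_fragment(fragment):
    """Tag one fragment as 'numeric' or 'categorical'."""
    parts = [p.strip("?.,!") for p in re.split(r"[\s\-]+", fragment)]
    parts = [p for p in parts if p and p.lower() not in CONNECTORS]
    if parts and all(NUMERIC_TOKEN.match(p) for p in parts):
        return "numeric"
    return "categorical"


def label_event(questions):
    """Label an event from the tags of its sibling fragments."""
    tags = {tag_fragment(f) for f in varying_fragments(questions)}
    if tags == {"numeric"}:
        return "continuous"
    if tags == {"categorical"}:
        return "categorical"
    return "mixed"
```

Stripping token-wise rather than character-wise is a deliberate choice in this sketch: a character-level common prefix would cut "$100,000" and "$110,000" down to "0" and "1", which loses the information the numeric test relies on.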

| Label | Events | Volume | Share of volume |
|---|---|---|---|
| categorical | 10,179 | $25.8B | 72.0% |
| continuous | 7,751 | $6.7B | 18.6% |
| discrete numeric | 42 | $2.8B | 7.8% |
| mixed | 838 | $0.5B | 1.4% |
| ambiguous | 53 | $0.05B | 0.1% |

Categorical dominates by volume because of a handful of very large events - the 2024 Presidential Election ($3.7B), NBA Champion ($1.7B), Super Bowl ($1.2B). The continuous slice is $6.7B across 7,751 events, driven by crypto price thresholds, weather brackets, political margin-of-victory questions, and date timing events.

The 42 discrete numeric events are all central bank rate decisions ($2.8B). We excluded these from the continuous headline because the Fed only chooses between a small fixed set of basis-point moves, and a continuous-distribution primitive does not add resolution where the outcome set is not continuous in the first place. We explored extending this exclusion to other quantities (tariff rates, turnout, vote counts, medal counts) and found that none of them is genuinely policy-bounded the way Fed events are. The full decision rules, keyword lists, and QA spot checks are documented in the sprint notes.

The classifier is an experimental model. The point is transparency - the reader can check our working and argue with any specific decision. Here is a sanity check to show how these are split:

How much of Polymarket is continuous and how is it changing?

By event count, continuous has been growing fast. In 2026Q1 continuous events overtook categorical for the first time - 3,012 new continuous events vs 1,192 categorical. By volume, categorical still dominates historically because of election-cycle spikes ($5.3B in 2025Q3 alone). But continuous volume has been growing steadily - from $452M in 2025Q1 to $1.7B in 2025Q4 - driven by crypto and weather events that don't spike with political cycles.

| Category | Continuous vol | Categorical vol | Total vol | % continuous |
|---|---|---|---|---|
| Politics | $3,788M | $12,608M | $19,511M | 19% |
| Sports | $87M | $11,814M | $11,962M | <1% |
| Crypto | $1,872M | $77M | $1,963M | 95% |
| Culture | $463M | $1,097M | $1,569M | 29% |
| Finance | $198M | $176M | $534M | 37% |
| Weather | $264M | $0.4M | $267M | 99% |

Crypto is 95% continuous. Weather is 99%. These are almost entirely events that discretise a continuous quantity into binary buckets - BTC price brackets, temperature ranges, precipitation thresholds. Sports is the opposite - 99% categorical (team winners). Politics is the battleground: $3.8B continuous (margins, timing, seat counts) alongside $12.6B categorical (election winners).

The growth trend matters because it means the discretisation cost analysis applies to a growing share of the platform. As Polymarket expands beyond high-attention election markets into crypto, weather, and finance, a larger fraction of its events are the type where continuous-distribution primitives would be the natural fit.

Liquidity concentration: both labels hit 90% by rank 6

For each event we ranked markets by volume and computed the cumulative share at each rank. The question is the same one from v1: how many of the N markets do you need to account for most of the trading?
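The per-event ranking described above can be sketched as follows. This is an illustrative sketch under stated assumptions: the function names are hypothetical, events are represented simply as lists of market volumes, and treating an event with fewer markets than the rank as already at 100% is one possible convention, not necessarily the one used in the analysis.

```python
from statistics import median


def cumulative_share_by_rank(volumes):
    """Cumulative volume share at rank 1..N for one event.

    Markets are sorted by volume, largest first.
    """
    total = sum(volumes)
    if total <= 0:
        return []  # events with no volume carry no ranking information
    shares, running = [], 0.0
    for volume in sorted(volumes, reverse=True):
        running += volume
        shares.append(running / total)
    return shares


def median_curve(events, max_rank=10):
    """Median cumulative share at each rank across a list of events.

    Assumption: an event with fewer markets than `rank` counts as 100%.
    """
    curves = [cumulative_share_by_rank(v) for v in events]
    curves = [c for c in curves if c]
    result = []
    for rank in range(1, max_rank + 1):
        at_rank = [c[rank - 1] if rank <= len(c) else 1.0 for c in curves]
        result.append(median(at_rank))
    return result
```

Running `median_curve` separately on the continuous and categorical events would give one curve per label, which is the shape of comparison the table in this section reports.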

| Rank | Continuous (cumulative share) | Categorical (cumulative share) |
|---|---|---|
| 1 | 31.7% | 42.8% |
| 3 | 68.9% | 78.0% |
| 5 | 89.0% | 93.2% |
| 10 | 99.4% | 99.9% |

Both curves are steep and flatten before rank 10. The median continuous event reaches 90% of total volume by rank 6. Categorical gets there by rank 5.

Concentration in the top handful of markets is a property of the multi-market binary architecture. It holds on both labels with similar magnitude. Categorical is slightly more concentrated because in an election or championship the favourite naturally dominates. In a price bracket event the "at-the-money" bucket is hot but doesn't swallow 43% of volume because adjacent brackets also carry meaningful probability.

The small gap between the labels is worth noting. Continuous events spread volume slightly more evenly across their top buckets, which means there is real multi-bucket trading activity. Traders are engaging with 5-6 buckets, not just one. A continuous distribution would capture that existing demand on a single shared instrument instead of splitting it across 5-6 separate order books.

Concentration vs N

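| N tier | Continuous top-3 share | Categorical top-3 share | If uniform (3/N) |
|---|---|---|---|
| 3-5 | 89% | 100% | 75% |
| 6-10 | 67% | 93% | 38% |
| 11-15 | 67% | 79% | 23% |
| 16-20 | 49% | 73% | 17% |
| 21-30 | 33% | 63% | 12% |
| 31-50 | 28% | 62% | 8% |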

Continuous events distribute better at higher N. The cumulative curve in the previous section shows median behaviour across all N values; this view tests whether adding more markets spreads volume more evenly. We plotted top-3 concentration against N for every event, with a red reference line showing where uniform allocation (3/N) would sit. At N=21-30 the continuous top-3 share is 33%, against 63% for categorical and 12% under uniform allocation.
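The comparison against the uniform reference line can be sketched as below. The function names are hypothetical and the sketch assumes each event is given as a list of per-market volumes; it shows the arithmetic behind the 3/N baseline rather than the code used for the analysis.

```python
def top_k_share(volumes, k=3):
    """Share of an event's total volume held by its k largest markets."""
    total = sum(volumes)
    if total <= 0:
        raise ValueError("event has no volume")
    return sum(sorted(volumes, reverse=True)[:k]) / total


def uniform_top_k_share(n_markets, k=3):
    """Top-k share if volume were spread evenly over n_markets (capped at 100%)."""
    return min(1.0, k / n_markets)


def excess_concentration(volumes, k=3):
    """How far an event's top-k share sits above the uniform baseline."""
    return top_k_share(volumes, k) - uniform_top_k_share(len(volumes), k)
```

For example, `uniform_top_k_share(8)` gives 0.375, in line with the 38% shown for the N=6-10 tier if that column is evaluated near the middle of the tier. The article does not state which N was used for each tier, so that reading is an assumption.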

Ghost markets: a categorical phenomenon

A ghost market is a market within a multi-market event with less than $1K in lifetime volume. v1 found ghost rates of 36% at N=11-20, 60% at N=21-50, and 72% at N=51+. The implication was that adding more buckets creates dead markets.

v2 shows that finding was largely driven by categorical events.

Categorical ghost rate is roughly 2-3x higher than continuous at N=6-50. Sports tournaments with longshot teams, election fields with no-hope candidates, primary races with placeholder names - these are the events where Polymarket creates many binary contracts and most go dead. On continuous events (BTC price brackets, weather temperature ranges, margin percentages) the bucketing is more functional. Most buckets at intermediate N do get traded.

Continuous events still show meaningful ghost rates at N=11-20 (40%) and N=51+ (45%), so the architecture is not perfect.

Note: the N=21-50 continuous tier has only 108 events, dominated by Elon Musk tweet-count events (high engagement across all buckets). The mean (17%) is more representative than the median (0%) for this small sample. We use mean throughout for stability.
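A minimal sketch of the ghost-rate calculation, assuming a per-event ghost rate that is then averaged within each label and N tier (which is how the mean-versus-median note above reads). The `TIERS` edges below N=11 are an assumption, as are the function names and the `(label, volumes)` input shape.

```python
from collections import defaultdict
from statistics import mean

GHOST_THRESHOLD_USD = 1_000  # from the article: under $1K lifetime volume

# Upper tiers follow the article (11-20, 21-50, 51+); the low-N tiers are assumed.
TIERS = [(2, 5), (6, 10), (11, 20), (21, 50), (51, None)]


def tier_for(n_markets):
    """Return the tier label for an event with n_markets markets, or None."""
    for low, high in TIERS:
        if n_markets >= low and (high is None or n_markets <= high):
            return f"{low}-{high}" if high is not None else f"{low}+"
    return None


def ghost_rate(volumes, threshold=GHOST_THRESHOLD_USD):
    """Fraction of an event's markets with lifetime volume below the threshold."""
    return sum(1 for v in volumes if v < threshold) / len(volumes)


def mean_ghost_rate_by_tier(events):
    """Mean per-event ghost rate, keyed by (label, tier).

    `events` is an iterable of (label, volumes) pairs.
    """
    buckets = defaultdict(list)
    for label, volumes in events:
        tier = tier_for(len(volumes))
        if tier is not None:
            buckets[(label, tier)].append(ghost_rate(volumes))
    return {key: mean(rates) for key, rates in buckets.items()}
```

Swapping `mean` for `statistics.median` in the last line reproduces the other aggregation mentioned in the note, which is where a small tier dominated by fully traded events can show a median of 0%.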

Where liquidity falls off

The ghost chart uses a fixed $1K threshold. To show where liquidity actually drops off as a continuous function, we computed a survival curve: at each volume threshold from $10 to $1M, what fraction of markets have at least that much lifetime volume?
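The survival curve defined above is straightforward to compute. The sketch below assumes a flat list of lifetime volumes per market and uses powers of ten as thresholds; the function name and the exact threshold grid are illustrative choices, not taken from the analysis notebooks.

```python
from bisect import bisect_left

# Log-spaced thresholds from $10 to $1M (assumed grid).
THRESHOLDS = [10 ** exponent for exponent in range(1, 7)]


def survival_curve(volumes, thresholds=THRESHOLDS):
    """Fraction of markets with lifetime volume >= each threshold."""
    if not volumes:
        raise ValueError("no markets supplied")
    ordered = sorted(volumes)
    n = len(ordered)
    # bisect_left gives the count of markets strictly below the threshold,
    # so a market sitting exactly on the threshold counts as alive.
    return {t: (n - bisect_left(ordered, t)) / n for t in thresholds}
```

Computing the curve once per label, or once per N tier within the continuous slice, gives the two views discussed in this section.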

Continuous markets survive longer at every threshold. 65% of continuous markets have at least $1K in volume vs 49% of categorical. At $10K the gap is 37% vs 27%.

The more interesting view is the continuous slice broken out by event size:

  • Small events (N=2-10) stay above 80% alive until ~$10K. The architecture works at low N. Most buckets get meaningful volume.

  • Medium events (N=11-20) drop more steeply, with the liquidity cliff between $1K and $10K.

  • Large events (N=21+) drop earliest, with the cliff starting at ~$100.

The liquidity cliff shifts left as N grows. At N=2-10 the cliff is at ~$10K. At N=21+ it is at ~$100. The architecture handles continuous bucketing reasonably well at low N and breaks at high N. The failure point is a smooth function of event size.

Here is a clean table to summarise this:

| Volume threshold | Continuous alive | Categorical alive |
|---|---|---|
| $100 | 73% | 59% |
| $1,000 | 65% | 49% |
| $10,000 | 37% | 27% |
| $100,000 | 11% | 9% |

What v2 shows

v1 told a story about binary discretisation creating dead markets and un-tradeable tails. v2 splits the dataset by event type and finds the story is more specific than that.

Concentration is a property of the architecture. Both continuous and categorical events reach 90% of total volume by rank 5-6. This holds on both sides of the split with similar magnitude.

But continuous events distribute better as N grows. At N=21-30 the top 3 markets capture only 33% of continuous event volume (vs 63% for categorical). Traders on continuous events are already distributing capital across more buckets. The demand for multi-bucket expression is visible in the data.

Ghost markets are largely a categorical problem. The v1 finding that 60-72% of markets are dead at high N was driven by sports tournaments and election long-shots. Continuous events have substantially fewer ghosts at every N tier. The architecture handles continuous bucketing reasonably well at intermediate N, but that conclusion weakens as the volume threshold used to define a ghost is raised.

The liquidity cliff shifts left with N. The survival curve shows continuous markets at N=2-10 stay above 80% alive until ~$10K. At N=21+ the cliff starts at ~$100. The architecture works at low N and breaks at high N, and the failure point is a smooth function of event size.

Polymarket's continuous activity is growing. Continuous events overtook categorical by event count in 2026Q1. Crypto (95% continuous) and Weather (99% continuous) are growing categories. The discretisation cost analysis applies to a growing share of the platform.

Notes and limitations

Analysis based on 18,863 multi-market events and 172,869 individual markets from the Polymarket Gamma API (July 2022 - March 2026). Classification methodology and reproducible notebooks: https://github.com/igorfunctionspace/polymarket-discretisation/tree/main/v2

  • Metadata only, no trade-level data. We used event and market-level volume, liquidity, and prices from the Gamma API. We cannot measure trader behaviour, order book dynamics, or how liquidity changes over a market's lifecycle.

  • Point-in-time liquidity. The liquidityClob field is a snapshot. We used volume as the primary metric and restricted liquidity-dependent analysis to open events.

  • Heuristic classifier. The continuous/categorical labels are produced by a regex-based classifier on event titles, not by manual tagging. The full rules, QA process, and known gaps are documented in the sprint notes. 0.1% of events are ambiguous.

  • 6 categories. We fetched Politics, Crypto, Sports, Finance, Culture, and Weather. Events tagged only as Science, Tech, World, Business, or Entertainment are excluded.

  • Sports spreads in the continuous slice. ~281 NFL/NBA matchup events ($63M) sit in the continuous slice because their spread/total spans parse as numeric. These are compound sports-betting events and should probably be labelled mixed. They account for 0.9% of continuous volume.

© 2026 functionSPACE
