Information as supply

The value of prediction markets lies not just in trading volume, but in their role as low-cost "information infrastructure" for real-time probability estimates. While current estimates project a $1 trillion market from absorbing sports betting and financial derivatives, we argue the real "supply-side" unlock, collapsing the cost of accurate forecasting, will create a "long tail" of millions of niche markets. By shifting from centralized apps to open protocols, prediction markets can evolve from entertainment tools into a global layer for decision-making, capturing the massive latent demand currently serviced by the $418 billion consulting and data industry.

On a recent episode of Cheeky Pint, Tarek and Luana from Kalshi shared a salient stat: "80% of users visit to check odds, not to trade." This sets the scene for one of the more interesting and mind-bending questions for prediction markets:

If information is still the product, what is the TAM for prediction markets? Is TAM even the right frame?

Our springboard is a recent A1 research piece that maps the demand-side addressable markets: $125B in sports betting, $15T in event-driven financial derivatives, and $25B in weather derivatives. The authors establish a credible, mathematically derived floor of $85-90 billion in annual volume, with a challenging but attainable path to $1T. We largely agree with this estimate, with the exception of including the entire ~$15T pool of financial risk derivatives in the base. A large chunk of it cannot be captured, at the least because of the capital-efficiency limitations still imposed on binary PMs, plus microstructure and institutional design ceilings, which is why the answer here is not simply: PM TAM = $15 trillion.
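To make the shape of that arithmetic concrete, here is a minimal sketch of how a demand-side floor can be assembled from the headline figures above. The penetration shares are our own illustrative assumptions, not numbers from the A1 piece.

```python
# Illustrative demand-side floor, built from the headline figures cited above.
# The penetration shares are hypothetical assumptions, only meant to show the
# shape of the calculation, not to reproduce the A1 model.

segments = {
    # segment: (addressable annual volume in $, assumed share captured by PMs)
    "sports_betting":      (125e9, 0.40),   # assumption: PMs absorb a large slice of handle
    "event_driven_derivs": (15e12, 0.002),  # assumption: only a thin sliver is PM-suitable
    "weather_derivatives": (25e9,  0.20),   # assumption: partial migration of bespoke OTC flow
}

floor = sum(volume * share for volume, share in segments.values())
print(f"Illustrative annual-volume floor: ${floor / 1e9:.0f}B")  # ~$85B with these shares

# The sensitivity is dominated by how much of the $15T derivatives pool you
# believe is genuinely PM-addressable, which is exactly the caveat above.
```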

TAM frameworks typically only measure the engine's fuel (transaction volume), completely missing the value of the systemic output. So in addition to demand, we want to add the supply-side argument:

We believe prediction markets will reach $1T because the cost to produce real-time probability estimates is collapsing while the number of decisions that benefit from them is effectively unbounded.

Doing the arithmetic on this type of proposition is tricky. The transition represents far more than a simple increase in transactional volume; it embodies a paradigm shift in how society integrates probabilistic information into existing systems.

Determining exactly how information will or should be useful is a classic Bounded Rationality problem. We cannot predict the exact emergent behaviours of a complex system once a new, powerful feedback loop (cheap, accurate probabilistic data) is introduced. Instead, we should focus on the core principles.

What we do know: the cost to produce useful forecasts is collapsing, the distribution constraints are loosening, and the information product already outperforms many alternatives at trivial scale. The trajectory toward a trillion-dollar market is highly complex. To achieve this scale, the ecosystem must resolve critical structural bottlenecks spanning liquidity formation, market creation accessibility, regulatory friction, and retail-to-institutional distribution channels.

Information will always be the product

In the same podcast, Tarek references Kevin Hassett's paper arguing that as society gets more complex, asset pricing requires markets on the individual factors driving outcomes, not just the aggregate: "You need infinite markets... prediction markets started this notion." The example he used: Citrini's AI crisis report triggers a sell-off, Kalshi launches a market, the odds settle at 10%. "If you can put that back into pricing models, maybe the markets wouldn't have reacted."

In classic information economics, information goods have high fixed costs (creating the first "copy") and low marginal costs (reproducing or distributing additional copies) (Varian, Shapiro).
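Written out in that standard framing, with labels we introduce here (F for the fixed cost of producing the first estimate, c for the marginal cost of serving one more consumer of it), the average cost per consumed forecast collapses toward the marginal cost as consumption scales:

```latex
% F = fixed cost of the first estimate (market creation, liquidity seeding, resolution)
% c = marginal cost of one more consumer reading the price
AC(n) = \frac{F + c\,n}{n} = \frac{F}{n} + c \;\xrightarrow[\;n \to \infty\;]{}\; c \approx 0
```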

The existing demand for foresight is currently serviced by the ~$418 billion financial services consulting and data industry.

How much, then, is the latent demand for information about the future?

By leveraging financial incentives to extract information from a broad, global participant base, PMs reduce the latency, structural limitations, and biases inherent in traditional polling mechanisms or centralised expert panels. This significantly lowers both the fixed and marginal costs required. These are the principles.

While cost is the right frame, liquidity still has to form, and it forms in a particular order.

Liquidity formation is ordered

There is a sequencing to how prediction market liquidity develops, and getting this ordering right matters more than the absolute depth at any single point.

Where is skin in the game fun? (Entertainment demand). We are largely here right now: sports, politics, pop culture. The consumer IS the participant. Sports accounts for 50-80% of all PM volume. This is not a temporary phase to be outgrown. It is the early foundation on which everything else is built. Entertainment demand subsidises the production cost of the forecasting infrastructure that later serves more serious use cases.

Where is forecasting useful? (Information demand). We are starting to arrive here. Economic indicators, policy outcomes, corporate earnings, climate. The Federal Reserve Board now publishes Kalshi-derived probability estimates alongside its own survey data. The consumer at this stage is the decision-maker who needs the forecast. PM data is already better than alternatives at a fraction of the cost. The product is the probability estimate; trading is the production mechanism. Both important.

Do we need institutional adoption? The most common path-to-$1T argument is the derivatives-replacement thesis: PMs become hedging tools for institutional capital. Per our research, no serious institutional capital is deploying into PMs for hedging yet. The instruments are still too thin, and the legal infrastructure is still being defined. And not all derivatives can be replaced by prediction markets. While some argue that basis risk is still high and that speculator counterparties are lacking on bespoke contracts, this is a reflection of existing system structure rather than a mathematical limitation.

So far, PMs have shown that institutional adoption isn't a prerequisite for liquidity to form but a consequence. If the information product continues to prove itself, this step grows. We know institutions are already using the data; the next step is deploying capital.

Prediction markets are structurally unique

Prediction markets are the best known mechanism for producing incentive-aligned, real-time probability estimates on any arbitrary question. This is not merely a marginal improvement over existing alternatives.

Options and futures provide the most liquid and tight signal for outcomes within their domain. You could logically deduce that any prediction market trying to replace a very liquid derivative is unlikely to produce a better price signal, given that traders are using the prices from those established markets to then make predictions. There is a clear ordering of value here: deep, liquid options markets on the S&P 500 will outperform a prediction market on the same underlying. That is not where the opportunity sits.

Instead, it sits in the domains these instruments cannot reach. Exchange-traded derivatives are gated by access requirements, largely standardised, and limited to a narrow set of underlyings. OTC derivatives are slow, expensive, and gated by decades of established trust relationships and permissioned system architectures underwritten by centuries of contract law and state enforcement. Neither adapts easily to novel questions, niche domains, or the kind of granular factor-level pricing Hassett describes. Weather derivatives represent a $25 billion market concentrated among massive energy and agricultural firms. PMs can expand access to smaller agents that are currently priced or trust-gated out of these bespoke OTC contracts. The same logic applies across insurance, geopolitical risk, regulatory outcomes, and hundreds of other domains where the demand for probabilistic information exists but the instrument to supply it does not.

The structural claim is this: prediction markets are not just trying to be better futures markets. They are trying to be the forecasting layer for everything futures markets cannot or will not cover.

To improve the information product we need more markets, many, many more.

The long-tail of markets

Kalshi's stated goal of 100,000 markets is the floor. If the information product thesis is correct, the number of useful questions is bounded only by the number of uncertain outcomes that matter to someone, somewhere. That number is very large.

Direct vs proxy properties enable a more expanded question-to-market surface

The conventional wisdom in market design is that liquidity must be deep to be useful. Adhi's work on minimum viable liquidity (MVL) offers a more productive frame (which inspired a large chunk of this thesis). The question is not "how much liquidity does a market need?" but rather "how little liquidity can a market have and still produce a useful signal?" This reframing changes the scaling law entirely. Under a deep-liquidity paradigm, every new market is expensive: you need market makers, subsidies, and critical mass before the market functions. Under MVL, every new market is cheap, so the feasible frontier shifts from "a few highly liquid markets" to "massive breadth with targeted depth."
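A toy simulation, under assumptions of our own (traders holding independent noisy estimates of the true probability, the "price" taken as their simple average), illustrates why a handful of participants can already clear the "useful signal" bar:

```python
import random

def simulate_thin_market(true_p, n_traders, noise=0.15, trials=2000):
    """Toy model: each trader holds a noisy private estimate of the true
    probability; the 'market price' is just their equal-weight average.
    Returns mean Brier scores for the thin market vs. a flat 50% prior."""
    market_brier, prior_brier = 0.0, 0.0
    for _ in range(trials):
        estimates = [min(max(random.gauss(true_p, noise), 0.01), 0.99)
                     for _ in range(n_traders)]
        price = sum(estimates) / n_traders
        outcome = 1.0 if random.random() < true_p else 0.0
        market_brier += (price - outcome) ** 2
        prior_brier += (0.5 - outcome) ** 2
    return market_brier / trials, prior_brier / trials

# Even a market with a handful of participants and meaningful noise usually
# produces a sharper signal than "no information" (a 50/50 prior):
for n in (3, 5, 20):
    m, p = simulate_thin_market(true_p=0.72, n_traders=n)
    print(f"{n:>2} traders: market Brier {m:.3f} vs prior Brier {p:.3f}")
```

Under this (deliberately crude) model the thin market's Brier score beats the uninformative prior even at three participants; the question MVL asks is how little depth is needed for that advantage to survive real-world frictions.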

There is latent demand for this long tail among specialised traders. Tarek provides an example: "the best inflation forecaster on Kalshi is a guy in Kansas who never traded before." This is the mechanism working as designed. Individuals and groups converting specialised, localised knowledge into priced information - Hayek in action. As Austin's Messari coverage of culture markets has shown, these categories are growing precisely because they provide a brand new way to trade information that you previously couldn't, or as Tarek puts it, information that previously "just wasn't interesting".

A key constraint in the long tail is not demand but liquidity mechanics. For example, Niche Data Nerd X may possess a highly accurate, research-backed 72% probability assessment for a macroeconomic event currently mispriced at 65% on Polymarket. But the severe lack of depth prevents them from deploying enough capital to generate meaningful absolute returns. The forecaster's maximum profit is mathematically capped, rendering the endeavour financially unviable relative to the hours spent researching the topic. The binary CLOB is not the right tool for this dynamic. It optimises for head markets with deep two-sided flow, or for market-maker income for the specialised. The long tail needs something structurally different and interesting.
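The arithmetic behind that cap is worth spelling out. The 72%/65% figures come from the example above; the book depth, research hours, and opportunity cost below are hypothetical numbers we plug in purely for illustration:

```python
# Toy arithmetic on why thin depth caps the niche forecaster's return.
# The 72% / 65% figures come from the example above; depth, hours, and the
# opportunity-cost rate are assumptions of ours.

belief   = 0.72    # forecaster's researched probability
price    = 0.65    # best ask on the YES side
depth    = 500.0   # assumed $ available at that price before it moves
hours    = 10      # assumed hours spent on the research
alt_rate = 50.0    # assumed opportunity cost per hour, $

shares = depth / price                                             # ~769 YES shares
expected_profit = shares * (belief * (1 - price) - (1 - belief) * price)
print(f"Expected profit: ${expected_profit:.0f}")                  # ~$54
print(f"Opportunity cost of research: ${hours * alt_rate:.0f}")    # $500

# A 7-point edge on a correctly researched question still loses to the
# forecaster's own time unless the book is much deeper.
```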

Heads or Tails

Head markets attract makers organically. High-volume political and sports markets generate enough counterparty flow that market makers can earn the spread reliably, and informed traders can enter and exit at scale. The long tail lacks this self-reinforcing dynamic. It requires mechanisms that can produce useful signals from thin participation, where the information product, not the trading profit, is the primary output.

So far, the distribution of value across prediction market questions follows the same power-law pattern visible across every successful platform economy. YouTube shows familiar power-law effects, with unverified reports that 90% of content is generated by 1% of creators; on Polymarket, approximately 5% of markets command more than 80% of the liquidity. But YouTube's overall economic dominance (contributing over $55 billion to U.S. GDP in 2024) is secured by the cumulative engagement of its millions of niche creators and billions of specific, long-tail videos that cater to highly specialised audience interests.
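For intuition on how that concentration arises, a toy Zipf-style allocation of liquidity across markets reproduces the same head-heavy shape. The exponent below is an assumption chosen for illustration, not a fit to Polymarket data:

```python
# Toy Zipf-style distribution of liquidity across markets, showing how a
# "5% of markets hold ~80% of liquidity" pattern can arise from rank alone.

n_markets = 10_000
exponent = 1.1  # assumption, not fitted to real order-book data
weights = [1 / (rank ** exponent) for rank in range(1, n_markets + 1)]
total = sum(weights)

top_5pct = sum(weights[: n_markets // 20]) / total
print(f"Share of liquidity in the top 5% of markets: {top_5pct:.0%}")  # ~80%
```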

Brynjolfsson and co emphasise that online markets can generate significant value by expanding selection and lowering search costs, enabling economically meaningful demand outside the head of the distribution. The nuance, as debates in HBR highlight, is that long-tail value does not mean blockbusters disappear. In some settings, superstar concentration can strengthen. The same is likely true for prediction markets. The CPI and election markets will remain the deepest and most liquid for the time being, but the long tail is where the information product thesis either proves itself or doesn't. If a market on Brazilian soybean export policy with $3,000 in open interest still produces a probability estimate that outperforms the best available alternative, the mechanism is working.

If we carry the YouTube-versus-traditional-media contrast over, then "Who will trade on a soybean policy market?" is answered with: "The same type of person who watches a 2-hour video on 14th-century plumbing."

# of markets x open access = information^2

For the long tail to produce value at scale, several structural conditions must be met.

Creation should be un-gated. If launching a new market requires permission from a platform (or worse), the breadth of questions that get priced will always be limited by that platform's editorial and policy judgment, regulatory appetite, and commercial incentives. A well-functioning long tail is not something a centralised actor could operationalise. The question space is too large and too heterogeneous for any single entity to curate effectively. AI agents may do this, and they do not operate on an app layer. Apps can instead be the place where users meet prediction-question demand, embedded within the required information pipelines.

Discoverability must be solved. A million markets are useless if the participants with relevant expertise cannot find them. This is the prediction market equivalent of search and recommendation: the system needs to route specialised knowledge to the questions where it is most valuable. This is a distribution and infrastructure challenge, not a liquidity challenge.

Distribution should be open. Luana frames this: "how do we get very sustainable, solid liquidity in this long tail of markets, that's how we think about our moat long term." The answer includes moving beyond the binary contract as the default primitive. Binary markets compress belief into a single dimension, fragmenting capital across parallel yes/no positions. Moving toward richer instruments (continuous ranges, conditional markets, combinatorial structures) expands the surface area for information expression without requiring proportional increases in participation.
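As a sketch of what "richer instruments" could look like at the data-model level, the hypothetical types below are ours, not any platform's schema; the point is only that belief can be expressed over ranges, conditions, and combinations rather than fragmented parallel yes/no pairs.

```python
# Hypothetical sketch of richer market primitives beyond the binary contract.
# Class names and fields are illustrative, not an existing platform's schema.
from dataclasses import dataclass

@dataclass
class BinaryMarket:
    question: str                  # resolves YES/NO

@dataclass
class ScalarMarket:
    question: str
    lower: float                   # continuous range, e.g. CPI between 2.0 and 6.0
    upper: float

@dataclass
class ConditionalMarket:
    condition: BinaryMarket                     # "if X happens..."
    outcome: "BinaryMarket | ScalarMarket"      # "...what is the probability/level of Y?"

@dataclass
class CombinatorialMarket:
    legs: list                     # one joint position across several outcomes,
                                   # priced as a single instrument instead of
                                   # fragmented parallel yes/no books
```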

Liquidity in the long tail will not look like liquidity in the head. It doesn't need to. What it needs is enough depth to produce a signal that is useful to someone. The bar for "useful" is lower than the bar for "tradable at institutional scale," and that difference is where most of the latent value sits.

How?

For prediction markets to become the general-purpose information infrastructure the thesis requires, two things must be true:

  1. Any uncertain outcome must be representable and tradable as a primitive.

  2. Any developer must be able to use that primitive without becoming a venue.

Apps cannot solve (2). They are structurally incentivised to hoard liquidity, user attention, and resolution control. A venue that creates markets, hosts trading, and adjudicates outcomes has every incentive to keep participants within its walled garden. This is rational behaviour for the venue, but it constrains the system's informational throughput.

A neutral substrate can solve (2), but only if it is not owned by one app, its economics are sustainable, its primitives are expressive enough, and its resolution mechanism is robust and credibly neutral.

This distinction between apps and substrates maps directly to the platform-versus-protocol debate that played out in DeFi's earlier layers. Uniswap as a protocol enables an ecosystem of interfaces, aggregators, and applications that collectively produce more liquidity and more utility than any single app could alone. The same architectural logic applies to prediction markets: the value of the probability estimate scales with the number of systems that can consume it, and the number of systems that can consume it scales with the openness of the infrastructure that produces it.
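A minimal sketch of that split, with a hypothetical interface of our own naming (not any existing protocol's API): the application consumes the probability estimate, and optionally the resolution, without creating markets, hosting order flow, or adjudicating outcomes itself.

```python
# Minimal sketch of the app-versus-substrate split. The interface is
# hypothetical; the point is that an app-layer consumer never touches
# venue machinery.
from typing import Optional, Protocol

class PredictionSubstrate(Protocol):
    def get_probability(self, market_id: str) -> float: ...
    def get_resolution(self, market_id: str) -> Optional[bool]: ...

def reprice_exposure(substrate: PredictionSubstrate,
                     market_id: str,
                     exposure_if_event: float) -> float:
    """App-layer consumer: e.g. a risk model scaling an exposure by the
    substrate's live probability estimate."""
    return substrate.get_probability(market_id) * exposure_if_event
```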

The reinforcing feedback loop

When the structural conditions above are met, a reinforcing (positive) feedback loop emerges. This is the driver of exponential growth in the system:

Lower cost of production. Technological and structural changes (loosening creation constraints, cheaper computation, AI-assisted market design) widen the inflow valve, making it cheaper and easier to create prediction markets.

Increased output. This generates a larger, more diverse stock of probabilistic information across domains that were previously unpriced.

Unbounded application. Because the cost is collapsing, this information can be integrated into everyday decision-making systems. Corporate risk models, insurance pricing, policy design, journalistic fact-checking, personal finance, autonomous agents. As mentioned, the number of decisions that benefit from better probability estimates is effectively unbounded.

Increased utility and demand. As more external systems rely on this information, it draws more attention, more liquidity from the 20% who trade and further subsidises the cost of forecasting. The information product improves, which attracts more consumers, which generates more demand for new markets.

Loop repeats.
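A toy model of the loop, with every parameter an assumption of ours, shows the compounding shape the argument relies on:

```python
# Toy dynamics of the reinforcing loop described above. All parameters are
# assumptions; the point is the shape (compounding market count while the
# per-market cost falls), not the magnitudes.

markets, cost_per_market, demand = 1_000, 10_000.0, 1.0

for year in range(1, 6):
    cost_per_market *= 0.6                          # assumed: production cost keeps collapsing
    markets = int(markets * (1 + 0.8 * demand))     # cheaper production -> more markets
    demand *= 1 + 0.3 * (1_000 / cost_per_market)   # more usable output -> more external demand
    print(f"year {year}: {markets:>8,} markets, "
          f"${cost_per_market:,.0f}/market, demand index {demand:.2f}")
```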

This is the same dynamic that drove the expansion of search, social media, and cloud infrastructure. The marginal cost of the core utility collapses, the number of use cases that benefit from it expands faster than anyone can enumerate in advance, and the resulting demand funds the next iteration of the infrastructure.

The difference is that prediction markets produce something with a specific economic property that search results and social posts do not (the probability). That property makes the output composable with financial systems in a way that other information products are not.

Conclusion

If prediction markets only "eat" existing markets, the TAM is large, but it remains coupled to existing reasons people already trade (entertainment and hedging) and existing constraints of standardised financial products. The $85-90 billion floor is credible. The path to $1T in volume might be attainable through straightforward expansion across sports, politics, economic indicators, and early-stage institutional data consumption.

But the supply-side thesis is that the bigger unlock is not substitution but cost collapse. The marginal cost of producing "good probabilities" falls so far that prediction markets stop being "a place to trade" and become global information infrastructure, a generalised layer for publishing, verifying, and consuming real-time expectations.

The cost to produce useful forecasts is collapsing. The demand for better probabilistic information already exists and is large: it is serviced today by a ~$418 billion consulting and data industry, and the latent demand is even bigger. The information product already outperforms alternatives at trivial scale in existing domains.

The correct framing is not "prediction markets versus derivatives." It is: what happens when the cost of knowing the future's probability distribution approaches zero?

How many decisions become better once good probability estimates are cheap and accessible? Trading TAM has a calculable answer; information TAM does not, which is why TAM may not be the right frame at all.

p.s

A fun and simple one-line equation that we could potentially use to price the info TAM:

Info TAM = Demand TAM / 20%

I.e. if the 20% trade / 80% consume rule holds, then whatever trading figure you apply, you can hypothetically deduce it creates 5x the value in information. For example, a $1T trading TAM would imply roughly $5T of total information value.

Igor leads Research at functionSPACE, an open-source project exploring market-led resolution and composable belief instruments for prediction markets.

© 2026 functionSPACE