Argaam Weekend, Week #87

Data Is Not a Commodity: How AI Is Rewriting the Pricing Rules


Saudi Arabia is rapidly building an AI-driven economy, but the way value is currently assigned to data risks falling behind the way AI actually creates that value.

Today, most Saudi data and content creators are compensated through one-time payments, even though the AI models trained on them continue to generate revenue long after training ends.

This creates a structural gap between who enables AI value creation and who ultimately captures it. This analysis argues that Saudi data should not be treated as a one-off input, but as a productive economic asset.

When data continues to create value at scale—through repeated use of AI models—efficient economic systems reward that contribution through ongoing participation, not fixed buyouts. This is not a moral claim; it is a pricing and incentive problem.

Viewed this way, the question is no longer whether data creators deserve compensation, but how to design mechanisms that align creator participation, investor returns, and national data policy objectives as AI adoption accelerates.

Economically, the key distinction is simple. Inputs are used up once; assets keep generating value over time.

In modern AI, much of the value is created after training, when models are used repeatedly in applications, a stage known in AI terminology as the ‘inference’ phase. That makes high-quality local data closer to a productive asset than a consumable input.

Note: "Inference" refers to the phase in which a trained model is used to make predictions or decisions based on new, unseen data. After a model has been trained on a large dataset to learn patterns and relationships, inference is the process of applying that knowledge to analyze individual inputs and generate outputs in real-world applications.


The Cost of Sharing Nothing

Unlike oil, data does not get used up. A hospital can, with its patients' consent, share its patient records with ten different research teams simultaneously, and each team gets the full dataset: nothing is depleted, nothing is divided.

Economists call this property non-rivalry, and it has a profound implication: the more widely data is shared, the more value it can generate across society as a whole, because the same raw material is powering multiple engines at once rather than just one.

This creates a compounding effect that does not exist with physical goods. The more a dataset is used, the more applications it generates, the more refined the insights become, and the greater the return on the original investment in collecting it. In economic terms, data exhibits increasing returns — it gets more valuable, not less, the more it circulates.

Why This Breaks Standard Economics
In classical economics, the optimal price for any good equals its marginal cost — the cost of producing one more unit. For rival goods, this creates a natural pricing logic.
For data, it creates a paradox. Once a dataset exists, sharing it with one more researcher costs almost nothing. The efficient price, by economic logic, should be zero. But pricing it at zero destroys the incentive to collect data in the first place.
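
The paradox can be made concrete with a few lines of arithmetic. The sketch below is illustrative only: the function names and the collection cost of 1,000,000 are hypothetical, and the ten buyers echo the ten research teams in the hospital example. It assumes the marginal cost of sharing is exactly zero, which is a simplification of "almost nothing".

```python
def collector_profit(price_per_share: float, n_shares: int,
                     fixed_cost: float, marginal_cost: float = 0.0) -> float:
    """Profit to the data collector: revenue minus collection and sharing costs."""
    revenue = price_per_share * n_shares
    total_cost = fixed_cost + marginal_cost * n_shares
    return revenue - total_cost


def break_even_price(n_shares: int, fixed_cost: float,
                     marginal_cost: float = 0.0) -> float:
    """Lowest per-share price at which the collection cost is recovered."""
    return marginal_cost + fixed_cost / n_shares


# Hypothetical figures: one dataset, shared with ten research teams.
FIXED_COST = 1_000_000.0   # assumed one-off cost of collecting the data
N_TEAMS = 10

# Pricing at marginal cost (zero) leaves the whole collection cost unrecovered.
print(collector_profit(0.0, N_TEAMS, FIXED_COST))   # -1000000.0

# The collector only breaks even at a price above marginal cost.
print(break_even_price(N_TEAMS, FIXED_COST))        # 100000.0
```

Under these assumptions, any price that keeps the collector whole has to sit above marginal cost, which is the gap between efficient sharing and the incentive to collect.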

[Figure: rival good (steel), where marginal cost rises, versus non-rival good (sharing data).]

They Bought Your Data. They Kept Your Future

When a Saudi content creator — whether a media company, a dataset provider, a cultural archive, or an individual producer — signs a contract transferring their data or content to an AI company, they are almost certainly signing away far more than they realise.

This follows from incomplete-contracting theory: when contracts cannot specify every future use of an asset, who controls the asset matters, because control carries residual rights (the right to decide later uses) and therefore captures much of the residual value.

In AI, a one-time “buyout” effectively transfers those residual rights from creators to data acquirers, despite uncertainty about future monetisation pathways.

So, when a Saudi data owner sells their data to an AI company today, neither the owner nor the buyer can fully anticipate what that data will power in five years. Will it train a large language model that generates billions in revenue? Very likely, yes.

Because contracts are incomplete, someone has to hold the right to make decisions about all the future uses the contract did not anticipate. That someone is whoever controls the asset.

Human-Generated Data as a Steady Income Stream

Recent market signals show that data owners are already moving toward recurring, usage-linked monetisation rather than selling access once, since the true value of a dataset is not knowable at the time of the transaction.

Reddit’s licensing deal with Google was reported at about $60m per year, explicitly framing human-generated data as a repeatable revenue stream rather than a one-off sale or a disposable asset.

Before deals like this one, AI companies were largely scraping data freely or negotiating one-time, undisclosed transfers that treated human-generated content as a raw material with no ongoing commercial value.

The Financial Times, for example, has announced a licensing and partnership agreement with OpenAI to bring attributed FT content into ChatGPT and related products. The specific terms of the deal were not disclosed at the time of the announcement.

If the FT deal were structured as a buyout, it would combine mispricing with friction, because the buyout model has two compounding problems.

It gets the price wrong structurally, and it is expensive and inefficient to operate: every buyout requires its own negotiation, its own legal review, and its own pricing guess. That process costs time and money each time, without making the data more valuable or the price more accurate.

The starting point is to shift how data costs are framed. Under buyouts, data is treated as a fixed, upfront expense, while AI revenues are realised over time.

A usage-based system instead prices data as a variable production factor, incurred only when value is generated. This reduces upfront cash burn and aligns costs with realised revenues—an outcome that is often more attractive to investors than large, irreversible data purchases.
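
A minimal sketch of the difference in cash-flow timing, assuming a buyout paid entirely in the first period and a usage-based fee set as a fixed share of each period's revenue. All figures, the 10% share, and the function names are hypothetical; amounts are whole currency units and the share is in basis points, so the arithmetic stays exact.

```python
def buyout_payments(upfront_price: int, n_periods: int) -> list[int]:
    """Buyout: the full data cost is paid in the first period, whatever happens later."""
    return [upfront_price] + [0] * (n_periods - 1)


def usage_payments(revenue_by_period: list[int], data_share_bps: int) -> list[int]:
    """Usage-based: each period's data cost is a fixed share of that period's revenue."""
    return [revenue * data_share_bps // 10_000 for revenue in revenue_by_period]


# Hypothetical inputs, for illustration only.
UPFRONT_PRICE = 150_000    # assumed one-time buyout price
DATA_SHARE_BPS = 1_000     # assumed 10% of revenue, in basis points
scenarios = {
    "strong adoption": [0, 200_000, 800_000, 2_000_000],
    "weak adoption": [0, 50_000, 50_000, 50_000],
}

for label, revenues in scenarios.items():
    buyout = buyout_payments(UPFRONT_PRICE, len(revenues))
    usage = usage_payments(revenues, DATA_SHARE_BPS)
    print(label, "| buyout:", buyout, "total", sum(buyout))
    print(label, "| usage: ", usage, "total", sum(usage))
```

In both scenarios the usage-based buyer pays nothing before revenue arrives. With these assumed numbers the creator receives more than the buyout price when adoption is strong and less when it is weak, which is the sense in which cost tracks realised value.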

Operationally, value is assessed at inference time: the moment an AI model actually produces an output for a paying customer (answering a question, generating a document, analysing data) is the moment revenue is earned, not the moment the model was trained.

That payment might come through an API charge, a monthly subscription, a software licence, or a corporate contract. Revenue is split into three predefined components as shown in the following chart.

Once total revenue is known and attribution is complete, the pool is divided between three parties, each of whom contributed something essential to making the AI model work commercially.

[Chart: illustrative revenue split among three parties, including the model provider.]

Note on the chart: revenue shares are illustrative estimates inspired by industry benchmarks (music streaming pays rights holders 15–25%, publishing pays authors 10–25%, AI infrastructure 35%, and platforms 55%).
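
The split described above can be sketched in code. This is an illustration under stated assumptions, not the article's exact scheme: the party names are hypothetical, the 35% and 55% figures are taken from the note, and the remaining 10% is assumed here to go to data creators. Attribution is modelled as simple pro-rata weights, which is also an assumption.

```python
# Assumed shares in basis points (10_000 = 100%); the party names are hypothetical.
SHARES_BPS = {
    "data_creators": 1_000,    # assumed remainder: 100% - 35% - 55%
    "infrastructure": 3_500,   # 35%, from the note on the chart
    "model_provider": 5_500,   # 55%, the "platforms" figure in the note
}


def split_revenue(revenue: int, shares_bps: dict[str, int]) -> dict[str, int]:
    """Divide a revenue pool among parties by predefined shares.

    Floor division may leave a small unallocated remainder.
    """
    if sum(shares_bps.values()) != 10_000:
        raise ValueError("shares must sum to 100%")
    return {party: revenue * bps // 10_000 for party, bps in shares_bps.items()}


def allocate_to_creators(creator_pool: int, attribution: dict[str, int]) -> dict[str, int]:
    """Distribute the creators' pool pro rata to attribution weights."""
    total_weight = sum(attribution.values())
    if total_weight <= 0:
        raise ValueError("attribution weights must sum to a positive number")
    return {name: creator_pool * weight // total_weight
            for name, weight in attribution.items()}


# Hypothetical period: 1,000,000 in inference revenue, two contributing creators.
pool = split_revenue(1_000_000, SHARES_BPS)
payouts = allocate_to_creators(pool["data_creators"],
                               {"archive_a": 3, "publisher_b": 1})
print(pool)
print(payouts)
```

Keeping the shares in a single table makes the split auditable: anyone with the revenue figure and the attribution weights can recompute every payout.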


Concluding Remarks

Saudi Arabia's existing frameworks, the Personal Data Protection Law (PDPL) and SDAIA's AI Ethics Principles, already treat data as a strategic resource requiring transparent, accountable use.

A usage-based participation model simply extends that logic: not regulating what percentage creators receive, but ensuring AI systems are interoperable, auditable, and transparent enough for fair markets to function and Saudi data value to stay home.

Argaam.com Copyright © 2026, Argaam Investment, All Rights Reserved