AI Learns From Stories. Now Publishers Want To Get Paid

In the frenzy of excitement around the artificial intelligence revolution, a subtle but intriguing debate is unfolding among journalists and media executives around the world: If AI programs are being trained on our journalism, shouldn’t we be compensated for that work? You could call it the “Does AI owe the news?” debate.
A proposed solution now gaining traction in policy and media circles is licensing: AI companies would have to pay publishers when AI models are trained on their articles. The idea is not complicated, and it has gained momentum in recent months in legal and industry circles.
So why now? Almost all generative AI models are trained on vast amounts of content from the web, billions of pages of text pulled from the Internet. Along with blogs and scholarly articles, journalism plays an important role in that mix. News stories, investigations, analysis pieces, basically everything journalists put out every day, are part of the data that AI models learn from to improve their ability to explain concepts and draw connections.
But to news organizations, that exchange feels one-sided.
Consider this: A reporter may spend weeks or months gathering information, conducting interviews, fact-checking and writing a story. An editor edits, a lawyer vets the piece, and the whole process consumes valuable time and resources. Then an AI model trains on that work and can reproduce its substance in a matter of seconds. And the news organization that produced the original journalism is not paid a dime.
You can see why publishers might raise an eyebrow.
And this is not a theoretical exercise; it is already playing out in the courts. One high-profile example is the lawsuit filed by the New York Times against OpenAI and Microsoft, accusing the companies of using its reporting to train AI models. It is shaping up to be one of the defining copyright cases of the AI age.
Supporters of the licensing proposal point out that we have been here before. Music streaming transformed the music industry into a model where streaming services pay artists and rights holders every time a song is played. Some advocates believe AI could follow a similar pattern, with companies paying into a system that distributes payments to publishers whose content is used to train models.
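The streaming comparison usually means a pro-rata pool: a fixed pot of money divided in proportion to how often each rights holder's work is used. As a purely illustrative sketch (the publisher names, figures, and the idea that training "uses" could even be counted this way are all assumptions, not a real scheme), the arithmetic looks like this:

```python
# Hypothetical pro-rata royalty pool, streaming-style.
# All names and numbers below are invented for illustration.

def split_pool(pool: float, usage: dict[str, int]) -> dict[str, float]:
    """Divide a fixed pool in proportion to each holder's usage count."""
    total = sum(usage.values())
    return {holder: pool * count / total for holder, count in usage.items()}

# If training "plays" could be counted like song streams, a $1,000 pool
# with 600/300/100 uses would split proportionally:
payouts = split_pool(1000.0, {"Publisher A": 600, "Publisher B": 300, "Publisher C": 100})
# payouts == {"Publisher A": 600.0, "Publisher B": 300.0, "Publisher C": 100.0}
```

The catch, as the next paragraphs note, is that streaming has a clean per-play counter, while AI training has nothing equivalent.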
That sounds good. But it’s not that easy.
The biggest challenge is figuring out how much any given piece of content contributes to an AI model. With music, it is easy to track how many times a song is played. That doesn’t work with AI training. AI models are trained on millions of documents at once, absorbing patterns and probabilities, and it is difficult to estimate the value of any one article in that process. Researchers studying AI interpretability and training data have only recently begun to explore the question.
On the other hand, the tech companies developing AI say new licensing requirements would stifle innovation. They argue that AI models learn the way humans learn, by reading widely and drawing information from many, many sources. In their view, the Internet has always functioned as a public library.
But that comparison only stretches so far. A person who reads 10,000 articles is not a computer program that can answer questions from millions of people in seconds. An AI is. And that is what makes news organizations nervous.

Governments have begun to pay attention. Some countries have already explored policies to rebalance the relationship between technology companies and journalism.
Australia introduced a plan to force tech companies to negotiate payments with news publishers. It was contentious at the time, but it showed that governments are willing to intervene when they believe the media ecosystem is at risk. And the stakes are high. The media industry has been struggling financially for years: ad revenue has flowed to major tech platforms, local newspapers have closed or merged, and many outlets are still experimenting with subscription models.
Now AI is here, and some publishers worry it could drive even more readers away from their sites. Consider a scenario: someone asks an AI assistant to summarize a complex news story, and the assistant responds with a coherent summary. That is convenient for the reader, but they may never click through to the newsroom that produced the original reporting. That is really what this argument is about. If AI companies benefit from journalism, should they help pay for it?
Some believe the answer is obviously yes. Others worry that fees could stifle innovation or spark messy legal battles. For now, the debate continues: policymakers are exploring options, news organizations are advocating for protections, and AI companies are navigating a changing legal landscape.
But one thing seems abundantly clear: The era when AI companies could train on the Internet’s news archives without consequence is probably coming to an end. And however this battle plays out, it will shape the relationship between journalism and artificial intelligence for years to come.