After hobbling the news business by grabbing most of the advertising revenue and normalizing the giving away of content for free, Big Tech is using original articles created by journalists at surviving outlets to train its AI models, without giving credit to their work or providing any kind of compensation.
Big Tech companies are, in fact, hoovering up the content not only of newspapers and magazines but also of artists, authors and musicians, ballooning their own valuations while threatening the livelihoods of content creators. Copyright lawsuits filed against GenAI companies abound, alleging that the way they operate amounts to theft.
Bill Gross, one of Silicon Valley’s most prolific entrepreneurs, believes there is a better way than lawsuits to combat the problem: using tech of his own invention.
Generative AI cannot thrive on a foundation of stolen or uncredited content—it’s neither sustainable nor just, says Gross, CEO of ProRata.ai, a new company that uses tech to enable generative artificial intelligence (GenAI) platforms to attribute and compensate content owners.
Among other things, Gross, who has created more than 150 companies with more than 50 IPOs and acquisitions over the last 30 years, is widely credited with inventing “pay-per-click,” a novel way for search engines to make money on advertising, when he was running GoTo.com, a company he founded in 1998. Instead of paying for page views, an old-media model, advertisers pay only when people click on their ads.
Google paid GoTo.com to license its tech, and pay-per-click went on to create a multi-billion-dollar advertising business. Then came the latest disruption: Generative AI models like ChatGPT that respond to questions with knowledge gained from crawling content without credit or compensation. That essentially gives the builders of large language models a free ride on the massive investment made by the small number of surviving media outlets that have built successful business models around online journalism.
At the same time, OpenAI and other AI tech firms, which train their chatbots on a wide variety of online texts (from newspaper articles to poems, screenplays and books), are attracting billions of dollars in venture capital.
It doesn’t need to be a zero-sum game, says Gross. YouTube, which started out by using other people’s content, saw its business thrive when it started sharing revenue 50/50 with creators, and music streaming service Spotify has paid out billions of dollars to artists, so “it is completely possible to pay creators and make a viable business,” says Gross, who spoke about ProRata in January at the DLD technology conference in Munich and at an Axios side event at the World Economic Forum’s annual meeting in Davos.
“Why should Generative AI be an exception?” he asked during an interview with The Innovator.
Whereas Spotify’s payouts are based on the number of streams, the challenge with Generative AI was to figure out each source’s proportionate contribution to an answer. Gross, who has patented the technology, says he invented tech that can reverse-engineer where an answer came from and what percentage comes from a particular source, so that owners can be paid for the use of their material on a per-use basis. ProRata pledges to share half the revenue from subscriptions and advertising with its licensing partners, help them track how their content is being used by AIs, and aggressively drive traffic to their websites.
When a user poses a query, ProRata’s algorithm compiles an answer from the best information available. At the top of the page there is an attribution bar that specifies where the answer came from. It might say, for example, that 30% of this answer came from The Atlantic, 50% from Fortune and 20% from The Guardian. The publications are immediately compensated according to their contribution to the answer. A side panel displays the original articles and lets users click through to the original source to learn more.
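The payout arithmetic described above can be sketched in a few lines of Python. This is only an illustration of a proportional split, not ProRata’s patented attribution method, which the article does not describe. The function name `split_revenue` and the revenue figure of $0.10 per answer are assumptions made for the example; the 50% publisher share and the 30/50/20 attribution come from the article.

```python
from decimal import Decimal

# Hypothetical illustration only: the function name and the revenue figure
# are assumptions, not part of ProRata's actual system.
PUBLISHER_SHARE = Decimal("0.5")  # the article says half of revenue is shared


def split_revenue(revenue, attribution):
    """Split the publishers' half of `revenue` in proportion to attribution.

    `attribution` maps a source name to its integer percentage contribution
    to an answer; the percentages must sum to 100.
    """
    total = sum(attribution.values())
    if total != 100:
        raise ValueError(f"attribution must sum to 100, got {total}")
    pool = Decimal(str(revenue)) * PUBLISHER_SHARE
    return {
        source: pool * Decimal(pct) / Decimal(100)
        for source, pct in attribution.items()
    }


# The article's example attribution bar, with an assumed $0.10 of revenue
# earned on this one answer.
payouts = split_revenue(
    "0.10", {"The Atlantic": 30, "Fortune": 50, "The Guardian": 20}
)
for source, amount in payouts.items():
    print(f"{source}: ${amount:.4f}")
```

Under these assumed numbers, the publishers’ pool for the answer is five cents, of which Fortune, with half of the attribution, would receive 2.5 cents.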
Think of it as “attribution-as-a-service,” says Gross. “Just as Nielsen measures how TV shows are watched to determine what advertisers should pay, we are moderating the output of the queries to determine how much GenAI providers should pay content providers.”
For starters, Gross is launching a GenAI search engine called Gist.ai that only consults the archives of participating publishers. Some 400 publishers have signed on so far, including The Atlantic, Time Magazine, Fortune, The Guardian and Sky News, contributing some 50 million documents. So have book authors such as Adam Grant and Walter Isaacson, as well as Universal Music, since the same technology can be used to attribute credit to images, music and movies.
Expect other types of content providers to follow. In his presentation at DLD, Gross demonstrated how his technology could determine that an image of a masked superhero provided by Meta drew 90.3% of its material from Marvel Comics images and 6.2% from DC Comics.
Gross plans to charge $20 a month for individuals to use the Pro version of Gist.ai, the same rate charged by ChatGPT. The difference, says Gross, is that Gist.ai will only use trusted sources of information and has the buy-in of the content owners, who are ethically compensated on a per-use basis.
“This empowers the long tail,” says Gross. “You don’t have to be a big brand” to take advantage of the service, he says. A growing number of professional journalists are trying to monetize their content, but many have struggled to make a living using sites such as Substack, which, like ProRata, bills itself as a new economic engine for content providers.
Once more publishers join ProRata, it will put pressure on GenAI platforms to share revenue with content providers, says Gross. He hopes to eventually get Microsoft, Amazon and maybe even Google to license ProRata’s technology. “We want to convince the industry that if you want to crawl people’s content you should share,” he says.
Gross says he was shocked to see statistics from the tech company Cloudflare showing that 10 years ago Google crawled two pages for every visitor it sent to a website, but today it crawls six pages for every visitor it sends, making it three times harder for content providers to monetize. OpenAI crawls 250 pages for every visitor it sends and Anthropic crawls 250,000 for every visitor. Why do the AI companies send so few visitors? “They obscure where the content is from so there is almost no reason to go to a site,” says Gross. “I want us to get to a fairer value exchange.”
He hopes a combination of things will help convince GenAI platforms to compensate content providers. If guilt and lawsuits don’t work, and if more and more publishers block their content from being crawled by GenAI platforms over time, the quality of the chatbots’ answers will deteriorate and people will vote with their feet, choosing instead to get answers drawn from the content of trusted publishers, Gross says.
“This may finally be the time to create the perfect information marketplace,” he says.