First Social Media, Now AI. Will News Reporting Survive This Latest Parasite?
Jan 12 2024
Just after Christmas, The New York Times opened a new front in the yearslong war between news publishing companies and Internet platforms that have appropriated their reporting without compensation. The Times sued Microsoft and the AI development company OpenAI, in which Microsoft is heavily invested, for helping themselves without authorization to millions of Times articles to build the large language models that feed the companies' generative-AI engine, OpenAI's ChatGPT.
Hundreds of news outlets have been shuttered during this new century, driven out as Americans turned to social media for their news, and now even to unlikely sources such as Instagram and TikTok. Against that backdrop, AI is seen as a new and even deadlier parasite.
“Defendants seek to free-ride on The Times’s massive investment in its journalism,” the complaint says.
cordoned off
“Times journalism is the work of thousands of journalists, whose employment costs hundreds of millions of dollars per year… Defendants have effectively avoided spending the billions of dollars that The Times invested in creating that work by taking it without permission or compensation.”
In the years of struggle with free-riding social media, readers of their news feeds could at least click through to publisher websites, which could earn revenue through advertising adjacent to articles and by selling direct subscriptions. The AI threat is that instead of sending the reader to a publisher website to read an article, AI draws the publisher's copyrighted work from its vast datasets to write summaries that may suffice for readers. The publisher is shut out.
Since May, Google has segregated 10 million users to test an AI product called "Search Generative Experience", which it openly says it intends to merge into its search engine. Publishers see the legal dilemma of Google responding to inquiries with summary answers that are a mix of content from any number of publications ingested into the huge "language models", which will make it difficult to prove just what was derived from which publication. Google would sell ads against those search results; publishers that supplied the content would get nothing.
Working with data analysis firm SimilarWeb, The Wall Street Journal found that Google generates an average of 40% of the traffic into publisher sites. It further reported that test runs at The Atlantic showed that AI coupled with Google search would satisfy reader queries enough to shut down 75% of that traffic.
The Google vice president involved with AI search says driving traffic to web publishers is a top priority, but nothing in the process just described indicates how that would come about. If it doesn't happen, news and information publishing will atrophy. Newsroom staffs will be reduced still further (200 more buyouts at The Washington Post as this is written), and the breadth and quality of journalism will suffer. It is a snake eating its tail: ever less material will be generated for the language models to scrape up, and Google and others will come up with ever less of merit in their search results.
Seemingly as an answer, Google and other AI developers promote their products as being a boon to newspapers and magazines. Their reporters and writers can use their AI-search engines to efficiently generate story abstracts from material in their leviathan datasets, cutting time and cost. Small local newspapers, no longer able to pay for reporters, are already learning how to let AI write their articles. That might at least free up whoever is left on staff to keep an eye out for local corruption.
fair use?
A spokeswoman at OpenAI said they were "surprised and disappointed" by the Times suit, saying that “Our ongoing conversations with the New York Times have been productive and moving forward constructively.” Those conversations began last spring. Nine months later, still no deal sounds more like big tech refusing to budge from its larcenous practices.
The AI companies based their commandeering of publication work-product on a legal concept called "fair use", which allows anyone to incorporate modest excerpts of another creator's output in their own work. The courts have accordingly thwarted attempts by writers such as book authors to successfully sue over modest excerpts. But the Times complaint shows several examples where OpenAI's and Microsoft's tools sweep up large tracts of Times editorial material verbatim. AI-generated news will be plagiarism on a grand scale.
Microsoft is a $13 billion investor in OpenAI, the creator of ChatGPT. Bing is Microsoft's counterpart to Google. The tech behemoth has incorporated ChatGPT into Bing in a feature called Browse With Bing. Wirecutter is the New York Times' product review site, which makes money when users click through to buy a product that the site recommends.
When the Times ran tests it discovered that Browse With Bing lifted product reviews it had scraped from the Times almost verbatim. The Bing results had no links to Times articles, and even stripped the click-through links, which will deprive the Times of any money from its work. The reader sees information completely divorced from the Times, which receives no compensation. “Decreased traffic to Wirecutter articles and, in turn, decreased traffic to affiliate links subsequently lead to a loss of revenue for Wirecutter,” the Times complaint states.
To the extent that a news feed identifies The Times as a source of articles, the newspaper is alarmed that chatbots will conjure errors, misinformation, and imaginary falsehoods called "hallucinations" that will tarnish its brand, something ChatGPT did produce in the Times' testing.
The lawsuit does not state a dollar amount sought by the Times but cites “billions of dollars in statutory and actual damages” related to the “unlawful copying and use of The Times’s uniquely valuable works.” Conceivably, the company wants to force an ongoing, profitable contractual relationship for the use of its output.
Others have already done so; the Associated Press and Axel Springer, the German publisher that owns Politico and Business Insider, have struck licensing deals with OpenAI. In the meanwhile, The Times asks the court to enjoin the tech companies from using Times content and to erase datasets that contain millions of pieces of that content on which their models were trained.
social nemesis
The mortality of newspapers in this country poses the serious question of where Americans will get in-depth news in the years to come. The Associated Press reports that since 2005 the nation has lost one-third of its newspapers and two-thirds of its journalists. In 2023 an average of 2.5 newspapers closed up shop every week, an increase from two a week the year before.
Social media and other online platforms have been the primary cause of the decline of major newspapers over the last couple of decades. Able to pinpoint those most likely to buy products by their ability to track people's interests as they prowl the Internet, social media is a more effective medium for advertisers than newspapers. To add injury, several online platforms developed news feeds, not by doing any reporting, but by taking for themselves the work of print media and television.
Pew Research says 56% of Americans prefer getting their news on digital devices, and 39% of them go to Facebook. That has left publications to rely on social media for traffic via click-through links. And it means that hundreds of publications need to compete with one another around the clock in that single news feed channel to get the attention of the algorithms that choose the news to get that traffic. Pressing giants such as Facebook and Google for licensing fees has always been problematic. It runs the risk of those outlets downgrading or dropping them.
For years, news executives have criticized major tech companies like Google and Facebook for aggregating and distributing articles on their platforms without shouldering any of the financial burden of gathering the news. There have been payment arrangements, but they have been insufficient. In 2019, Apple launched Apple News+, offering unlimited access to hundreds of publications for $10 a month, but it would distribute only 50% of the revenue among them and turned over none of the customer data it acquired. That same year Facebook unveiled Facebook News, dedicated only to news, unlike its regular news feed, which intermixes news from family and friends, but it would pay only the major publishers its service couldn't do without.
The heat rose in 2020 when the Australian government instructed its Competition and Consumer Commission to force Google and Facebook to negotiate payments to newspaper publishers. France's competition commission ordered Google to do the same. Both Facebook and Google made one-time contributions ($1 billion over three years in Google's case) in the hopes of heading off more costly permanent legislation. When that did not work, Facebook retaliated in Australia by blacking out news on its platform, sinking traffic to news sites.
In the process, it also took down Internet access to hospitals, emergency services, and charities.
The company says that was inadvertent, but a year later whistleblowers said the disruption of Australian services was deliberate and was viewed internally at Facebook as a strategic "masterstroke".
Australia passed the payment requirement into law. Canada in 2022 copied Australia. In August 2023, as in Australia, Canadians woke to find that Facebook had shut down news. It is clear that Mark Zuckerberg, CEO of Facebook, now Meta, thinks all the money should be his and publishers should be grateful for Facebook's acceptance of their material.
In 2022, the Journalism Competition and Preservation Act was introduced in the U.S. Senate. It is meant to help smaller publishers to band together for negotiating with the tech titans for the content they expropriate. Facebook threatened to ban news in the U.S. if the bill is passed rather than “submit to government-mandated negotiations that unfairly disregard the value we provide to news outlets.”
The upshot? Zuckerberg has decided to get out of news. Campbell Brown, head of global media partnerships and a former television news reporter and anchor, made the announcement that the company will switch to what it calls the "creator economy", which seems to be more like what TikTok offers. She left the company.
The U.S. bill? Bipartisan at the start, but Ted Cruz got an amendment passed (a Democratic senator was quarantined with Covid, giving Republicans a one-vote majority) that explicitly prohibits discussion of content moderation in payment negotiations, a move thought to protect conservative news outlets. So there the bill sits.
wholesale piracy
It is not just newspapers that need action from our glacial Congress, whose inaction leaves legislation to the courts. Novelists have discovered that AI large language models have ingested tens of thousands of books. Authors such as John Grisham and Jonathan Franzen have sued, as has Getty Images, targeting a company that uses its content to generate images in response to written requests.
In Silicon Valley, AI is the latest mania; there is no discernible concern for the intellectual property of others. The Times quoted venture capital firm Andreessen Horowitz, an early investor in OpenAI, writing to the U.S. Copyright Office that exposing AI companies to copyright liability would “either kill or significantly hamper their development.”
Excellent description of a looming, or existing, threat to journalism, hence a threat to our democracy. Without an independent and free press, citizens will not have a reliable source of information and social media will continue to impact users negatively and influence our elections.
Unfortunately, Congress has already dropped the ball and the Republican House will continue to focus on trivial or nonexistent issues.