Machine-Made News Goes Mainstream
Future News 160: The first foray into post-ChatGPT research is here
It’s all starting to get a bit synthetic. A large-scale study of machine-generated articles has found that misinformation farms increased their production rates by almost 350% between the start of 2022 and the first quarter of 2023.
The Stanford University scientists behind the findings also claimed that mainstream/reliable news outlets had increased their own synthetic output over the same period by just under 80%.
“We find that while mainstream/reliable news websites have largely utilized synthetic articles to report on financial news, COVID-19 statistics, and sports, misinformation/unreliable news websites have reported on a wide range of topics including US Presidential politics and the Russo-Ukrainian War,” researchers Hans Hanley and Zakir Durumeric said.
Though there is no indication that these named outlets themselves have published machine-generated content, the likes of The Independent, Mother Jones, Politico and FiveThirtyEight are among the 2,252 ‘mainstream’ websites used for the research.
But a closer look at the data set raises further questions about the researchers’ own definition of ‘mainstream’. Two British political titles, the left-leaning Canary and the right-leaning ConservativeWoman, also appear in the list. Others may not deem these publications to be ‘mainstream’.
Another health warning around the paper would be that it has been published on arXiv, a popular open-access repository for academic papers. Though arXiv is moderated, it is not peer-reviewed.
On top of that consideration, there is a persistent question surrounding the labelling of ‘synthetic’ content – do we mean to say the whole article, a majority of the article or a section of the article, such as the aforementioned sports statistics or financial metrics, is machine-generated?
Those issues aside, such research provides a useful, albeit rough, barometer for measuring how much machine-generated content is being produced and how it is being deployed.
As I’ve previously stated, if you take into account the phenomenon of ‘news deserts’, where voters have little or no access to traditional journalism, the impact on the 2024 US and British elections could be substantial.
Bad actors could quickly spin up reliable-looking social media pages or websites and easily populate them with synthetic content, skewed towards a preferred political bias.
Though these influence campaigns are fairly unsophisticated, the Stanford researchers found that social media users, namely on Reddit, interacted more with synthetic articles in March 2023, the same month OpenAI’s GPT-4 was rolled out, than in January 2022.
“We observe a 67.8% increase in the number of synthetic mainstream articles posted to Reddit and a 131% increase in the number of synthetic misinformation news articles,” Hanley and Durumeric said.
“This corresponded with a 281% increase in the number of comments on Reddit submissions featuring a synthetic mainstream article and a 631% increase in the number of comments on Reddit submissions featuring a synthetic misinformation article.”
The research, which is one of the first forays into the impact of generative AI systems on news and misinformation ecosystems, did not explore other social media platforms. But we have seen other related technologies, namely bots, manipulate hashtags and attempt to push articles up SEO rankings.
So although a sophisticated reader may not be convinced by an LLM-generated article, its second- and third-order impacts could still affect them, albeit more subtly.
The paper also does not address how machine-generated content enters the ‘mainstream’ news ecosystem. For this point, I would bring you back to May when The Observer’s Science and Technology Editor Ian Tucker boasted of stopping a freelancer from getting his AI-generated pitch approved. “The Observer remains AI-free,” he claimed.
Here’s what I wrote at the time:
Tucker would do well to read Nick Davies’ Flat Earth News. Davies, who used to write for The Observer’s sister paper The Guardian, warned back in 2008 that 60% of news items in UK national newspapers were wholly or mainly made up of agency copy.
Since newsrooms have shrunk and PA Media and Reuters, to name just two agencies, have been using AI for years now, how sure is Tucker that The Observer is AI-free? Perhaps, as they say in the AI world, it was just a ‘hallucination’?
In other words, journalists may never actually know that the content that they are using is machine-generated. They are AI blind.
Others aren’t so ignorant. Technology tool NewsGuard has warned of a “new generation” of content farms.
“In April 2023, NewsGuard identified 49 websites spanning seven languages — Chinese, Czech, English, French, Portuguese, Tagalog, and Thai — that appear to be entirely or mostly generated by artificial intelligence language models designed to mimic human communication — here in the form of what appear to be typical news websites,” the organisation has said.
Political manipulation doesn’t seem to be a top priority for the websites, which are “saturated” with advertisements. Money, it seems, is the main concern here. But they could still cause an almighty headache for the programmatic ad industry, which is tasked with algorithmically placing ads across the web.
The scientists at Stanford, meanwhile, are continuing to research a variety of approaches to proactively detect and monitor the spread of disinformation, Hanley told me.
As we wait for such a solution, we at least now have a starting point for assessing AI’s direct impact on the news media ecosystem. There are no signs that LLM development will slow down anytime soon, despite some serious protestations.
If anything, the AI race looks like it is ramping up, with Google now claiming it is closing the gap on ChatGPT. The machines are here to stay.
🤔 News I Found Interesting
CNN tries to move on
20 years of WordPress
How to beat social media algos
Ben Thompson’s thoughts on Apple Vision
A long-form interview with Mark Zuckerberg
Everyone could be a bidder for The Telegraph Media Group
Google will finally launch its News Showcase, paying 150 US outlets
The BBC gives its story behind its Andrew Tate interview to the BBC
📖 Essays
How disinformation is forcing a paradigm shift in media theory
Operation Southside: Inside the UK media’s plan to reconcile with Labour
📧 Contact
For high praise, tips or gripes, please contact the editor at iansilvera@gmail.com or via @ianjsilvera. Follow on LinkedIn here.