General AI takes a pause, for now

Future News 168: Is machine-learnt ‘emergent behaviour’ a myth?

Oct 23, 2023

Doubt surrounds the expected capabilities of large language models (LLMs) and the goal of building an artificial general intelligence (AGI) system, at least for now.

Researchers from Arizona State University, The University of Illinois and Google’s DeepMind have concluded across two separate studies that LLMs can’t self-critique their outputs when tasked with certain reasoning goals.

The findings, published over the past weekend and earlier in the month, challenge the thinking that LLMs could improve their own answers iteratively, without a human or another computer-powered system intervening.

The phenomenon had previously been chalked up to ‘emergent behaviour’ from LLMs, but the researchers looked more closely at the issue.

“Our results on graph coloring call these claims into question,” the team at Arizona State University said. “They show that LLMs are in fact very poor at verifying solutions (in our case, colorings), something that is critical for self-critiquing.”
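For context, checking a graph colouring is mechanically simple: a colouring is valid if no two connected vertices share a colour. The sketch below is a minimal illustration of that check, not the researchers' code; the function name, the edge-list format and the triangle example are all assumptions made here.

```python
def is_valid_coloring(edges, coloring):
    """Return True if every edge joins two coloured vertices of different colours."""
    for u, v in edges:
        # An uncoloured vertex means the solution is incomplete.
        if u not in coloring or v not in coloring:
            return False
        # Neighbouring vertices must not share a colour.
        if coloring[u] == coloring[v]:
            return False
    return True


# Hypothetical example: a triangle, where every vertex touches the other two.
triangle = [("a", "b"), ("b", "c"), ("a", "c")]
good = {"a": "red", "b": "green", "c": "blue"}
bad = {"a": "red", "b": "green", "c": "red"}  # "a" and "c" clash
```

Here `is_valid_coloring(triangle, good)` passes and `is_valid_coloring(triangle, bad)` fails. It is this kind of verification step that the Arizona State team reports LLMs perform poorly.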

These sentiments complemented the findings from Google DeepMind and the University of Illinois, which stated: “Our research shows that LLMs are not yet capable of self-correcting their reasoning. This implies that expecting these models to inherently recognize and rectify their inaccuracies might be overly optimistic, at least with the current state of technology.”

In diplomatic language, they went on to pour cold water over the notion that LLMs could evolve autonomously, urging a “more circumspect view” on the issue and encouraging more research into reasoning techniques.

The findings come as global, transnational and national bodies seek to regulate and legislate around AI safety, with the EU introducing an AI Act, the US seeking to “bridge” towards AI regulation and the UK avoiding any new AI-related legislation, allowing businesses to be regulated by existing watchdogs at a sector level.

But the research should be treated with some caution: arXiv, where the papers are published, does not peer-review studies before publication. Submissions are moderated instead.

Also, the papers do not dispel the notion that LLMs are good at self-critiquing with external feedback; in many ways they are. And then there is Geoff Hinton, the champion and “godfather” of neural-network technology.

Hinton recently appeared on CBS’ flagship current affairs show, 60 Minutes, to warn that generative AI could overtake humanity. “One of the ways these systems might escape control is by writing their own computer code to modify themselves. And that’s something we need to seriously worry about,” he said.

(2) It’s the GOTV, stupid. It’s always so surprising how little attention political hacks pay to party organisers and canvassing techniques when it comes to elections. Some get-out-the-vote (GOTV) tactics have been shown to increase turnout by 10% or more in the US, while similar experiments in the UK, focusing on leaflet drops and door-to-door canvassing, suggest a hike of ~5%. This might partly explain the Conservatives’ walloping by Labour at the Tamworth and Mid Bedfordshire by-elections. Rob Ford provides a comprehensive breakdown of the results on his Substack.

(3) The Lads in America. London-listed LADbible Group (LBG) made its first material acquisition as a PLC, buying Betches, a US-based digital content production firm, last week. The terms of the deal include a four-year earn-out period, with a maximum consideration of $54m. The initial consideration payment is $24m, paid from LBG’s cash reserves. I previously wrote on LBG’s North American expansion plans here.

(4) Big tech earnings. Microsoft, Alphabet, Spotify and Snapchat will all be reporting later today, with Meta providing its own financial update on Wednesday and Amazon reporting on Thursday. The results will come as the US and the world’s equity markets face choppy waters. Bond yields have surged amid stubborn inflation and geopolitical uncertainty. The CBOE Volatility Index, otherwise known as the US market’s ‘fear gauge’, is around a six-month high.

(5) PPV is the winner. The recent KSI vs. Tommy Fury boxing match, with WWE star Logan Paul fighting in the co-main event, apparently secured 1.3 million buys. This would make the show one of the biggest commercial successes for boxing in the past decade (as long as costs weren’t exorbitant). It also boosts sports streamer DAZN, which went through a $4.3 billion recapitalisation last year, whilst giving a headache to more established promotional outfits in the fighting world. But overall, the PPV format comes out a winner. Billed as the ‘Battle of the Baddest’, Tyson Fury vs. Francis Ngannou will be the next crossover boxing event on DAZN (plus ESPN+ and TNT Sports) this Saturday.

📧 Contact

For high praise, tips or gripes, please contact the editor at iansilvera@gmail.com or via @ianjsilvera. Follow on LinkedIn here.

FN 167 can be found here
FN 166 can be found here
FN 165 can be found here
FN 164 can be found here
FN 163 can be found here
FN 162 can be found here

Tech, Power & Media
