Can You Solve Connections Better Than a Computer?

Artificial intelligence might already be able to beat us at chess, but how does it fare when it's faced with language-based puzzles?

In a bid to find out, a group of researchers from New York University's Tandon School of Engineering challenged a variety of modern natural language processing (NLP) systems to solve The New York Times' daily puzzle, Connections.

Connections is a word game in which players have five attempts to group 16 words into four groups of four based on their connections. The groupings range from "simple" (connected through straightforward definitions) to "tricky" (connected through abstract word associations that require more critical thinking).

"Large Language Models are becoming increasingly widespread, and investigating where they fail in the context of the Connections puzzle can reveal limitations in how they process semantic information," Graham Todd, a PhD student in the NYU Game Innovation Lab, said in a statement.

[Photo Illustration by Newsweek: Researchers challenged AI programs to solve The New York Times' Connections puzzle in a new study.]

In a recent study, Todd and colleagues at the NYU Game Innovation Lab, including its director, Julian Togelius, set up an experiment to explore how two different artificial intelligence approaches fared in a game of Connections. The first approach leveraged OpenAI's familiar ChatGPT (specifically GPT-3.5 and the recently released GPT-4); the second used sentence embedding models, like BERT and MiniLM, which lack the full language understanding capabilities of large language models.
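The study itself isn't reproduced here, but as a rough illustration of the embedding-based approach, the sketch below greedily partitions puzzle words into groups of four by cosine similarity between their vectors. The toy two-dimensional vectors are stand-ins for real sentence embeddings from a model like BERT or MiniLM, and the greedy grouping is an assumption for illustration, not the researchers' actual pipeline.

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def group_words(embeddings, group_size=4):
    """Greedily partition words into groups of `group_size` by similarity.

    Each group is seeded with an arbitrary remaining word, then filled
    with the words whose embeddings are most similar to the seed's.
    """
    remaining = dict(embeddings)
    groups = []
    while remaining:
        seed, seed_vec = next(iter(remaining.items()))
        del remaining[seed]
        ranked = sorted(remaining,
                        key=lambda w: cosine(seed_vec, remaining[w]),
                        reverse=True)
        members = [seed] + ranked[:group_size - 1]
        for w in members[1:]:
            del remaining[w]
        groups.append(members)
    return groups

# Toy stand-in vectors; a real attempt would use BERT or MiniLM embeddings
# of all 16 puzzle words.
toy = {
    "apple": [1.0, 0.1], "pear": [0.9, 0.2], "plum": [0.95, 0.15], "grape": [0.85, 0.1],
    "red": [0.1, 1.0], "blue": [0.2, 0.9], "green": [0.15, 0.95], "pink": [0.1, 0.85],
}

print(group_words(toy))
```

With these toy vectors, the fruit words cluster together and the color words cluster together, which is the behavior an embedding-based solver relies on; the hard cases in Connections are precisely the "tricky" categories where surface similarity misleads.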

All in all, the AI struggled to complete the Connections puzzles. GPT-4, the most successful of the models, solved only about 29 percent of the puzzles and, like human players, struggled most with the "tricky" associations. When GPT-4 was given step-by-step prompts to guide its reasoning through the puzzle, its performance improved slightly, to 39 percent.

But the AI was still far from mastering the game.

"Our research confirms prior work showing this sort of 'chain-of-thought' prompting can make language models think in more structured ways," Timothy Merino, another PhD student at the Game Innovation Lab who was a co-author of the study, said in a statement. "Asking the language models to reason about the tasks that they're accomplishing helps them perform better."
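The study's exact prompts are not quoted in this article, but a chain-of-thought style prompt for Connections might look something like the following hypothetical sketch, which asks the model to brainstorm categories before committing to groups. The `connections_prompt` helper and the demo words are illustrative inventions, not taken from the study or from a real puzzle.

```python
def connections_prompt(words):
    """Build a chain-of-thought style prompt for a Connections puzzle."""
    assert len(words) == 16, "Connections puzzles have exactly 16 words"
    return (
        "Here are 16 words from a Connections puzzle:\n"
        + ", ".join(words)
        + "\n\nThink step by step. First, brainstorm possible categories the "
        "words might belong to. Then assign each word to the category it fits "
        "best, keeping in mind that every group must contain exactly four "
        "words. Finally, output the four groups of four."
    )

# Hypothetical example words (not from an actual puzzle).
demo = ["BASS", "SOLE", "PIKE", "PERCH",
        "DRUM", "HORN", "HARP", "ORGAN",
        "MARS", "VENUS", "PLUTO", "COMET",
        "DASH", "COLON", "PERIOD", "SLASH"]

print(connections_prompt(demo))
```

The key feature is the explicit "think step by step" instruction: rather than asking for groups directly, the prompt forces the model to externalize intermediate reasoning, which is what the researchers found nudged GPT-4's performance upward.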

Studies like this help researchers benchmark AI capabilities. The team may also explore the potential for models like GPT-4 to be used in generating novel word puzzles in the future.

The results of the study will be presented at the Institute of Electrical and Electronics Engineers' 2024 Conference on Games in Milan, Italy, from August 5 to 8.

About the writer


Pandora Dewan is a Senior Science Reporter at Newsweek based in London, UK.
