Can You Solve Connections Better Than a Computer?

Artificial intelligence might already be able to beat us at chess, but how does it fare when it's faced with language-based puzzles?

In a bid to find out, a group of researchers from New York University's Tandon School of Engineering challenged a variety of modern natural language processing (NLP) systems to solve The New York Times' daily puzzle, Connections.

Connections is a word game in which players have five attempts to group 16 words into four groups of four based on their connections. The groupings range from "simple" (connected through straightforward definitions) to "tricky" (connected through abstract word associations that require more critical thinking).

"Large Language Models are becoming increasingly widespread, and investigating where they fail in the context of the Connections puzzle can reveal limitations in how they process semantic information," Graham Todd, a PhD student in the NYU Game Innovation Lab, said in a statement.

[Photo Illustration by Newsweek: Researchers challenged AI programs to solve The New York Times' Connections puzzle in a new study.]

In a recent study, Todd and colleagues at the NYU Game Innovation Lab, including its director, Julian Togelius, set up an experiment to explore how two different artificial intelligence approaches fared in a game of Connections. The first approach leveraged OpenAI's familiar ChatGPT (specifically GPT-3.5 and the recently released GPT-4); the second used sentence embedding models, like BERT and MiniLM, which lack the full language understanding capabilities of large language models.
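The study itself isn't reproduced here, but as a rough illustration of the embedding-based approach, the sketch below greedily partitions puzzle words into groups of four by cosine similarity between their vectors. The toy two-dimensional vectors are stand-ins for real sentence embeddings from a model like BERT or MiniLM, and the greedy grouping is an assumption for illustration, not the researchers' actual pipeline.

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def group_words(embeddings, group_size=4):
    """Greedily partition words into groups of `group_size` by similarity.

    Each group is seeded with an arbitrary remaining word, then filled
    with the words whose embeddings are most similar to the seed's.
    """
    remaining = dict(embeddings)
    groups = []
    while remaining:
        seed, seed_vec = next(iter(remaining.items()))
        del remaining[seed]
        ranked = sorted(remaining,
                        key=lambda w: cosine(seed_vec, remaining[w]),
                        reverse=True)
        members = [seed] + ranked[:group_size - 1]
        for w in members[1:]:
            del remaining[w]
        groups.append(members)
    return groups

# Toy stand-in vectors; a real attempt would use BERT or MiniLM embeddings
# of all 16 puzzle words.
toy = {
    "apple": [1.0, 0.1], "pear": [0.9, 0.2], "plum": [0.95, 0.15], "grape": [0.85, 0.1],
    "red": [0.1, 1.0], "blue": [0.2, 0.9], "green": [0.15, 0.95], "pink": [0.1, 0.85],
}

print(group_words(toy))
```

With these toy vectors, the fruit words cluster together and the color words cluster together, which is the behavior an embedding-based solver relies on; the hard cases in Connections are precisely the "tricky" categories where surface similarity misleads.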

All in all, the AI struggled to complete the Connections puzzles. GPT-4, the most successful of the models, solved only about 29 percent of the puzzles and, like human players, struggled most with the "tricky" associations. When GPT-4 was given step-by-step prompts to guide its reasoning through the puzzle, its performance improved slightly, to 39 percent.

But the AI was still far from mastering the game.

"Our research confirms prior work showing this sort of 'chain-of-thought' prompting can make language models think in more structured ways," Timothy Merino, another PhD student at the Game Innovation Lab who was a co-author of the study, said in a statement. "Asking the language models to reason about the tasks that they're accomplishing helps them perform better."
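The study's exact prompts are not quoted in this article, but a chain-of-thought style prompt for Connections might look something like the following hypothetical sketch, which asks the model to brainstorm categories before committing to groups. The `connections_prompt` helper and the demo words are illustrative inventions, not taken from the study or from a real puzzle.

```python
def connections_prompt(words):
    """Build a chain-of-thought style prompt for a Connections puzzle."""
    assert len(words) == 16, "Connections puzzles have exactly 16 words"
    return (
        "Here are 16 words from a Connections puzzle:\n"
        + ", ".join(words)
        + "\n\nThink step by step. First, brainstorm possible categories the "
        "words might belong to. Then assign each word to the category it fits "
        "best, keeping in mind that every group must contain exactly four "
        "words. Finally, output the four groups of four."
    )

# Hypothetical example words (not from an actual puzzle).
demo = ["BASS", "SOLE", "PIKE", "PERCH",
        "DRUM", "HORN", "HARP", "ORGAN",
        "MARS", "VENUS", "PLUTO", "COMET",
        "DASH", "COLON", "PERIOD", "SLASH"]

print(connections_prompt(demo))
```

The key feature is the explicit "think step by step" instruction: rather than asking for groups directly, the prompt forces the model to externalize intermediate reasoning, which is what the researchers found nudged GPT-4's performance upward.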

Studies like this help researchers benchmark AI capabilities. The team may also explore the potential for models like GPT-4 to be used in generating novel word puzzles in the future.

The results of the study will be presented at the Institute of Electrical and Electronics Engineers' 2024 Conference on Games in Milan, Italy, from August 5 to 8.

About the writer


Pandora Dewan is a Senior Science Reporter at Newsweek based in London, UK.
