Researchers develop new method to detect when AI is hallucinating, according to a new study

Researchers looked at a method that can predict when an artificial intelligence (AI) text model is likely to “hallucinate”, according to a new study. ©Canva

A method developed by a team of Oxford researchers could detect when AI models are producing “confabulations”, a particular type of hallucination or inaccurate answer, according to a new study.

As the hype around generative artificial intelligence (genAI) continues, criticism of AI models’ hallucinations has grown. These are plausible-sounding but false outputs from large language models (LLMs) like OpenAI’s GPT or Anthropic’s Claude.

These hallucinations could be especially problematic in fields such as medicine, news, or law.

“‘Hallucination’ is a very broad category that can mean almost any kind of a large language model being incorrect. We want to focus on cases where the LLM is wrong for no reason (as opposed to being wrong because, for example, it was trained with bad data),” Dr Sebastian Farquhar, from the University of Oxford’s Department of Computer Science, told Euronews Next.

“With previous approaches, it wasn’t possible to tell the difference between a model being uncertain about what to say versus being uncertain about how to say it. But our new method overcomes this,” he added in a statement about the study published today in the journal Nature.

‘Getting answers from LLMs is cheap, but reliability is the biggest bottleneck’

The method works by measuring the uncertainty, or variability, in the meaning of a model’s outputs, a quantity the researchers call semantic entropy.

It looks at the uncertainty in the meanings of the responses rather than just the sequence of words.

For instance, if a language model is asked a question and it generates several possible responses, semantic entropy would measure how different these answers are in terms of their meaning.

The entropy is low if the meanings are very similar, indicating high confidence in the intended sense. If the meanings are very different, the entropy is high, indicating uncertainty about the correct meaning.
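As a rough illustration of the idea (not the study’s own code), the sketch below computes the entropy of how several sampled answers are spread across groups that share a meaning; the cluster sizes and the use of plain Shannon entropy over cluster proportions are simplifying assumptions.

```python
# Illustrative sketch only: entropy over groups of answers that share a meaning.
# The cluster sizes below are made up for the example.
import math

def entropy(cluster_sizes):
    """Shannon entropy of how sampled answers are distributed across meaning clusters."""
    total = sum(cluster_sizes)
    probs = [size / total for size in cluster_sizes]
    return -sum(p * math.log(p) for p in probs if p > 0)

# Five sampled answers that all mean the same thing: one cluster, entropy 0
print(entropy([5]))           # 0.0 -> the model is confident about the meaning

# Five answers split across four different meanings: entropy is high
print(entropy([2, 1, 1, 1]))  # ~1.33 -> the model is uncertain, a likely confabulation
```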

“When an LLM generates an answer to a question you get it to answer several times. Then you compare the different answers with each other,” Farquhar said.

“In the past, people had not corrected for the fact that in natural language there are many different ways to say the same thing. This is different from many other machine learning situations where the model outputs are unambiguous,” he added.
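The comparison step Farquhar describes could look roughly like the sketch below, which greedily groups sampled answers that express the same meaning. The `means_the_same` check is a hypothetical placeholder: in practice, some model-based test of whether two answers say the same thing would stand in for it.

```python
# Hedged sketch of the "compare the different answers with each other" step.
def means_the_same(a: str, b: str) -> bool:
    # Hypothetical placeholder for a semantic-equivalence check
    # (e.g. asking a model whether each answer follows from the other).
    raise NotImplementedError

def cluster_by_meaning(answers: list[str]) -> list[list[str]]:
    """Greedily group sampled answers that express the same meaning."""
    clusters: list[list[str]] = []
    for answer in answers:
        for cluster in clusters:
            if means_the_same(answer, cluster[0]):
                cluster.append(answer)
                break
        else:
            clusters.append([answer])  # no existing group matched, start a new one
    return clusters

# The sizes of these clusters feed the entropy calculation above:
# entropy([len(c) for c in cluster_by_meaning(answers)])
```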

When tested on six LLMs (including GPT-4 and LLaMA 2), the new method was better than previous approaches at spotting which questions, ranging from queries drawn from Google searches to technical biomedical questions and mathematical problems, were likely to produce false answers.

However, the method requires more computing resources than simple text generation.

“Getting answers from LLMs is cheap, but reliability is the biggest bottleneck. In situations where reliability matters, computing semantic uncertainty is a small price to pay,” said Professor Yarin Gal, the study’s senior author.

Hallucinations are one of the main criticisms levelled against LLMs. Google recently scaled back its new AI Overview feature after facing backlash over misleading answers.

© Euronews