How SRI research could de-hallucinate AI

By George Lawton, January 4, 2024
Summary:
New SRI research explores a new approach to reducing AI’s propensity to hallucinate.


SRI International (formerly Stanford Research Institute) is doing early work on the mechanics of Large Language Models (LLMs), testing their abilities across multiple levels of abstraction. The non-profit scientific research institute pioneered many of the fundamental innovations behind the graphical user interface, the internet, Apple’s Siri, and inkjet printers. The current research opens a promising way to tackle hallucinations at a more fundamental level than existing approaches.

It mirrors the way humans learn by starting with basic facts and then thinking about what they mean in different contexts. It's early work, but it might one day find its way into development and testing tools across the industry. The researchers are also exploring how AIs optimized for different types of processes could help a base model programmatically improve over time.

SRI researchers working on the new technique include Ajay Divakaran, Michael Cogswell, Pritish Sahu, Yunye Gong, and Karan Sikka. They take a more nuanced view of what a hallucination means in practice when it comes to building more trustworthy AIs. Divakaran explains:

Some scholars argue that everything that an LLM produces is in fact a hallucination since it is relying on patterns that it has learned from data. Some hallucinations accord with our common sense and thus do not come across as hallucinations. We don’t take that view. In our view a hallucination is when the large model generates content that is either not grounded in fact and/or not logically sound (consistent).

From this perspective, de-hallucinating a chatbot means having it produce coherent, factually grounded, and logically sound content so that the conversation can proceed as it would with a human being. For example, the researchers are currently working with a chatbot implementation of motivational interviewing. They found that state-of-the-art LLMs kept asking the same question repeatedly, which is sure to put off the human user.

They believe that it is important to focus on the process of building trust with human users. Divakaran says:

True communication can happen only if somehow trust is won. Trust is typically won by display of understanding of context and the human’s responses.

Automating the process

Many existing techniques to reduce AI hallucinations test responses against human judgment using Reinforcement Learning from Human Feedback (RLHF). This is a somewhat manual process in which humans provide corrective input and the models are retrained. The SRI research is exploring how to automate this with Reinforcement Learning from AI Feedback (RLAIF), which replaces the humans with AIs that excel at answering questions at different levels of abstraction.

This kind of automation could make it easier to scale up the testing process while still delivering better results. 
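To make the idea of AI feedback concrete, here is a minimal Python sketch of how an automated critic could stand in for a human rater when building preference data. Every name in it (`toy_critic`, `label_preference`, the repetition heuristic) is hypothetical and chosen for illustration. It is not SRI's method, and a real system would use a trained model as the critic rather than a hand-written rule.

```python
def toy_critic(response: str) -> int:
    """Hypothetical stand-in for an AI critic: penalize repeated questions.

    Returns 0 when every question is distinct, otherwise minus the
    number of repeats.
    """
    questions = [q.strip().lower() for q in response.split("?") if q.strip()]
    return -(len(questions) - len(set(questions)))


def label_preference(prompt: str, response_a: str, response_b: str) -> dict:
    """Build one preference record, with the critic replacing the human rater."""
    if toy_critic(response_a) >= toy_critic(response_b):
        chosen, rejected = response_a, response_b
    else:
        chosen, rejected = response_b, response_a
    return {"prompt": prompt, "chosen": chosen, "rejected": rejected}


# Made-up exchange in the spirit of a motivational-interviewing chatbot.
record = label_preference(
    prompt="I want to exercise more but I never find the time.",
    response_a="What gets in the way? What gets in the way?",
    response_b="What gets in the way? What would a small first step look like?",
)
```

Records in this chosen/rejected form are a common input to preference-based fine-tuning. The only change from the human-feedback setup in this sketch is who produces the label.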

A key aspect of this is the development of semantic consistency checks, which ensure the consistency of attributes in a data model. In an enterprise context, this could characterize how well similar data objects share consistent names and meanings. In the question-answering context, it could describe whether the same things are characterized the same way across various chains of reasoning.

Additionally, the researchers are exploring how the same technique could be applied to visual question-answering (VQA) programs that might answer questions about pictures or generate images in response to a prompt. They found that existing models sometimes generated images containing people even when specifically asked to leave them out, because the model had never seen pictures of a food truck with no people around it.

The researchers are also working on techniques that mimic how humans learn to understand the world. There is vast disagreement about whether statistical approaches to AI can cultivate a human-like understanding. However, there is an opportunity to teach LLMs not only to spit out plausible text but also to explain the chain of reasoning involved. Divakaran explains:

Our assertion is that we should use techniques inspired by human understanding because even if the large model produces correct answers it should have the ability to explain how it got those answers in a human intelligible fashion. At the moment, these models are trained omnivorously on various content, and there is a hope that they line up with understanding.

Our latest results show that while these models do well at the lower levels of Bloom’s taxonomy of skills, they do not do as well at the higher levels since that requires logical extension and composition of the kind that they cannot always memorize as easily. Note that we human beings have never quite defined what understanding is, and now these models have come along and challenged whatever notions we have had. Notice how I said that ‘they cannot always memorize.’ Which means that often even at the higher reaches the models are able to ‘cheat’ by just relying on a guess based on previously seen patterns.

Learning like a human

The current research draws on studies of human learning, such as Bloom’s Taxonomy, developed by Benjamin Bloom in the mid-1950s to classify learning objectives by level of complexity and specificity. In the cognitive domain, the levels are remembering, understanding, applying, analyzing, evaluating, and creating. The researchers collected data to test models at each of these levels and on multi-level comprehension. Here is what each level involves:

  • Remember: Recall facts and basic concepts.
  • Understand: Explain ideas or concepts.
  • Apply: Use the information in new situations.
  • Analyze: Draw connections among ideas.
  • Evaluate: Justify a stand or conclusion.
  • Create: Produce new or original work. 

In this early research, they focused on improving results on the analysis of picture books designed to teach young children reading comprehension. They then developed question-and-answer data sets to test responses across multiple levels.
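As a rough illustration of what testing across levels could look like, the Python sketch below tags each graded answer with a Bloom level and reports accuracy per level. The names (`BloomLevel`, `accuracy_by_level`) and the sample results are made up for this example and do not come from the SRI data sets.

```python
from collections import defaultdict
from enum import IntEnum


class BloomLevel(IntEnum):
    """The six cognitive levels, ordered from lowest to highest."""
    REMEMBER = 1
    UNDERSTAND = 2
    APPLY = 3
    ANALYZE = 4
    EVALUATE = 5
    CREATE = 6


def accuracy_by_level(results):
    """Compute accuracy per level from (BloomLevel, is_correct) pairs."""
    totals = defaultdict(int)
    correct = defaultdict(int)
    for level, is_correct in results:
        totals[level] += 1
        correct[level] += int(is_correct)
    return {level: correct[level] / totals[level] for level in sorted(totals)}


# Made-up graded answers, not real evaluation results.
sample_results = [
    (BloomLevel.REMEMBER, True),
    (BloomLevel.REMEMBER, True),
    (BloomLevel.ANALYZE, True),
    (BloomLevel.ANALYZE, False),
    (BloomLevel.EVALUATE, False),
]
level_scores = accuracy_by_level(sample_results)
```

Breaking the score out by level, rather than reporting one overall number, is what would expose a model that recalls facts well but struggles with analysis or evaluation.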

Graphs connect the dots

This approach helped the team develop a story graph that characterizes relationships between people, places, and events across different levels of abstraction. Researchers had previously conceived of a related concept, the scene graph, for describing the geometric relationships between objects in a picture. The story graph model extends this core idea to cause-effect, temporal, and logical relationships. A better graph model for representing data can help automate the process of augmenting human-curated data sets with new questions and answers.
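The Python sketch below shows one way such a graph could be represented and used to produce extra question-answer pairs from templates. It is an assumption-laden toy: the `StoryGraph` and `Edge` classes, the relation kinds, the templates, and the picture-book events are all invented for illustration and are not the researchers' implementation.

```python
from dataclasses import dataclass, field

# Hypothetical relation kinds: "spatial" for scene-graph-style edges,
# plus the cause-effect, temporal, and logical kinds a story graph adds.
RELATION_KINDS = {"spatial", "temporal", "causal", "logical"}


@dataclass(frozen=True)
class Edge:
    source: str
    label: str
    target: str
    kind: str


@dataclass
class StoryGraph:
    edges: list = field(default_factory=list)

    def add(self, source: str, label: str, target: str, kind: str) -> None:
        if kind not in RELATION_KINDS:
            raise ValueError(f"unknown relation kind: {kind}")
        self.edges.append(Edge(source, label, target, kind))

    def generate_qa(self) -> list:
        """Turn causal and temporal edges into templated (question, answer) pairs."""
        templates = {
            "causal": "What happened because {source}?",
            "temporal": "What happened after {source}?",
        }
        return [
            (templates[e.kind].format(source=e.source), e.target)
            for e in self.edges
            if e.kind in templates
        ]


# Made-up picture-book events.
story = StoryGraph()
story.add("the fox", "stands beside", "the henhouse", "spatial")
story.add("the fox steals the eggs", "causes", "the hen chases the fox", "causal")
story.add("the hen chases the fox", "before", "the farmer wakes up", "temporal")
qa_pairs = story.generate_qa()
```

Because each edge carries a relation kind, the same graph can yield questions at different levels: spatial edges support recall-style questions about a picture, while causal edges support questions about why something happened.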

They also applied this new story graph model to automate the process of training a new Large Vision Language Model (LVLM) called DRESS. Early research found it could generate more helpful, honest, and harmless responses and more effectively learn from feedback compared to existing approaches. 

Here is an example of how this might work in practice. If an LLM hallucinated that San Francisco is 50 miles from New York City, that would be blatantly wrong. However, developers could then ask additional questions related to the first one and test the consistency of the responses, to assess how sound the reasoning is and how grounded it is in the details. For example, the LLM might go on to suggest that you could travel there in an hour or even take a taxi.
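The distinction between grounding and consistency in this example can be sketched in a few lines of Python. The function name, the tolerance, the speed limit, and the reference distance are all assumptions made for illustration; the reference is only an approximate straight-line figure.

```python
def check_travel_claims(claimed_miles, claimed_hours, reference_miles,
                        tolerance=0.1, max_road_speed_mph=80.0):
    """Return (grounded, consistent) for a distance claim and a follow-up claim.

    grounded:   the claimed distance is within `tolerance` of the reference.
    consistent: the follow-up travel time implies a plausible road speed,
                meaning the two answers agree with each other.
    """
    grounded = abs(claimed_miles - reference_miles) <= tolerance * reference_miles
    implied_speed_mph = claimed_miles / claimed_hours
    consistent = implied_speed_mph <= max_road_speed_mph
    return grounded, consistent


# Approximate straight-line distance, used only as an illustrative reference.
SF_TO_NYC_MILES = 2570

grounded, consistent = check_travel_claims(
    claimed_miles=50, claimed_hours=1, reference_miles=SF_TO_NYC_MILES
)
```

In this case the follow-up agrees with the original claim, yet the claim itself is ungrounded, so a check that looked only at consistency would miss the error.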

Conversely, the LLM might give a factually correct response backed by completely wrong reasoning. For example, it might say that milk consumption in the US increased during the pandemic. But if a follow-up response says the increased milk consumption caused the pandemic, that is unsound reasoning.
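One simple soundness test for a causal claim like this is that a cause cannot come after its effect. The Python sketch below applies that test to the milk example. The function name and the timeline encoding are hypothetical, and temporal order is only a necessary condition for causation, not a sufficient one.

```python
def causal_claim_is_plausible(cause, effect, timeline):
    """Reject a causal claim whose cause comes after its effect.

    `timeline` maps each event to its position in a known ordering.
    """
    return timeline[cause] < timeline[effect]


# Ordering taken from the example: consumption rose during the pandemic,
# so the pandemic was already under way when the rise happened.
timeline = {"pandemic": 0, "milk consumption increases": 1}

sound = causal_claim_is_plausible("pandemic", "milk consumption increases", timeline)
reversed_claim = causal_claim_is_plausible("milk consumption increases", "pandemic", timeline)
```

A claim that passes this test may still be wrong, so a check like this would only filter out the most obviously unsound chains of reasoning.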

Divakaran said the current challenges are around developing and testing their ideas at scale. They have developed lightweight techniques that reduce the need for heavy computation. Down the road, they plan to flesh out their ideas on semantic consistency, automatically augmenting training data, and RLAIF to develop lightweight add-on technologies that others could bolt onto off-the-shelf models. They also want to extend these ideas into content generation rather than just content analysis. 

He says:

In fact we want to develop technology that democratizes large models so that anyone can use them reliably.

My take

There are a few things that stand out for me. First, most of the existing research has focused on whether AI hallucinates or not. However, there can be vast differences in the rate of hallucination depending on whether you are seeking to retrieve facts, explain concepts, or draw a connection between related data. Techniques like retrieval-augmented generation (RAG), which combine traditional search engines and LLMs, can do a better job of helping LLMs get the facts right. However, other tools and processes for refining results across multiple levels will be required for further progress.

Second, the next big wave of generative AI innovation will be around multi-modal AI that can answer questions across different types of data. Initially, this will lie in combining text and images. But increasingly, enterprises will want to use these tools to make sense of information from across ERP, CRM, IoT, and streaming data sources. This will require further research to ensure these systems provide front-line employees and customers with helpful, accurate, and safe results. 

Third, this will require new graph data structures to highlight the connections across different contexts and types of questions. Developing a framework for understanding stories seems like a good start. However, extending these to represent the different levels of understanding of experts like doctors, engineers, and supply chain analysts when considering different types of data will take a bit more work. Perhaps in the near future, this could lead to innovations in medical graphs, product lifecycle graphs, and supply chain graphs to reduce hallucinations across these other domains. 
