How to Reduce AI Chatbot Hallucinations -- Journal Report

Dow Jones -- 01-31

By Bart Ziegler

You can't stop an AI chatbot from sometimes hallucinating -- giving misleading or mistaken answers to a prompt, or even making things up. But there are some things you can do to limit the amount of faulty information a chatbot gives you in response to your request.

AI hallucinations arise from a couple of things, says Matt Kropp, chief technology officer at BCG X, a unit of Boston Consulting Group. One is that the data on which an AI chatbot was trained contained conflicting, incorrect or incomplete information about the subject you're asking about. You can't do anything about that. The second is that "you haven't specified enough of what you want," Kropp says -- and that is something you can address.

Below are some techniques that experts say can minimize -- though not eliminate -- hallucinations.

Give the AI detailed instructions

Tell the AI exactly what you are seeking. If your prompt gives it too much freedom to root around its database, it's more likely to respond with erroneous or fabricated information.

"You want detailed instructions, you want precise language, but you also have to make sure that it is concise in that everything in that prompt is directly relevant to the query," says Darin Stewart, an analyst at technology advisory firm Gartner.

When shopping for a car recently I asked an AI for help. It gave me useful comparisons of the size, price and features of SUVs I was interested in. But the miles-per-gallon rating for one of the cars seemed awfully high. Digging a little deeper, I figured out the AI gave me the rating for a diesel-engine vehicle -- a version not sold in the U.S.

Another AI told me about features of an SUV that differed from what I read on the carmaker's website -- the chatbot had based its answer on a model from a number of years ago, not the 2025 one.

To prevent these mistakes, I should have told the AIs I wanted information confined to U.S.-model vehicles and ones currently on the market.
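If you reach a chatbot through its programming interface rather than a chat window, the same advice applies to the prompt string you send. Here is a minimal Python sketch using OpenAI's chat-completions API; it assumes the openai package is installed and an OPENAI_API_KEY environment variable is set, and the model name, car and prompt wording are illustrative assumptions, not recommendations.

    from openai import OpenAI  # assumes the openai package is installed

    client = OpenAI()  # reads the OPENAI_API_KEY environment variable

    # Too much freedom: the model may mix in foreign-market or older-model data.
    vague_prompt = "Tell me about the gas mileage of the Honda CR-V."

    # Detailed, precise and concise: market, model year and the exact figure wanted.
    detailed_prompt = (
        "What is the EPA combined miles-per-gallon rating of the 2025 Honda CR-V "
        "as sold in the U.S.? Limit your answer to U.S.-market versions currently "
        "on sale, and say so explicitly if you are unsure."
    )

    response = client.chat.completions.create(
        model="gpt-4o",  # illustrative model name
        messages=[{"role": "user", "content": detailed_prompt}],
    )
    print(response.choices[0].message.content)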

Structure your query in steps

Experts say you should construct your query in the form of small, direct questions instead of a single, open-ended one. Ask these questions one after the other, a process called iterative prompting. This can keep the AI from generating falsehoods as well as produce more-useful results.

"I think about my interactions with the [AI] models not as a one-shot question and answer, but rather as a dialogue," says Kropp. "You're building up context."

If you're in the market, say, for a new dishwasher, don't simply ask, "What dishwasher should I buy?" Instead, start the prompt this way: "I need a new dishwasher. What are the major features I should consider?"

After it answers this question, you could respond with questions such as: "Which brands are known for reliability and which should I avoid? How much should I spend? Are higher-priced models worth the extra cost?"

To further guide the AI, you could ask it to build its response in a formal manner, such as:

"Structure your answer this way: an introduction, your key findings, the pros and cons of the various models and your conclusion. Be sure to provide supporting evidence for each of your findings."

When I gave these prompts to an AI, it created a comprehensive, 500-word analysis of which dishwashers I should consider and which to stay away from. And as far as I could tell, it didn't hallucinate.
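In code, iterative prompting amounts to keeping the whole dialogue in the message list so each answer builds on the last. A minimal sketch, again assuming OpenAI's openai package and an API key; the model name and questions are illustrative.

    from openai import OpenAI  # assumes the openai package and an OPENAI_API_KEY env var

    client = OpenAI()
    messages = []  # the growing conversation -- the "context" Kropp describes

    def ask(question: str) -> str:
        # Append the question, send the full history, then record the answer.
        messages.append({"role": "user", "content": question})
        reply = client.chat.completions.create(model="gpt-4o", messages=messages)
        answer = reply.choices[0].message.content
        messages.append({"role": "assistant", "content": answer})
        return answer

    # Small, direct questions asked one after the other (iterative prompting).
    print(ask("I need a new dishwasher. What are the major features I should consider?"))
    print(ask("Which brands are known for reliability, and which should I avoid?"))
    print(ask("Structure your final recommendation this way: an introduction, your key "
              "findings, the pros and cons of the various models and your conclusion. "
              "Be sure to provide supporting evidence for each of your findings."))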

Direct the AI to known sources

Tell the AI to use certain types of sources, which may keep it from using sketchy, biased or incorrect material.

In my car search, the AIs at times cited random people writing on car-fan websites and Reddit as the sources for their answers. Some of those answers seemed uninformed, misleading or too glowing.

Aside from the quality of the sources, citations can be unreliable. A citation may or may not mean the AI actually used that information in its response, experts say. In fact, AIs generally don't know where the material in their answers came from. Moreover, AIs have been known to cite documents, research or other sources that don't exist.

I repeated my query but told the AIs to stick to professional reviews and named a few sources to use, including Consumer Reports, Car and Driver magazine and the car shopping site Edmunds. I'm not sure if they used them -- experts told me my listing of sources could have guided the AIs to similar, though not identical, material. But the result was more-informed answers that appeared to have few questionable assertions.

Researchers at Johns Hopkins University have found a way to send an AI directly to certain source material in its database. Simply starting a question with the phrase "According to Wikipedia, what is..." for general queries or "According to PubMed, tell me about..." for health-related ones prompted an AI to quote directly from those sources, the researchers said in a report.
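In code, this grounding trick is nothing more than a prefix on the prompt string. A minimal sketch under the same openai-package assumptions as above; the model name and question are illustrative.

    from openai import OpenAI  # assumes the openai package and an OPENAI_API_KEY env var

    client = OpenAI()

    question = "what is freezing rain?"

    # Prefix with a trusted source: "According to Wikipedia" for general queries,
    # "According to PubMed" for health-related ones.
    grounded_prompt = f"According to Wikipedia, {question}"

    response = client.chat.completions.create(
        model="gpt-4o",  # illustrative model name
        messages=[{"role": "user", "content": grounded_prompt}],
    )
    print(response.choices[0].message.content)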

Tell the AI not to make things up

This may sound like a teacher instructing a recalcitrant pupil not to cheat, but some experts say you should instruct the chatbot to "say 'I don't know'" or "don't make up an answer" if it's unsure of something. That might keep it from fabricating a response when the evidence for an answer in its database is murky.

"You're actually giving it permission to do something it's not really trained to do, which is to say 'I'm wrong' or 'I don't know,' " Kropp says.

'Meta-prompt' the AI

Here's another trick to improve your questions: Tell the AI to write them. The technique, called meta-prompting, sounds odd -- like asking students to write the questions for their exam. But experts say it can work.

As an example, I asked OpenAI's ChatGPT to give me the wording for a meteorological question. Note that in the sample prompt it created, the AI tells itself that it is an expert, a technique called giving it a persona. Research has shown that this method can boost the quality, and reduce the errors, of responses.

My prompt to ChatGPT:

"Please create a meta-prompt for this question: Why is it that rain can fall to the ground when the air temperature is below freezing instead of turning into snow or sleet?"

ChatGPT responded:

"You are a meteorology expert. Explain why rain can fall to the ground when the air temperature is below freezing. Your explanation should be clear, concise, and aimed at a general audience with minimal prior knowledge of meteorology. Use simple language and provide examples if possible."

I then fed this prompt to ChatGPT and got back a nontechnical and informative explanation of why rain can fall in below-freezing weather.
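Meta-prompting is easy to script as two calls: one asking the model to write the prompt, and a second feeding that prompt back. A minimal Python sketch, again assuming the openai package and an API key; the model name is illustrative.

    from openai import OpenAI  # assumes the openai package and an OPENAI_API_KEY env var

    client = OpenAI()

    def complete(prompt: str) -> str:
        reply = client.chat.completions.create(
            model="gpt-4o",  # illustrative model name
            messages=[{"role": "user", "content": prompt}],
        )
        return reply.choices[0].message.content

    question = ("Why is it that rain can fall to the ground when the air temperature "
                "is below freezing instead of turning into snow or sleet?")

    # Stage 1: ask the model to write a better prompt for the question.
    improved_prompt = complete(f"Please create a meta-prompt for this question: {question}")

    # Stage 2: feed the model's own prompt back to it.
    print(complete(improved_prompt))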

Use "chain of thought" prompting

Another way to direct an AI is to tell it to answer a question by breaking it down into logical steps. The technique, called chain-of-thought prompting, can lead to more accurate responses, Google researchers found. It also allows you to examine the AI's thought process to look for errors.

Experts have devised complex ways to guide chain-of-thought reasoning, such as giving the AI a sample of the steps it should take. But an easier, though perhaps less effective, technique is to literally start your query with the words "Using chain of thought..." or "Let's think about the answer step by step..."
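The simple version is again just a prefix on the prompt. A minimal sketch under the same openai-package assumptions; the model name and the arithmetic question are illustrative.

    from openai import OpenAI  # assumes the openai package and an OPENAI_API_KEY env var

    client = OpenAI()

    question = ("A dishwasher lists for $899 with a 15% discount, plus 8% sales tax "
                "on the discounted price. What do I pay?")

    # Zero-shot chain of thought: the prefix asks for explicit intermediate steps,
    # which you can then read and check for errors.
    response = client.chat.completions.create(
        model="gpt-4o",  # illustrative model name
        messages=[{"role": "user",
                   "content": f"Let's think about the answer step by step. {question}"}],
    )
    print(response.choices[0].message.content)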

Tell the AI to double-check its work

In another odd twist, you can tell an AI to quiz itself about the accuracy of its responses. The technique, dubbed chain of verification, can reduce hallucinations, according to scientists at Facebook parent Meta who developed it.

Below is a simplified text for instructing an AI to perform the self-questioning, as written by a company called PromptHub. Simply copy and paste this entire block of text into an AI chatbot and add your question to the top.

"Here is the question: [Type your question here]

"First, generate a response.

"Then, create and answer verification questions based on this response to check for accuracy. Think it through and make sure you are extremely accurate based on the question asked.

"After answering each verification question, consider these answers and revise the initial response to formulate a final, verified answer. Ensure the final response reflects the accuracy and findings of the verification process."

When I asked ChatGPT and Google's Gemini a question using this format, they came back with the verification questions they asked themselves about their initial response, their answers to those questions and then their revised final response.
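To script the same thing, you can wrap the PromptHub template around your question before sending it. A minimal sketch under the same openai-package assumptions; the model name and question are illustrative.

    from openai import OpenAI  # assumes the openai package and an OPENAI_API_KEY env var

    client = OpenAI()

    # The simplified chain-of-verification template quoted above.
    COVE_TEMPLATE = """Here is the question: {question}

    First, generate a response.

    Then, create and answer verification questions based on this response to check
    for accuracy. Think it through and make sure you are extremely accurate based
    on the question asked.

    After answering each verification question, consider these answers and revise
    the initial response to formulate a final, verified answer. Ensure the final
    response reflects the accuracy and findings of the verification process."""

    question = ("Which dishwasher brands had the lowest repair rates over the "
                "past five years?")

    response = client.chat.completions.create(
        model="gpt-4o",  # illustrative model name
        messages=[{"role": "user",
                   "content": COVE_TEMPLATE.format(question=question)}],
    )
    print(response.choices[0].message.content)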

Bart Ziegler is a former Wall Street Journal editor. He can be reached at reports@wsj.com.


(END) Dow Jones Newswires

January 30, 2025 11:00 ET (16:00 GMT)

Copyright (c) 2025 Dow Jones & Company, Inc.

