Controversy has erupted because AI Overview can generate false information on a variety of topics, including historical facts, health advice, and cooking instructions. In some instances, it has provided dangerous advice, such as recommending adding harmful substances to food. Part of the problem stems from the AI's reliance on web sources, which can contain low-quality or satirical content. In one notorious example, AI Overview recommended eating at least one rock a day for vitamins and minerals; the advice had been scraped from The Onion, a satirical website.
With respect to health-related searches, several types of problems have arisen:
- Inaccurate Health Advice: The AI provided health advice that contradicted medical guidance or posed potential health risks. For example, some recommendations might discourage users from seeking urgent medical care or misinform them about symptoms and treatments.
- Confusing Source Attribution: AI Overview generated answers that mixed information from reputable medical sources and commercial websites without clearly distinguishing between them. When asked whether chocolate is healthy, the AI combined data from Johns Hopkins Medicine with information from Venchi, a chocolate company.
- Misleading Answers on Serious Conditions: For serious health questions like "Am I having a heart attack?", the critical advice to call emergency services was buried under long lists of symptoms. For stroke symptoms, the AI correctly advised calling 911 immediately, but for heart attack symptoms this urgent advice was far less prominent.
Google has acknowledged these issues, but claims that most AI Overviews provide high-quality information. The company is working to improve accuracy and remove harmful content, but challenges remain. AI Overview is powered by a large language model (LLM) similar to the one that underlies ChatGPT and other chatbots.
LLMs are prone to hallucination (i.e., fabricating false information, albeit at low frequency), can exhibit defects in reasoning (e.g., being tricked by simple brain teasers), and sometimes fail to synthesize information across modalities (e.g., text and images). These problems are not specific to the medical domain; they affect all tasks performed by LLMs.
Some improvements being considered include: filtering the input, so that trolling or nonsensical queries are flagged and receive a separate canned response; filtering the output, to make sure it meets appropriate medical standards such as providing quality references and placing the most critical information at the top; and, finally, improving the LLM itself to reduce hallucinations and other inaccuracies.
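To make these ideas concrete, here is a minimal Python sketch of what such an input/output filtering pipeline might look like. Everything here (the function names, the heuristic checks, the stub model) is a hypothetical illustration, not Google's actual implementation.

```python
# Hypothetical sketch of an input/output moderation pipeline around an
# LLM-generated answer. All names and heuristics are illustrative.

class StubLLM:
    """Stand-in for a real language model; returns a canned answer."""
    def generate(self, query: str) -> str:
        return "Call 911 if you suspect a heart attack. Source: example.org"

def classify_query(query: str) -> str:
    """Input filter: label a query as 'normal' or 'nonsensical'.
    A production system would use a trained classifier; this is a stub."""
    return "nonsensical" if len(query.split()) < 2 else "normal"

def passes_medical_checks(answer: str) -> bool:
    """Output filter: require a reference, and if the answer mentions 911,
    require that advice to come first (most critical information on top)."""
    has_reference = "Source:" in answer
    urgent_first = "911" not in answer or answer.lower().startswith("call 911")
    return has_reference and urgent_first

def answer_query(query: str, llm) -> str:
    if classify_query(query) != "normal":
        # Trolling or nonsensical inputs get a separate canned response.
        return "This query can't be answered reliably."
    draft = llm.generate(query)
    if not passes_medical_checks(draft):
        # Fall back to conventional search results if the draft fails checks.
        return "See the search results below for vetted sources."
    return draft

print(answer_query("Am I having a heart attack?", StubLLM()))
```

The third improvement, retraining or fine-tuning the LLM itself to hallucinate less, happens offline and is not shown in this sketch.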
AI Overview works in concert with Google Knowledge Graph, Google's own massive knowledge base. It is the engine behind many of the rich search results and intelligent features we see in Google Search and other Google products. More specifically, Google Knowledge Graph contains billions of entities (people, places, things, ideas, events, and more) and trillions of relationships among these entities, specifying how they connect to one another.
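At its core, a knowledge graph stores facts as subject-predicate-object triples. The toy Python snippet below illustrates the structure; the entities and relations are invented examples, not actual Google Knowledge Graph content.

```python
# Toy illustration of the triple structure underlying a knowledge graph.
# Entities and relations are invented examples.

triples = [
    ("Measles", "is_a", "infectious disease"),
    ("Measles", "caused_by", "measles virus"),
    ("Measles", "has_symptom", "fever"),
    ("Measles", "has_symptom", "rash"),
    ("Measles", "prevented_by", "MMR vaccine"),
]

def facts_about(entity, kg):
    """Return every (predicate, object) pair known about an entity."""
    return [(p, o) for (s, p, o) in kg if s == entity]

print(facts_about("Measles", triples))
# [('is_a', 'infectious disease'), ('caused_by', 'measles virus'), ...]
```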
In a previous post, I described how traditional Google search leveraged Google Knowledge Graph to understand health queries and to formulate the health knowledge panels that appear on the right side of health-related search results (see Figure 1). These panels may include information on symptoms, causes, treatments, potential risks, and related conditions (along with an illustration). The panels are still there, but much of their information may now be incorporated into the AI Overview, where it can help ground the LLM's response in more factual content.
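One plausible way to implement such grounding is retrieval-augmented generation: facts retrieved from the knowledge graph are prepended to the prompt, so the model answers from verified content rather than from its parametric memory alone. The sketch below illustrates the pattern; it is an assumption for illustration, not a description of Google's pipeline.

```python
# Hypothetical sketch of grounding an LLM answer in knowledge-graph facts.

def grounded_prompt(query: str, kg_facts: list[tuple[str, str]]) -> str:
    """Build a prompt that constrains the LLM to retrieved facts."""
    fact_lines = "\n".join(f"- {pred}: {obj}" for pred, obj in kg_facts)
    return (
        "Answer the question using ONLY the verified facts below.\n"
        f"Verified facts:\n{fact_lines}\n\n"
        f"Question: {query}\nAnswer:"
    )

# Facts as they might be retrieved from a knowledge graph (toy data).
facts = [("has_symptom", "fever"), ("has_symptom", "rash"),
         ("prevented_by", "MMR vaccine")]

print(grounded_prompt("What are the symptoms of measles?", facts))
# The resulting prompt is then sent to the LLM, whose answer is
# anchored to the retrieved facts.
```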
Overall, this new feature can be seen as part of the broader evolution of "Dr. Google". As a reminder, roughly 1 in 20 Google searches are health-related. Just as chatbots can answer run-of-the-mill medical questions from patients in place of doctors (QH), AI Overview can be used to answer sundry health questions from billions of Google users worldwide.
These are clearly bumps in the road, but AI Overview is most likely here to stay. One active area of research in the machine learning field is the collaboration between LLMs (like the one powering AI Overview) and knowledge graphs (KGs) like Google Knowledge Graph. They complement each other's strengths, but their interaction is not always seamless. Here is a breakdown of how LLMs can benefit from KGs:
- Enhanced Factual Accuracy: LLMs can hallucinate information, generating text that sounds plausible but is factually incorrect. KGs, with their structured, curated data, can be used to ground the LLM's outputs in facts and improve their reliability.
- Reasoning and Inference: LLMs struggle with complex reasoning tasks that require traversing relationships between entities. KGs, with their explicit relationships, can facilitate logical deductions and provide contextual information for better reasoning.
- Structured Information Extraction: LLMs can be used to extract information from unstructured text, but structuring this information can be challenging. KGs provide a framework for organizing extracted information into a meaningful and usable format (see the sketch after this list).
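As a minimal illustration of that last point, the hypothetical sketch below asks an LLM to emit subject-predicate-object triples from free text and parses them into KG-ready form. The prompt format, the parsing rules, and the stub model are all assumptions made for illustration.

```python
# Hypothetical sketch of LLM-based structured information extraction:
# free text in, knowledge-graph triples out.

EXTRACTION_PROMPT = (
    "Extract facts from the text as lines of the form "
    "'subject | predicate | object'.\n\nText: {text}\n\nFacts:"
)

class StubExtractor:
    """Stand-in for a real LLM; returns pre-baked extraction output."""
    def generate(self, prompt: str) -> str:
        return ("Measles | caused_by | measles virus\n"
                "Measles | prevented_by | MMR vaccine")

def extract_triples(text: str, llm) -> list[tuple[str, str, str]]:
    raw = llm.generate(EXTRACTION_PROMPT.format(text=text))
    triples = []
    for line in raw.splitlines():
        parts = [p.strip() for p in line.split("|")]
        if len(parts) == 3:  # keep only well-formed triples
            triples.append((parts[0], parts[1], parts[2]))
    return triples

text = "Measles is caused by the measles virus and prevented by the MMR vaccine."
print(extract_triples(text, StubExtractor()))
# [('Measles', 'caused_by', 'measles virus'),
#  ('Measles', 'prevented_by', 'MMR vaccine')]
```

Once in triple form, the extracted facts can be validated against, and merged into, an existing knowledge graph.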
Thanks to these benefits and further tweaks, one can expect the performance of AI Overview to improve, especially on medical topics, so that Dr. Google can more directly answer your health-related questions in a timely, relevant, and accurate fashion.
Figure 1. A Google search for measles now presents an AI Overview, generated by a large language model (LLM), at the top of the page with the links listed below. The knowledge panel is still to the right. There is some overlap between the information in the AI Overview and the information in the knowledge panel because both depend on the Google Knowledge Graph.
