AI hallucinations can influence search results and other AI, creating a dangerous feedback loop

Why it matters: Since the emergence of generative AI and large language models, some have warned that AI-generated output could eventually influence subsequent AI-generated output, creating a dangerous feedback loop. We now have a documented case of such an occurrence, further highlighting the risk to the emerging technology field.

While attempting to cite examples of false information from hallucinating AI chatbots, a researcher inadvertently caused another chatbot to hallucinate by influencing ranked search results. The incident reveals the need for further safeguards as AI-enhanced search engines proliferate.

Earlier this year, information science researcher Daniel S. Griffin posted on his blog two examples of chatbot misinformation concerning influential computer scientist Claude E. Shannon. Griffin also included a disclaimer noting that the chatbots’ information was untrue, intended to dissuade machine scrapers from indexing it, but it wasn’t enough.

Griffin eventually discovered that multiple chatbots, including Microsoft’s Bing and Google’s Bard, had referenced the hallucinations he’d posted as if they were true, ranking them at the top of their search results. When asked specific questions about Shannon, the bots used Griffin’s warning as the basis for a consistent but false narrative, attributing a paper to Shannon that he never wrote. More worryingly, the Bing and Bard results offered no indication that their sources originated from LLMs.

Oops. It looks like my links to chat results for my Claude Shannon hallucination test have poisoned @bing. pic.twitter.com/42lZpV12PY

– Daniel Griffin (@danielsgriffin) September 29, 2023

The situation is similar to cases where people paraphrase or quote sources out of context, leading to misinformed research. Griffin’s case shows that generative AI models can automate that mistake at a frightening scale.

Microsoft has since corrected the error in Bing and hypothesized that the problem is more likely to occur when dealing with subjects where relatively little human-written material exists online. Another reason the precedent is dangerous is that it presents a theoretical blueprint for bad actors to intentionally weaponize LLMs to spread misinformation by influencing search results. Hackers have been known to deliver malware by tuning fraudulent websites to attain top search result rankings.

The vulnerability echoes a warning from June suggesting that as more LLM-generated content fills the web, it will be used to train future LLMs. The resulting feedback loop could dramatically erode AI models’ quality and trustworthiness in a phenomenon called “Model Collapse.”

Companies working with AI should ensure training continually prioritizes human-made content. Preserving less well-known information and material made by minority groups could help combat the problem.
