How Hallucinations Could Help AI Understand You Better

Visual clues can aid translations

By Sascha Brodsky, Senior Tech Reporter | Published June 22, 2022, 10:32 a.m. EDT | Fact checked by Jerri Ledford

A new machine learning model hallucinates an image of what a sentence describes to aid translation. The AI system, called VALHALLA, was designed to mimic the way humans perceive language. The new system is part of a growing movement to use AI to understand language.

Oscar Wong / Getty Images

The human method of visualizing pictures while translating words could help artificial intelligence (AI) understand you better.

A new machine learning model hallucinates an image of what a sentence describes. According to a recent research paper, the technique then uses that visualization, along with other clues, to aid translation. It's part of a growing movement to use AI to understand language.

"How people talk and write is unique because we all have slightly different tones and styles," Beth Cudney, a professor of data analytics at Maryville University who was not involved in the research, told Lifewire in an email interview. "Understanding context is difficult because it is like dealing with unstructured data. This is where natural language processing (NLP) is useful. NLP is a branch of AI that addresses the differences in how we communicate using machine reading comprehension. The key difference is that NLP, as a branch of AI, does not focus simply on the literal meanings of the words we speak or write. It looks at the meaning."

Go Ask Alice

The new AI system, called VALHALLA, created by researchers from MIT, IBM, and the University of California at San Diego, was designed to mimic the way humans perceive language. According to the scientists, using sensory information, like multimedia, paired with new and unfamiliar words, like flashcards with images, improves language acquisition and retention.

The team claims their method improves the accuracy of machine translation over text-only translation. The scientists used an encoder-decoder architecture with two transformers, a type of neural network model suited for sequence-dependent data, like language, that can pay attention to keywords and the semantics of a sentence. One transformer generates a visual hallucination, and the other performs multimodal translation using the first transformer's outputs.

"In real-world scenarios, you might not have an image with respect to the source sentence," Rameswar Panda, one of the research team members, said in a news release. "So, our motivation was basically: Instead of using an external image during inference as input, can we use visual hallucination—the ability to imagine visual scenes—to improve machine translation systems?"
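To make the two-transformer idea above concrete, here is a minimal PyTorch sketch. It is not the authors' code: the class names, layer counts, dimensions, and the fixed grid of discrete "visual tokens" are all illustrative assumptions, showing only the shape of the design (one transformer imagines visual tokens from the source text; a second translates using text plus those imagined tokens).

    import torch
    import torch.nn as nn

    D_MODEL, N_HEAD = 256, 8  # illustrative sizes, not the paper's settings

    def enc_layer():
        return nn.TransformerEncoderLayer(D_MODEL, N_HEAD, batch_first=True)

    def dec_layer():
        return nn.TransformerDecoderLayer(D_MODEL, N_HEAD, batch_first=True)

    class Hallucinator(nn.Module):
        # Transformer 1: predicts discrete "visual tokens" from the source
        # sentence, standing in for an image of what the sentence describes.
        def __init__(self, src_vocab=1000, visual_vocab=512, n_visual=16):
            super().__init__()
            self.embed = nn.Embedding(src_vocab, D_MODEL)
            self.encoder = nn.TransformerEncoder(enc_layer(), num_layers=2)
            self.queries = nn.Parameter(torch.randn(n_visual, D_MODEL))
            self.cross = nn.TransformerDecoder(dec_layer(), num_layers=2)
            self.head = nn.Linear(D_MODEL, visual_vocab)

        def forward(self, src_ids):
            memory = self.encoder(self.embed(src_ids))              # (B, S, D)
            queries = self.queries.expand(src_ids.size(0), -1, -1)  # (B, V, D)
            return self.head(self.cross(queries, memory)).argmax(-1)

    class MultimodalTranslator(nn.Module):
        # Transformer 2: translates using the source text plus the
        # hallucinated visual tokens as one joint multimodal context.
        def __init__(self, src_vocab=1000, tgt_vocab=1000, visual_vocab=512):
            super().__init__()
            self.src_embed = nn.Embedding(src_vocab, D_MODEL)
            self.vis_embed = nn.Embedding(visual_vocab, D_MODEL)
            self.tgt_embed = nn.Embedding(tgt_vocab, D_MODEL)
            self.encoder = nn.TransformerEncoder(enc_layer(), num_layers=2)
            self.decoder = nn.TransformerDecoder(dec_layer(), num_layers=2)
            self.head = nn.Linear(D_MODEL, tgt_vocab)

        def forward(self, src_ids, visual_ids, tgt_ids):
            multimodal = torch.cat(
                [self.src_embed(src_ids), self.vis_embed(visual_ids)], dim=1)
            memory = self.encoder(multimodal)
            return self.head(self.decoder(self.tgt_embed(tgt_ids), memory))

    # At inference time no real image is needed: hallucinate, then translate.
    src = torch.randint(0, 1000, (1, 7))   # a toy source sentence (token ids)
    tgt = torch.randint(0, 1000, (1, 5))   # decoder input for teacher forcing
    visual = Hallucinator()(src)           # imagined visual tokens
    print(MultimodalTranslator()(src, visual, tgt).shape)  # (1, 5, 1000)

The point the sketch captures is the one Panda describes: because the second model consumes hallucinated visual tokens rather than a real photo, translation needs no external image at inference time.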
AI Understanding

Considerable research is focused on advancing NLP, Cudney pointed out. For example, Elon Musk co-founded OpenAI, which is working on GPT-3, a model that can converse with a human and is savvy enough to generate software code in Python and Java. Google and Meta are also developing conversational AI; Google's system is called LaMDA.

"These systems are increasing the power of chatbots that are currently only trained and capable of specific conversations, which will likely change the face of customer support and help desks," Cudney said.

Aaron Sloman, the co-founder of CLIPr, an AI tech company, said in an email that large language models like GPT-3 can learn from very few training examples and improve at summarizing text based on human feedback. For instance, he said, you can give a large language model a math problem and ask the AI to think step by step (a technique sketched in the example at the end of this article).

"We can expect greater insights and reasoning to be extracted from large language models as we learn more about their abilities and limitations," Sloman added. "I also expect these language models to create more human-like processes as modelers develop better ways to fine-tune the models for specific tasks of interest."

Georgia Tech computing professor Diyi Yang predicted in an email interview that we will see more use of NLP systems in our daily lives, ranging from NLP-based personalized assistants that help with emails and phone calls to knowledgeable dialogue systems for information-seeking in travel or healthcare. "As well as fair AI systems that can perform tasks and assist humans in a responsible and bias-free manner," Yang added.

Enormous AI models with billions of parameters, such as GPT-3 and DeepText, will continue working toward a single model for all language applications, predicted Stephen Hage, a machine learning engineer at Dialexa, in an email interview. He said there will also be new types of models created for specific uses, such as voice-commanded online shopping.

"An example might be a shopper saying 'Show me this eyeshadow in midnight blue with more halo' to show that shade on the person's eyes with some control over how it's applied," Hage added.
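The "think step by step" technique Sloman mentions is often called chain-of-thought prompting. As a rough illustration, here is a minimal Python sketch using OpenAI's chat API; the model name and the setup around it are assumptions about a reader's environment, not details from the article.

    from openai import OpenAI

    client = OpenAI()  # assumes an OPENAI_API_KEY is set in the environment

    prompt = (
        "Q: A train travels 60 miles per hour for 2.5 hours. "
        "How far does it go?\n"
        "A: Let's think step by step."
    )

    response = client.chat.completions.create(
        model="gpt-3.5-turbo",  # an assumption; any chat-capable model works
        messages=[{"role": "user", "content": prompt}],
    )
    print(response.choices[0].message.content)

Ending the prompt with "Let's think step by step" nudges the model to write out intermediate reasoning (60 miles per hour times 2.5 hours equals 150 miles) before the final answer, which is the behavior Sloman describes.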