Google’s AI Chatbot Bard Fails To Answer Basic US College Exam Questions: Report
Google’s artificial intelligence (AI) chatbot Bard has been making headlines since its release. Alphabet lost $100 billion in market value after its new chatbot shared inaccurate information in a promotional video at a company event in early February. As per a report in Fortune, Sundar Pichai, the CEO of the tech company, seems relaxed about how far the company’s artificial intelligence models need to advance. He said in a letter to the entire staff that Bard is still in its early development: “As more people start to use Bard and test its capabilities, they’ll surprise us. Things will go wrong.” Bard is now being tested by the general public, whereas before it was primarily being used by Google employees.
Fortune recently tested the AI chatbot’s knowledge ahead of the upcoming SAT, a standardised test widely used for college admissions in the United States. The exam mainly tests reading, writing and mathematics skills.
However, the outlet noted that once they logged in, a message appeared: “Bard will not always get it right. Bard may give inaccurate or inappropriate responses. When in doubt, use the ‘Google it’ button to check Bard’s responses. Bard will get better with your feedback. Please rate responses and flag anything that may be offensive or unsafe.”
Fortune obtained sample SAT maths questions from internet study materials and discovered that Bard answered anywhere between 50 per cent and 75 per cent of them incorrectly, even when multiple-choice options were offered. When the same question was posed again, it provided answers that were not even among the multiple-choice options.
When Bard was launched, it was asked several questions, one of which was how to explain to a nine-year-old what the James Webb Space Telescope has found. Bard responded that the James Webb Space Telescope took the “very first pictures of a planet outside of our own solar system”, despite NASA’s confirmation that the Very Large Telescope in Chile, a ground-based array, obtained the first image of an exoplanet in 2004 and identified it as such in 2005.
Furthermore, in its first written language test with Fortune, Bard got around 30 per cent of the answers correct, and it often needed to be asked a question twice before it understood.
Even when the answer was wrong, “Bard’s tone is confident” as it frequently framed responses as “The correct answer is”, which is a common feature of large language models, as per the outlet.
In reading tests, Bard fared better than it did in maths, getting about half the answers right.
All in all, Bard scored 1200 points, a score that would get a student into the likes of Howard University, San Diego State University and Michigan State University.
A Google spokesperson told Fortune: “Bard is experimental, and some of the responses may be inaccurate, so double-check information in Bard’s responses. With your feedback, Bard is getting better every day. Before Bard launched publicly, thousands of testers were involved to provide feedback to help Bard improve its quality, safety, and accuracy.
“Accelerating people’s ideas with generative A.I. is truly exciting, but it’s still early days, and Bard is an experiment. While Bard has built-in safety controls and clear mechanisms for feedback in line with our A.I. Principles, be aware that it may display inaccurate information.”