Microsoft and Google spent last week trying to outdo one another in demonstrating early versions of artificial intelligence-powered search, and Microsoft said that more than one million users signed up to try its tool in the first 48 hours.
Satya Nadella, CEO of Microsoft, recently told CNBC that the technology was “perhaps the industrial revolution brought to knowledge work” since it can generate fully formed replies that seem like they were written by a human.
However, the AI has a lot of room for improvement in terms of accuracy. In a demonstration for the press, the ChatGPT-like technology in Microsoft’s Bing search engine analyzed earnings reports from Gap and Lululemon. When checked against the actual reports, the chatbot got several key figures wrong, and some appear to be completely fabricated.
“Bing AI got some answers completely wrong during their demo. But no one noticed,” wrote independent search researcher Dmitri Brereton in a Substack post on Monday. “Instead, everyone jumped on the Bing hype train.”
In addition to the financial errors, Brereton also noticed possible factual mistakes in the Microsoft presentation’s answers about vacuum cleaner specifications and travel plans for Mexico. He told CNBC he wasn’t actively looking for mistakes but found them while doing research for an article comparing Microsoft’s and Google’s AI presentations.
Artificial intelligence researchers use the term “hallucination” to describe the tendency of tools built on top of large language models to invent information on the fly. The artificial intelligence technology unveiled by Google last week also contained inaccuracies, but those were swiftly pointed out by users.
After the success of OpenAI’s ChatGPT, which was released to the public in November, both companies are racing to add new types of generative AI to search engines and are keen to show off their progress. OpenAI has raised billions from Microsoft, and competitors such as Stability AI and Hugging Face have been valued at over a billion dollars in private fundraising rounds.
Microsoft’s announcement last week highlighted plans to release the technology to a portion of the public in the short term, whereas Google has been hesitant to include AI-generated responses in search engines, citing reputational damage and safety concerns.
“I think it’s important not to be in a lab,” Nadella said. “You have to get these things out safely.” In the demo of Bing AI’s response to a question about a company’s earnings, there were a few hiccups.
Microsoft executive Yusuf Mehdi went to the investor relations site for Gap and asked Bing AI to summarize the “key takeaways” from the company’s third-quarter earnings announcement in November.
“Very cool. A massive time savings,” Mehdi said.
Screenshots from Microsoft’s demo:
Several errors can be found in the following portion of the summary:
The company’s gross margin was stated to be 37.4%. When Yeezy-related costs were taken out of the equation, the adjusted gross margin increased to 38.7%.
Gap’s operating margin was 4.6%, not 5.9%, a figure that does not appear in the report. The report’s adjusted diluted EPS is $0.71, not $0.42. The amount stated by Gap factored in an adjusted income tax benefit of around $0.33.
In its third-quarter report, Gap indicated that “net sales could be down mid-single digits year-over-year in the fourth quarter.” This followed the company’s August withdrawal of its full-year outlook. That would indicate a decline in sales for the entire year rather than “growth in the low double digits.” Gap gave no forecast for operating margin or earnings per share.
Microsoft said it is aware of the problems and that mistakes by Bing AI are to be expected.
“We’re aware of this report and have analyzed its findings in our efforts to improve this experience,” a Microsoft spokesperson told CNBC. “We recognize that there is still work to be done and are expecting that the system may make mistakes during this preview period, which is why the feedback is critical so we can learn and help the models get better.”
Mehdi then asked Bing AI to compare Gap’s financials with Lululemon’s results and to compile the data from the two reports into a single table.
“Look how amazing this is,” he said. “Just like that, in one table, I can get an answer to this question. Think how much time that would’ve taken otherwise.”
What the Bing AI tool found is as follows: