Generative AI has dazzled the world with its ability to produce human-like text, automate customer service, and even draft legal documents. The possibilities seem limitless, but when these systems are left to operate independently, problems can quickly occur.
Behind the excitement lie cautionary tales of AI systems operating without adequate human oversight. Because these models generate responses rather than retrieve verified facts, they can confidently share incorrect information. From false promises to hallucinated information, businesses are learning the hard way that leaving AI unchecked can lead to costly, even legally binding, mistakes.
This post examines several post-2022 failures of generative AI in business contexts and distills lessons on how to keep humans in the loop and avoid similar pitfalls.
Our first example involves a chatbot that promised a refund that didn’t exist. When Jake Moffatt’s grandmother passed away, he used Air Canada’s AI chatbot to ask about bereavement fare rules. Unfortunately, the bot told him he could book a ticket at full price and request a refund within 90 days, advice that contradicted Air Canada’s actual refund policy.
Moffatt took Air Canada to a small claims tribunal, where the airline argued that the chatbot was a “separate legal entity” outside the airline’s control. The tribunal noted that Air Canada gave no reason why customers should not trust the chatbot and ruled in Moffatt’s favor: Air Canada was ordered to pay him a refund of about C$650 plus interest and fees.
This case underscores a pivotal point: if your AI agent speaks on your company’s behalf, your company is responsible for what it says. The airline had to disable the faulty bot and eat the costs of its “advice,” along with a healthy dose of public embarrassment.
In the legal industry, a headline-grabbing fiasco showed how generative AI can lead professionals astray. In mid-2023, two New York lawyers filed a brief in federal court that cited six court cases as precedent, all of which turned out to have been fabricated by ChatGPT.
The outcome was disastrous. Opposing counsel and the judge quickly discovered that the cited cases did not exist. U.S. District Judge P. Kevin Castel sanctioned the lawyers and fined them and their firm $5,000 for acting in bad faith and making “false and misleading statements to the court.”
This “hallucination” (as AI falsehoods are often called) could have been caught with basic human due diligence. Instead, the lawyers learned the hard way that ChatGPT and similar large language models are notorious for their tendency to concoct plausible-sounding, false information. In a business as detail-dependent and high-stakes as law, trusting an unverified AI output led to professional humiliation and tangible penalties.
Media and publishing companies have also experimented with generative AI and faced backlash when the machines went unsupervised. In early 2023, tech site CNET revealed that it had used AI to write dozens of articles on personal finance. The appeal was obvious: pumping out content at scale with minimal human labor. But once readers and journalists scrutinized these AI-written articles, they found stories riddled with factual errors and plagiarism. Generative AI was directly spreading misinformation to consumers in a domain where accuracy is paramount.
CNET’s leadership quickly issued lengthy corrections, added editor’s notes warning readers, and paused the entire AI content program. This is a warning for businesses: automating content creation will backfire without rigorous editorial control. An AI writer doesn’t truly “understand” the facts it discusses; it just stitches together words that statistically sound right, which means it can confidently assert falsehoods. If those falsehoods go live on your site, your credibility and legal liability are on the line.
It’s not just employees or content creators who get into trouble; sometimes the AI product itself is flawed. DoNotPay, which billed itself as “the world’s first robot lawyer,” claimed it could handle an array of legal tasks without a human attorney. Unfortunately, in 2023 the company faced complaints that its AI was producing form letters and legal advice riddled with errors.
The U.S. Federal Trade Commission launched an investigation, and by early 2025, the FTC announced a settlement order. According to the FTC’s complaint, the company had never bothered to test whether its AI’s outputs met the standards of a licensed attorney, nor did it hire legal experts to ensure the AI’s advice was sound. Under the FTC’s order, DoNotPay must pay $193,000 in refunds and is barred from claiming its AI is as good as a human lawyer without evidence backing that up.
This case illustrates a broader lesson: if you’re selling an AI solution, overpromising its abilities is a recipe for legal and reputational disaster. Complex fields like law, medicine, or finance still require human expertise and oversight.
Few industries have more to lose from generative AI mistakes than finance, where errors translate directly into lost money or regulatory violations. Perhaps that’s why, despite the hype, several major banks and financial firms barred employees from using ChatGPT soon after its late-2022 launch. In such a tightly regulated business, even a minor slip-up can be extremely costly, and ChatGPT’s issues with factual accuracy make it unsuitable for unsupervised use.
Financial regulators are equally wary. In 2023, the U.S. Consumer Financial Protection Bureau warned that “when chatbots provide inaccurate information regarding a consumer financial product or service, there is potential to cause considerable harm.” A bot that gives the wrong payoff amount on a loan or misstates an APR could have significant repercussions for the customer, violate consumer protection laws, and trigger lawsuits or fines.
Even tech giants have felt the sting of AI errors. When Google’s new Bard chatbot flubbed a factual question in its first demo, Alphabet’s market value plunged by about $100 billion in a single day as investors panicked. Bard wasn’t even a live product yet, but the episode shows how high the stakes are when your AI makes a public mistake.
We don’t have to stretch our imagination to see the risks. Picture an AI-powered investment advisor mistakenly telling clients that a particular stock is “guaranteed” to rise or a trading algorithm executing orders based on a news headline the AI fabricated. Such scenarios could cost a firm millions.
To avoid disasters, most financial institutions are treading carefully. Those deploying generative models do so in tightly controlled environments with constant human oversight. Industry analysts stress that today’s generative AI models, while impressive, “don’t know what they don’t know” and can fail unpredictably when pushed into novel scenarios. As one report cautioned, these models are designed to sound convincing, as if solid reasoning sat behind every answer, when in reality there is no genuine understanding at work, only less-than-perfect data.
The real-world cases above drive home a common theme: generative AI systems need human supervision, especially in high-stakes business contexts. The practical takeaways for organizations implementing generative AI are straightforward: treat every statement an AI makes to a customer as a statement by the company; verify AI-generated facts, figures, and citations before they reach a customer, a court, or a publication; keep qualified editors and domain experts reviewing AI output in regulated fields; and never market an AI product’s capabilities beyond what testing can actually support. A minimal sketch of what that oversight can look like in practice follows.
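To make the point concrete, here is a minimal, hypothetical sketch (in Python) of a human-in-the-loop gate for a customer-service bot: the model drafts a reply, simple policy checks flag anything that commits the company to refunds, guarantees, or legal positions, and flagged drafts are held for a human reviewer instead of being sent automatically. Every name in it (generate_draft, triage, dispatch, RISKY_PHRASES) is illustrative and not drawn from any vendor’s API or from the systems discussed above.

```python
# Minimal human-in-the-loop gate: the AI drafts a reply, but nothing reaches
# the customer until it passes policy checks or a human reviewer approves it.
# All names here are illustrative placeholders, not a real vendor API.

from dataclasses import dataclass, field


@dataclass
class Draft:
    customer_query: str
    ai_text: str
    needs_review: bool = False
    reasons: list = field(default_factory=list)


# Phrases that commit the company to something and therefore demand human sign-off.
RISKY_PHRASES = ("refund", "guarantee", "we promise", "legally", "within 90 days")


def generate_draft(query: str) -> str:
    """Stand-in for a call to a generative model; returns a canned reply here."""
    return f"Thanks for asking about '{query}'. You can request a refund within 90 days."


def triage(query: str) -> Draft:
    """Draft a reply with AI, then flag anything a human must approve before sending."""
    text = generate_draft(query)
    draft = Draft(customer_query=query, ai_text=text)
    for phrase in RISKY_PHRASES:
        if phrase in text.lower():
            draft.needs_review = True
            draft.reasons.append(f"contains risky phrase: '{phrase}'")
    return draft


def dispatch(draft: Draft) -> str:
    """Send low-risk drafts automatically; route everything else to a person."""
    if draft.needs_review:
        return f"HELD FOR HUMAN REVIEW ({'; '.join(draft.reasons)})"
    return f"SENT TO CUSTOMER: {draft.ai_text}"


if __name__ == "__main__":
    print(dispatch(triage("bereavement fare rules")))
```

The specific checks would of course be tailored to each business; the point is simply that once an AI-drafted reply touches policy-sensitive territory, it never reaches a customer unreviewed.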
Generative AI is transforming business operations, but these technologies are far from infallible. As we’ve seen, an unsupervised AI can produce a catastrophic falsehood as easily as a brilliant insight. The difference between the two often comes down to whether a human is in the loop to verify and guide the AI’s actions. Companies that blindly trust generative AI outputs, or, worse, allow an AI to make promises on their behalf, risk financial loss, legal liability, and reputational damage. On the other hand, organizations that harness generative AI’s power while maintaining human oversight can reap the benefits and mitigate the risks. The equation is simple: Human + AI > AI alone. By coupling the creativity and efficiency of generative AI with the judgment and accountability of human experts, businesses can innovate safely and responsibly in the new AI era.