Air Canada had to refund a customer after its customer service AI hallucinated

In February 2024, a Canadian tribunal ordered Air Canada to partially refund a ticket purchased by one of its customers. The customer had asked the airline's customer support AI bot under which conditions Air Canada reimburses tickets bought in an emergency after the death of a loved one. The bot answered by hallucinating a refund policy that did not exist.

Air Canada offered the customer a voucher for 200 Canadian dollars, but the customer refused and took the case to the tribunal. Air Canada contested the complaint, arguing that the customer should have relied on the web pages describing the actual refund policy rather than on the bot's answers. The tribunal was clearly not receptive to that argument and held that the bot's answers carry the same legal weight as the pages written by humans.

Following this ruling, Air Canada appears to have taken its bot offline, despite its significant setup cost. The bot was meant to reduce customer service costs and improve service quality.

Two headlines covering the case:
"Air Canada must honor refund policy invented by airline's chatbot"
"Air Canada appears to have quietly killed its costly chatbot support"

From my point of view, what is interesting in this anecdote is the unreasonable trust Air Canada placed in its bot. It is well documented, including in the scientific literature, that AIs based on large language models such as ChatGPT are prone to hallucinations: they tend to invent information that does not exist. Yet Air Canada seems to have acted as if this documented limitation did not exist, or was minor enough not to pose a problem.

On Threads, Gergely Orosz mentions a similar anecdote in these two posts:

I enjoy hearing companies use GenAI / LLMs as experiments (that can fail!) to improve developer productivity.

Lately, I'm hearing more stories of even large companies where leadership is treating it as a (desperate) solution that must succeed in increasing productivity.

Like there's ~$10B company, losing money big time, where they are pushing devs to dump what they know into the wiki; and hope their internal LLM can scoop it up and e.g. launch new features in new regions, autonomously, and without the need to have a dev involved.

Ugh.

Anyone who has ever asked a generative AI to write computer code knows that the generated code has to be reviewed meticulously. Often, the code looks like it should run, but it does not. Or it runs, but does not do what it is supposed to do.
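To make that concrete, here is a minimal, entirely hypothetical sketch: the policy, the function name, and the numbers are invented for illustration and are not taken from Air Canada or from Orosz's posts. It shows code that runs and looks plausible, yet does not do what it is supposed to do, which is exactly why human review remains necessary.

```python
# Hypothetical illustration: a small function of the kind a code assistant
# might generate. It runs without error and looks plausible, but a boundary
# condition is wrong -- the kind of bug only careful review or a test catches.

def refund_amount(fare: float, days_since_purchase: int) -> float:
    """Suppose the policy is: refund 80% of the fare if claimed within 90 days (inclusive)."""
    # Plausible-looking, but '<' silently excludes day 90.
    if days_since_purchase < 90:
        return round(fare * 0.8, 2)
    return 0.0

print(refund_amount(500.0, 30))  # 400.0 -- looks fine
print(refund_amount(500.0, 90))  # 0.0   -- wrong under the stated policy
```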

As with Air Canada and its customer service bot, it is wishful thinking to believe that a technology so prone to hallucination can be used to develop new features without human intervention. In the future, generative AIs may be able to write sufficiently reliable code, or stop hallucinating refund policies that do not exist. But in its current form, the technology is not capable of that.

Generative AIs deserve better than moral panics. But neither do they deserve to be treated as miracle solutions while their well-documented limitations are ignored. Doing so risks costly mistakes that are easy to avoid. I suspect that Air Canada will not be the only organization to make this kind of mistake.
