Insights from data and machine learning algorithms can be invaluable. However, it’s also important to understand your tools, know your data, and keep your organization’s values firmly in mind, because mistakes with data can cost you reputation, revenue, or even lives. Here are a handful of high-profile analytics and AI blunders from the past decade to illustrate what can go wrong.
Loss of thousands of COVID-19 cases by exceeding spreadsheet data limit
In October 2020, Public Health England (PHE), the UK government body responsible for tallying new COVID-19 infections, revealed that nearly 16,000 coronavirus cases went unreported between Sept. 25 and Oct. 2. The culprit? Data limitations in Microsoft Excel.
The UK government used an automated process to transfer positive COVID-19 lab results as CSV files into Excel templates used by reporting dashboards and for contact tracing. Unfortunately, an Excel worksheet can hold at most 1,048,576 rows and 16,384 columns. Moreover, PHE was listing cases in columns rather than rows, so when the number of cases exceeded the column limit, Excel cut off the records that did not fit. This technical “glitch” didn’t prevent individuals who got tested from receiving their results, but it did hamper contact-tracing efforts.
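A simple guard in the transfer step could have surfaced the problem before any records were lost. The sketch below is only an illustration, not PHE's actual pipeline: the function name `check_fits_in_worksheet` and its `one_record_per` parameter are hypothetical, and the limits are the worksheet figures quoted above.

```python
# Worksheet limits as quoted in the text above.
EXCEL_MAX_ROWS = 1_048_576
EXCEL_MAX_COLS = 16_384


def check_fits_in_worksheet(n_records: int, one_record_per: str = "row") -> None:
    """Raise ValueError if n_records cannot fit along the chosen axis.

    Hypothetical helper: one_record_per says whether each record
    occupies one row or one column of the worksheet.
    """
    limits = {"row": EXCEL_MAX_ROWS, "column": EXCEL_MAX_COLS}
    if one_record_per not in limits:
        raise ValueError(f"unknown layout: {one_record_per!r}")
    limit = limits[one_record_per]
    if n_records > limit:
        raise ValueError(
            f"{n_records} records exceed the {limit} {one_record_per} limit; "
            f"{n_records - limit} would be lost"
        )


# 20,000 records fit comfortably when each record is a row...
check_fits_in_worksheet(20_000, one_record_per="row")

# ...but not when each record is a column.
try:
    check_fits_in_worksheet(20_000, one_record_per="column")
except ValueError as err:
    print(err)
```

Failing loudly here is the point: an error that stops the export is far cheaper than a report that quietly omits cases.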
Dataset trained chatbot to spew racist tweets
Back in March 2016, Microsoft learned that using Twitter interactions as training data for machine learning algorithms may not be the best idea after all. The company released Tay, an AI chatbot, on the social media platform. The idea was that the chatbot would assume the persona of a teen girl and interact with individuals via Twitter using a combination of machine learning and natural language processing. What Microsoft did not expect was that within 16 hours, the chatbot would post more than 95,000 tweets, and that those tweets would rapidly turn overtly racist, misogynist, and anti-Semitic.
AI-enabled recruitment tool only recommended men
Like many large organisations, Amazon is always on the hunt for tools that can help its HR team screen thousands of résumés and job applications for the best candidates. In 2014, Amazon started working on AI-powered recruiting software to do just that. The software gave candidates star ratings from 1 to 5. The machine learning models at the heart of the system were trained on 10 years’ worth of résumés submitted to Amazon, most of which came from men. As a result, the system penalised résumés that indicated the applicant was a woman and strongly preferred male candidates.
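One basic precaution is to look at how each group is represented in the training data before building a model on it. The sketch below is a minimal illustration with made-up labels; the function names and the 30% threshold are assumptions chosen for the example, and a balanced dataset alone does not guarantee a fair model.

```python
from collections import Counter


def group_shares(groups):
    """Return each group's share of the records as a fraction of the total."""
    counts = Counter(groups)
    total = sum(counts.values())
    return {group: n / total for group, n in counts.items()}


def underrepresented(groups, min_share=0.3):
    """List groups whose share falls below min_share (an arbitrary threshold)."""
    shares = group_shares(groups)
    return sorted(g for g, share in shares.items() if share < min_share)


# Made-up example: 8 of 10 training résumés come from men.
sample = ["men"] * 8 + ["women"] * 2
print(group_shares(sample))
print(underrepresented(sample))
```

A check like this only flags the imbalance; deciding how to correct it, or whether the data should be used at all, remains a human decision.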
There you have it. Data is often called the most valuable resource to own (more so than oil), but like oil, data and analytics have their dark side. That is why you should equip yourself with the right knowledge, tools and skills before embarking on a data science career, and Hackwagon’s Data Science 101 can help you do just that. Contact us today to find out more!