Big data throws big biases into machine learning data sets

Posted on: Feb 16, 2018

AI holds massive potential for good, but it can also amplify harmful outcomes if data scientists don't recognize and correct biases in machine learning data sets.

Say you're training an image recognition system to identify U.S. presidents. Every example in the historical data is male, so the algorithm concludes that only men are presidents. It won't recognize a woman in that role, even though that's a plausible outcome in a future election.
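To make the mechanism concrete, here is a minimal sketch in Python of a classifier trained on one-sided historical labels. The feature encoding and the toy data are hypothetical, invented purely for illustration; the point is that when every positive example shares an attribute, the model treats that attribute as a requirement.

```python
# Toy illustration of latent bias: every positive training example is male,
# so the model learns gender as a proxy for eligibility.
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical features: [is_male, years_in_politics]; label 1 = became president
X_train = np.array([
    [1, 20], [1, 12], [1, 30], [1, 8],   # all positive examples are male
    [0, 25], [0, 15], [1, 3],  [0, 30],  # negatives include experienced women
])
y_train = np.array([1, 1, 1, 1, 0, 0, 0, 0])

model = LogisticRegression().fit(X_train, y_train)

# Two candidates identical in every respect except the gender flag:
print(model.predict_proba([[1, 35]])[0][1])  # male record scores high
print(model.predict_proba([[0, 35]])[0][1])  # identical female record scores far lower
```

Nothing in the label definition says a president must be male; the skew in the historical sample alone is enough to produce the discriminatory rule.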

This latent bias is one of the many types of bias that challenge data scientists today. If the machine learning data set used in an AI project isn't neutral -- and it's safe to say almost no data is -- the outcomes can actually amplify whatever bias and discrimination that data contains.

Visual recognition technologies that label images require vast amounts of labeled data, much of which comes from the web. You can imagine the dangers in that -- and researchers at the University of Washington and the University of Virginia documented one striking example of gender bias in a recent report.

They found that when a visual semantic role labeling system sees a spatula, it labels the utensil as a cooking tool, but it is also likely to label the person holding that tool as a woman -- even when the image shows a man. Without properly quantifying and reducing this type of correlation, machine learning tools will magnify stereotypes, the researchers concluded.
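The quantification the researchers call for can be sketched with a simple ratio. Assuming a list of (activity, gender) annotations -- the helper function and the counts below are hypothetical stand-ins, not the paper's actual corpus -- one can compare how skewed an activity's gender split is in the training labels versus in the model's predictions; a widening gap is bias amplification.

```python
from collections import Counter

def female_ratio(pairs, activity):
    """Fraction of annotations for `activity` whose agent is labeled a woman."""
    counts = Counter(gender for act, gender in pairs if act == activity)
    total = counts["woman"] + counts["man"]
    return counts["woman"] / total if total else 0.0

# Hypothetical annotation lists: (activity, gender of the labeled agent)
training_labels   = [("cooking", "woman")] * 66 + [("cooking", "man")] * 34
model_predictions = [("cooking", "woman")] * 84 + [("cooking", "man")] * 16

train_bias = female_ratio(training_labels, "cooking")    # 0.66
pred_bias  = female_ratio(model_predictions, "cooking")  # 0.84

print(f"training skew:   {train_bias:.2f}")
print(f"prediction skew: {pred_bias:.2f}")
print(f"amplification:   {pred_bias - train_bias:+.2f}")
```

The researchers' published metric and their corpus-level debiasing method are more involved, but this training-versus-prediction gap captures the core quantity being measured.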

So while AI projects hold immense benefits, companies evaluating AI initiatives must also understand the dangers of building systems that deliver biased results. In its November 2017 report, "Predicts 2018: Artificial Intelligence," Gartner warned that data bias can have devastating and highly public impacts on AI outcomes, and it cautioned executives to ensure accountability and transparency in their methodologies.