Bias in machine learning examples: Policing, banking, COVID-19

Posted on: Aug 26, 2020

Human bias, missing data, data selection, data confirmation, hidden variables and unexpected crises can contribute to distorted machine learning models, outcomes and insights.

Relying on tainted, inherently biased data to make critical business decisions and formulate strategies is tantamount to building a house of cards. Yet, recognizing and neutralizing bias in machine learning data sets is easier said than done because bias can come in many forms and in various degrees.

Human bias, one of the more common forms of bias in machine learning, can be introduced during the data collection, prepping and cleansing phases, as well as during the model building, testing and deployment phases. And there's no shortage of examples.

Some U.S. cities have adopted predictive policing systems to optimize their use of resources. A recent study by New York University's AI Now Institute focused on the use of such systems in Chicago, New Orleans and Maricopa County, Ariz. The report found that biased police practices are reflected in biased training data.

"[N]umerous jurisdictions suffer under ongoing and pervasive police practices replete with unlawful, unethical and biased conduct," the report observed. "This conduct does not just influence the data used to build and maintain predictive systems; it supports a wider culture of suspect police practices and ongoing data manipulation."

A troubling aspect is the feedback loop that has been created. Since police behavior is mirrored in the training data, the predictive systems anticipate that more crime will occur in the very neighborhoods that have been disproportionately targeted in the first place, regardless of the true crime rate.
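To make that feedback loop concrete, here is a minimal, hypothetical simulation; it is not drawn from the AI Now study, and the neighborhood names, starting counts and proportional allocation rule are all illustrative assumptions. Two neighborhoods have identical underlying crime rates, but one starts with more recorded incidents because it was patrolled more heavily, and a system that assigns patrols in proportion to past reports keeps sending more patrols there, which generates still more reports.

```python
import random

random.seed(0)

# Two neighborhoods with the same underlying crime rate (hypothetical).
TRUE_CRIME_RATE = {"A": 0.10, "B": 0.10}

# Neighborhood A starts with more recorded incidents only because it was
# historically patrolled more heavily (illustrative starting counts).
reported = {"A": 30, "B": 10}

TOTAL_PATROLS = 100  # patrols available each year

for year in range(1, 6):
    total_reports = sum(reported.values())
    # The "predictive" system allocates patrols in proportion to past reports.
    patrols = {n: round(TOTAL_PATROLS * reported[n] / total_reports)
               for n in reported}
    # More patrols mean more of the (identical) underlying crime gets recorded.
    for n in reported:
        observed = sum(random.random() < TRUE_CRIME_RATE[n]
                       for _ in range(patrols[n]))
        reported[n] += observed
    print(f"year {year}: patrols={patrols}, cumulative reports={reported}")
```

Running the sketch shows neighborhood A's cumulative reports pulling further ahead of B's each year, even though both neighborhoods have exactly the same true crime rate.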

Data bias often results in discrimination -- a huge ethical issue. At the same time, organizations of all types across various industries need to make distinctions between groups of people -- for example, who are the best and worst customers, who is likely or unlikely to pay bills on time, or who is likely or unlikely to commit a crime.

"It's extremely hard to make sure that you have nothing discriminatory in there anymore," said Michael Berthold, CEO of data science platform provider KNIME. "If I take out where you come from, how much you earn, where you live, your education [level] and I don't know what else about you, there's nothing left that allows me to discriminate between you and someone else." View More