Global Big Data Conference

Big Data Meltdown: How Unclean, Unlabeled, and Poorly Managed Data Dooms AI

Posted on: Jun 14, 2019

We may be living in the fourth industrial age and on the cusp of huge advances in automation powered by AI. But according to the latest data, our great future will be less rosy if enterprises don’t start doing something about one thing in particular: the poor state of data.

That’s the gist of several reports that have made the rounds recently, as well as interviews with industry experts. Time after time, the lack of clean, well-managed, and labeled data was cited as a major impediment to enterprises getting value out of AI.

Last month, Figure Eight (formerly CrowdFlower) released a study about the state of AI and machine learning. The company, which helps generate training data for customers, found a decided lack of data ready to be used to train machine learning algorithms.

The study found that only 21% of respondents said their data is both ready for AI (that is, organized, accessible, and annotated) and being used for that purpose. Another 15% reported that their data is organized, accessible, and annotated, but is either not being used or is being used for other business purposes.

Alegion, which is also in the data labeling business, released its own study that came to a remarkably similar conclusion. The study also found that data quality and labeling issues had negatively impacted nearly four out of five AI and machine learning projects.

“The nascency of enterprise AI has led more than half of the surveyed companies to label their training data internally or build their own data annotation tool,” the company stated. “Unfortunately, 8 out of 10 companies indicate that training AI/ML algorithms is more challenging than they expected, and nearly as many report problems with projects stalling.”
