Walking With AI: How to Spot, Store and Clean the Data You Need

Posted on: Jun 16, 2018


Last August, data science leader Monica Rogati unveiled a new way for entrepreneurs to think about artificial intelligence. Modeled after psychologist Abraham Maslow's five-tier hierarchy of psychological needs, her AI hierarchy of needs has become a conference favorite for illustrating how to incorporate AI into a business.

Despite entrepreneurs' excitement around AI, Rogati's hierarchy makes an uncomfortable point: few companies are ready to adopt AI. Most are still struggling to meet fundamental needs, such as reliable data flow and storage. The truth is that data literacy is lacking at most companies hoping to reap the rewards of AI.

You get out what you put in.

To help entrepreneurs understand the importance of high-quality data, our team has come up with what we call the AI uncertainty principle:

The key takeaway? If any of the values on the right fall to zero, so does the value of the AI program. We discussed evaluating business opportunities for AI in a prior Entrepreneur article, so we're now focusing on the second variable: maximizing data quality.

High-quality data is key across all types of machine learning -- supervised, unsupervised and reinforcement learning. For most businesses, supervised learning is the low-hanging fruit because it's about learning from past examples. If the prior examples are irrelevant or low-quality, then guess what? Any insights derived from them will be, too. Someone without any basketball experience can't just join an NBA team -- at least not if he wants to succeed.

While most data scientists prefer the hardcore math of machine learning over the legwork of cleaning data, you can't have the former without the latter. Data science and engineering go hand in hand, and the right machine learning team will have people who can handle both.

Do more with good data.

No machine learning initiative will work without high-quality data. To get the good, clean data you need:

1. Start with instrumentation.

Machine learning initiatives are as diverse as companies themselves. Think critically about what sort of examples you need to train your algorithm on in order for it to make predictions or recommendations.

For example, an online baby registry we partnered with wanted to project the lifetime value of customers within days of signup. Fortunately for us, it had proactively logged transaction data, including items customers added to their registries, where they were added and when they purchased. Furthermore, the client had logged the entire event stream, rather than just the current state of each registry, to maintain a database record.
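To see why an event stream is more useful than a current-state snapshot, here is a minimal Python sketch. It is an illustration, not the client's actual schema: the field names (`customer`, `item`, `type`, `source`, `at`), the sample events and both helper functions are hypothetical. The point is that the current state can always be rebuilt from the events, while early-behavior features, such as what a customer did in the first days after signup, cannot be recovered from the snapshot alone.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical event log: one immutable row per event, never overwritten.
EVENTS = [
    {"customer": "c1", "item": "stroller", "type": "added",
     "source": "mobile", "at": datetime(2018, 1, 1, 9, 0)},
    {"customer": "c1", "item": "crib", "type": "added",
     "source": "web", "at": datetime(2018, 1, 1, 9, 5)},
    {"customer": "c1", "item": "crib", "type": "removed",
     "source": "web", "at": datetime(2018, 1, 2, 8, 0)},
    {"customer": "c1", "item": "stroller", "type": "purchased",
     "source": "web", "at": datetime(2018, 1, 3, 9, 0)},
]


def current_state(events):
    """Replay events in time order to rebuild each customer's registry."""
    registry = defaultdict(dict)
    for e in sorted(events, key=lambda e: e["at"]):
        items = registry[e["customer"]]
        if e["type"] == "added":
            items[e["item"]] = "on_registry"
        elif e["type"] == "removed":
            items.pop(e["item"], None)
        elif e["type"] == "purchased":
            items[e["item"]] = "purchased"
    return {customer: dict(items) for customer, items in registry.items()}


def early_features(events, customer, signup, days=3):
    """Count a customer's events by type in the first `days` days after signup."""
    counts = defaultdict(int)
    for e in events:
        elapsed_days = (e["at"] - signup).days
        if e["customer"] == customer and 0 <= elapsed_days < days:
            counts[e["type"]] += 1
    return dict(counts)


snapshot = current_state(EVENTS)
features = early_features(EVENTS, "c1", signup=datetime(2018, 1, 1), days=3)
```

In this sample, the snapshot shows only a purchased stroller. The log also records that a crib was added and removed a day later, which is the kind of early signal a lifetime-value model could learn from.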