Better Machine Learning Demands Better Data Labeling
Posted on: Dec 05, 2021

Money can’t buy you happiness (although you can reportedly lease it for a while). It definitely cannot buy you love. And the rumor is money also cannot buy you large troves of labeled data that are ready to be plugged into your particular AI use case, much to the chagrin of former Apple product manager Ivan Lee.

“I spent hundreds of millions of dollars at Apple gathering labeled data,” Lee said. “And even with its resources, we were still using spreadsheets.”

It wasn’t much different at Yahoo. There, Lee helped the company develop the sorts of AI applications that one might expect of a Web giant. But getting the data labeled in the manner required to train the AI was, again, not a pretty sight.

“I’ve been a product manager for AI for the past decade,” the Stanford graduate said in a recent interview. “What I recognized across all these companies was AI is very powerful. But in order to make it happen, behind the scenes, how the sausage was made was we had to get a lot of training data.”

Armed with this insight, Lee founded Datasaur to develop software to automate the data labeling process. Of course, data labeling is an inherently human endeavor, at least at the beginning of an AI project. Toward the middle or end of a project, machine learning itself can be used to label data automatically, and synthetic data can also be generated.

Lee’s main goal with the Datasaur software was to streamline the interaction of human data labelers and to guide them through the process of creating the highest quality training data at the lowest cost. Since the software targets power users who label data all day, the company has added function keys that accelerate the process, among other capabilities befitting a dedicated data labeling system.