Global Big Data Conference

Understanding the real capabilities of unsupervised machine learning

Posted on: Dec 07, 2017

“Hey, Siri. What is the capital of New York?” We all know what happens next — Siri provides the answer. How Siri knows the correct answer is not a mystery (we have the internet to thank for that), but what is more interesting is the fact that Siri is able to understand the question at all.

Siri can understand and respond to human speech for the same reason Facebook knows which friend to tag in a photo before you even type their name. This “knowledge” comes from a technology called machine learning.

Trained machine learning

There are two types of machine learning: trained (supervised) and untrained (unsupervised). Most of us experience trained machine learning in our everyday lives, from weather forecasts and sports outcome predictions to Siri and Facebook. These examples are considered trained machine learning because the machine learns from examples in which both the input and the correct output are provided.

Trained machine learning takes one of two forms: classification or regression. In classification, the machine predicts discrete responses, such as whether an email is spam or legitimate. After seeing enough emails that people have sorted by hand, the machine begins to learn the distinction. It uses the information it has collected over time (input data) to determine the outcome, which must be one of the categories in the output data.
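To make the spam example concrete, here is a minimal sketch in Python of a word-counting classifier (a simple naive Bayes approach, chosen here for illustration and not named in the article). The six training messages, the labels and the function names are all hypothetical; a real filter would learn from far more data.

```python
import math
from collections import Counter

# Hypothetical hand-labeled messages: (input text, output label).
TRAINING = [
    ("win free money now", "spam"),
    ("free prize claim now", "spam"),
    ("cheap money offer", "spam"),
    ("meeting agenda for monday", "legitimate"),
    ("lunch on monday", "legitimate"),
    ("project agenda and notes", "legitimate"),
]

def train(examples):
    """Count how often each word appears under each label."""
    word_counts = {}
    label_counts = Counter()
    for text, label in examples:
        label_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(text.split())
    return word_counts, label_counts

def classify(text, word_counts, label_counts):
    """Return the label with the highest smoothed log-probability."""
    vocab = {w for counts in word_counts.values() for w in counts}
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, counts in word_counts.items():
        # Start from how common the label is, then add evidence per word.
        score = math.log(label_counts[label] / total)
        denom = sum(counts.values()) + len(vocab)
        for word in text.split():
            # Add-one smoothing keeps unseen words from zeroing the score.
            score += math.log((counts[word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

word_counts, label_counts = train(TRAINING)
print(classify("free money", word_counts, label_counts))
print(classify("monday meeting notes", word_counts, label_counts))
```

Whatever message it is given, the classifier can only answer with a label it saw during training, which is the sense in which the outcome must fall among the output data.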

In regression, the machine predicts continuous responses. We see this form of trained machine learning in stock market predictions. Imagine you were asked to determine the missing number in this sequence: 3-9, 4-16, 5-25, 8-?. What would you say? Your answer would likely be 64, and if so, you would be correct. It’s safe to assume you came to that conclusion by studying the sequence and recognizing that each number was followed by its square. You determined the outcome by studying a sequence and identifying a pattern.
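The same puzzle can be handed to a machine. The sketch below, in Python, assumes the relationship has the form y = a * x**b and estimates a and b from the three known pairs with an ordinary least-squares line fitted in log-log space. That model form and the function name fit_power_law are assumptions made for this illustration; the article does not prescribe a method.

```python
import math

# The known pairs from the sequence: each number and the number that follows it.
pairs = [(3, 9), (4, 16), (5, 25)]

def fit_power_law(points):
    """Least-squares fit of y = a * x**b using a straight line in log-log space."""
    xs = [math.log(x) for x, _ in points]
    ys = [math.log(y) for _, y in points]
    n = len(points)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Standard simple linear regression: slope = covariance / variance.
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), slope

a, b = fit_power_law(pairs)
prediction = a * 8 ** b
print(round(b, 3), round(prediction, 3))
```

Up to floating-point rounding, the fitted exponent comes out as 2 and the prediction for 8 as 64. Note that 64 never appears in the training pairs: a regression can return a value it has not seen before, as long as it follows the pattern learned from the known outputs.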

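In the case of both classification and regression, the machine uses input data to determine the output, guided by the known outputs it was given: a classification must choose one of the provided categories, while a regression follows the pattern learned from the provided values.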

For a more relatable example, let’s look at the way Facebook suggests users to tag in your photos. Facebook does not know what you or your friends look like; it simply collects data from previously tagged photos and “learns,” by repetition, how to identify each person. The more photos someone is tagged in, the more likely Facebook is to make an accurate suggestion. The more input data a machine is fed, the more accurate the outcomes it can deliver.
