Global Big Data Conference

Industry News Details

Meta researchers build an AI that learns equally well from visual, written or spoken materials Posted on : Jan 21 - 2022

Advances in the AI realm are constantly coming out, but they tend to be limited to a single domain: For instance, a cool new method for producing synthetic speech isn’t also a way to recognize expressions on human faces. Meta (AKA Facebook) researchers are working on something a little more versatile: an AI that can learn capably on its own whether it does so in spoken, written or visual materials.

The traditional way of training an AI model to correctly interpret something is to give it lots and lots (like millions) of labeled examples. A picture of a cat with the cat part labeled, a conversation with the speakers and words transcribed, etc. But that approach is no longer in vogue as researchers found that it was no longer feasible to manually create databases of the sizes needed to train next-gen AIs. Who wants to label 50 million cat pictures? Okay, a few people probably — but who wants to label 50 million pictures of common fruits and vegetables?

Register: GLOBAL ARTIFICIAL INTELLIGENCE VIRTUAL CONFERENCE Mar 17th to 19 th , 2022

Currently some of the most promising AI systems are what are called self-supervised: models that can work from large quantities of unlabeled data, like books or video of people interacting, and build their own structured understanding of what the rules are of the system. For instance, by reading a thousand books it will learn the relative positions of words and ideas about grammatical structure without anyone telling it what objects or articles or commas are — it got it by drawing inferences from lots of examples.

This feels intuitively more like how people learn, which is part of why researchers like it. But the models still tend to be single-modal, and all the work you do to set up a semi-supervised learning system for speech recognition won’t apply at all to image analysis — they’re simply too different. That’s where Facebook/Meta’s latest research, the catchily named data2vec, comes in.

The idea for data2vec was to build an AI framework that would learn in a more abstract way, meaning that starting from scratch, you could give it books to read or images to scan or speech to sound out, and after a bit of training it would learn any of those things. It’s a bit like starting with a single seed, but depending on what plant food you give it, it grows into an daffodil, pansy or tulip. View more

Get the