Speaker "Emad Barsoum" Details

Topic

Productizing ML and Deep Learning through ONNX.

Abstract

Deploying a trained ML or DNN model to production while targeting varied hardware and latency requirements is challenging. Data scientists use a plethora of training frameworks; some have an inference runtime and others don’t. Most data scientists use multiple frameworks simultaneously, which makes deployment to production very costly. Furthermore, having multiple frameworks, each with a different architecture, makes it difficult for hardware vendors to optimize their hardware for AI workloads.
To address the challenges of productizing ML and DNN workloads, we created the ONNX (Open Neural Network Exchange) standard and ONNX Runtime. ONNX is an open standard format that can represent most ML and DNN models. It was started by Microsoft and Facebook, and now includes a large number of participants, from large companies to individuals. To succeed as an industry standard, a common format isn’t enough, which is why we built an ecosystem around ONNX. We have converters from the most widely used frameworks to the ONNX format, and test coverage for anyone interested in implementing their own converter. We have debugging and visualization tools. And multiple hardware vendors support ONNX.
In addition, we implemented a fast ONNX runtime that runs on multiple hardware targets. ONNX Runtime is a very efficient and flexible runtime created and open sourced by Microsoft. It is designed from the ground up to support multiple backends and the ability to mix and match between them. You can also add a custom OP if it isn’t covered by the ONNX standard. ONNX Runtime is a universal runtime for deep learning and machine learning models: you can train with whatever framework you prefer and deploy to ONNX Runtime. ONNX Runtime supports graph optimization techniques such as OP fusion, sub-expression elimination, constant folding, graph partitioning, and more.
In this talk, we will discuss the architecture of ONNX Runtime, its various backends, and how to turn its multiple optimization techniques on or off. Finally, we will discuss the roadmap of ONNX and ONNX Runtime and how you can participate.

Who is this presentation for?
Any product group that wants to move ML and deep learning research to production.

Prerequisite knowledge:
Basic knowledge of ML and deep learning.

What you'll learn:
The Open Neural Network Exchange (ONNX) format, why it was created, and what problems it solves.

Profile

Emad Barsoum is an Architect on the Microsoft AI Platform team. He leads the deep learning framework effort at Microsoft and helps drive Microsoft's strategy in AI. Prior to that, Emad was a Principal SDE and Applied Researcher in the Advance Technology Group at Microsoft Research. He was one of the core developers and researchers behind the Emotion Recognition algorithm used in MS Cognitive Service for both still images and video. Before that, he was one of the main Architects for the NUI API on Xbox One, and the tech lead for the depth reconstruction pipeline for Kinect v2. His current research focuses on computer vision and deep learning algorithms, especially in the areas of activity detection/recognition and unsupervised learning. He has given numerous internal and external talks on Deep Learning and Computer Vision. He received his M.S. degree from U.C. Irvine and his doctoral degree from Columbia University.