Speaker "Maximo Gurmendez" Details

Topic

Productizing Machine Learning over Big Data with AWS tools. 

Abstract

In this presentation we will focus on the problem of serving machine learning models that are trained over very large data sets (terabytes and above). In particular, we will show how some AWS tools, including Apache Spark on EMR and SageMaker, can support this process. We will use notebooks to illustrate the ideas with real business use cases, and we will share some success stories behind the development of smart data products, along with the lessons learned. Throughout the presentation we will try to address some of these questions:
 
·  What do we do when our training takes too long, or is too expensive?
 
·  Are "deployable notebooks" a good idea?
 
·  How can we integrate big data tools such as EMR/Spark with ML services such as SageMaker?
 
·  Why are model serving endpoints not enough?
 

Profile

Maximo holds a master's degree in computer science/AI from Northeastern University, where he attended as a Fulbright Scholar. As Chief Engineer of Montevideo Labs, he leads data science engineering projects for complex systems in large US companies. He is an expert in big data technologies and co-author of the popular book "Mastering Machine Learning on AWS". Additionally, Maximo is a computer science professor at the University of Montevideo and is director of its data science for business program.