Speaker: Chris Fregly



Using AWS SageMaker, Kubernetes, and PipelineAI for High Performance, Hybrid-Cloud Distributed TensorFlow Model Training and Serving with GPUs.


In this talk, I will demonstrate how to train, optimize, and serve distributed machine learning models across various environments including the following:

1) Local Laptop

2) Kubernetes Cluster (Running Anywhere)

3) AWS's New SageMaker Service

I'll also present some post-training model-optimization techniques to improve model serving performance for TensorFlow running on GPUs. These techniques include 16-bit model training, neural network layer fusing, and 8-bit weight quantization.
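As a rough illustration of the last technique, 8-bit weight quantization maps each floating-point weight onto one of 256 integer codes, plus a shared scale and offset used to recover an approximate value at serving time. The sketch below is a minimal, framework-free version under that assumption. The function names `quantize_weights` and `dequantize_weights` and the sample weights are hypothetical, and this is not the implementation used by TensorFlow, TensorFlow Lite, or TensorRT.

```python
def quantize_weights(weights):
    """Map float weights to integer codes in [0, 255] (hypothetical helper)."""
    w_min, w_max = min(weights), max(weights)
    # One step of the 8-bit grid; fall back to 1.0 if all weights are equal.
    scale = (w_max - w_min) / 255 or 1.0
    codes = [min(255, max(0, round((w - w_min) / scale))) for w in weights]
    return codes, scale, w_min


def dequantize_weights(codes, scale, w_min):
    """Recover approximate float weights from the 8-bit codes."""
    return [c * scale + w_min for c in codes]


# Made-up sample weights, for illustration only.
weights = [-1.2, -0.4, 0.0, 0.35, 0.9, 2.1]
codes, scale, w_min = quantize_weights(weights)
restored = dequantize_weights(codes, scale, w_min)

# The round-trip error per weight is at most half a quantization step.
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(codes, max_error <= scale / 2 + 1e-9)
```

Each weight is stored in one byte instead of four, at the cost of a bounded rounding error per weight. Production toolchains typically add refinements such as per-channel scales and calibration data.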

Lastly, I'll discuss alternate runtimes for TensorFlow on GPUs, including TensorFlow Lite and Nvidia's TensorRT.


Chris Fregly is Founder and Research Engineer at PipelineAI, a real-time Machine Learning and Artificial Intelligence startup based in San Francisco.

He is also an Apache Spark Contributor, a Netflix Open Source Committer, founder of the Global Advanced Spark and TensorFlow Meetup, and author of the O’Reilly Training and Video Series titled "High Performance TensorFlow in Production."

Previously, Chris was a Distributed Systems Engineer at Netflix, a Data Solutions Engineer at Databricks, and a Founding Member and Principal Engineer at the IBM Spark Technology Center in San Francisco.