Event Speaker - 5th Annual Global Big Data Conference

Name

Wangda Tan

Company

Hortonworks

Designation

Staff Software Engineer

Topic

Hadoop ecosystem boosts TensorFlow and machine learning technologies

Abstract

TensorFlow™ is a popular open source software library for machine intelligence. While TensorFlow (TF) lets people express the latest machine learning and deep learning algorithms, it is also important that TF fit well into the Hadoop ecosystem. In this session, we will talk about how Hadoop ecosystem components boost TF and other machine learning technologies, including:

- Using Hadoop YARN to manage large-scale TF services running on a GPU-equipped cluster, sharing that cluster with other tenants and applications.
- Using Spark/Hive for large-scale data preprocessing.
- Using Zeppelin as an interactive interface to orchestrate and visualize the learning workflow.

Finally, we will use a classic machine learning challenge, online advertising click-through rate (CTR) prediction, as an example to show how TF works with YARN, Spark, and Zeppelin to train a better model efficiently.

Profile

Wangda Tan is a Project Management Committee (PMC) member of Apache Hadoop and a Staff Software Engineer at Hortonworks. His main area of work is the Hadoop YARN resource scheduler, where he has contributed to features such as node labeling, resource preemption, and container resizing. Before joining Hortonworks, he worked at Pivotal on integrating OpenMPI and GraphLab with Hadoop YARN. Before that, he worked at Alibaba, where he helped create a large-scale machine learning, matrix, and statistics computation platform using MapReduce and MPI.