Speaker "Joy Chakraborty" Details
Name
Joy Chakraborty
Company
Bloomberg L.P.
Designation
Big Data Architect
Topic
To Secure and Scale-Out Data Science Notebook for Spark using Docker and Kubernetes
Abstract
This presentation provides technical design and development insights for setting up a Kerberized (secured) JupyterHub notebook for Spark running on a YARN cluster. Joy will show how Bloomberg set up the Kerberos-based notebook for its data science community using Docker and Kubernetes by integrating JupyterHub, Sparkmagic, and Livy. Sparkmagic provides Spark kernels for R, Scala, and Python. Livy is one of the most promising open source projects for submitting Spark jobs over an HTTP-based REST interface. The presentation will highlight the capabilities of JupyterHub, Sparkmagic, and Livy, along with the gaps and the development work required to make the notebook work with a Kerberized HDFS/YARN cluster running Hive, Spark, and other services. It will also cover how Docker and Kubernetes support the scale-out design and minimize the complex integration challenges around networking and isolation, which are essential for such a project. No prior knowledge of any of these technologies is required to understand this presentation.
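
For readers unfamiliar with Livy, the following is a minimal sketch of what submitting Spark code over its REST interface can look like. It is an illustration only, not the setup described in the talk. The host name `livy.example.com` and the helper names are hypothetical; the `/sessions` and `/sessions/{id}/statements` endpoints and the default port 8998 come from the Livy REST API. The requests are built but not sent, and the sketch leaves out authentication; a Kerberized deployment would typically also need SPNEGO negotiation.

```python
import json
from urllib.request import Request

# Hypothetical Livy endpoint; 8998 is Livy's default port.
LIVY_URL = "http://livy.example.com:8998"


def build_session_request(kind="pyspark", base_url=LIVY_URL):
    """Build (but do not send) the POST that creates an interactive Livy session."""
    body = json.dumps({"kind": kind}).encode("utf-8")
    return Request(
        f"{base_url}/sessions",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def build_statement_request(session_id, code, base_url=LIVY_URL):
    """Build (but do not send) the POST that runs a code snippet in a session."""
    body = json.dumps({"code": code}).encode("utf-8")
    return Request(
        f"{base_url}/sessions/{session_id}/statements",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    # Show the two requests a notebook client would issue, in order.
    for req in (
        build_session_request(),
        build_statement_request(0, "spark.range(10).count()"),
    ):
        print(req.get_method(), req.full_url, req.data.decode("utf-8"))
```

In a notebook setup like the one the abstract describes, a client such as Sparkmagic issues requests of this kind on the user's behalf, so the notebook user does not write them by hand.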