Speaker "Arush Kharbanda" Details

Topic

Implementing High Performance Analytics with Apache Spark Streaming

Abstract

In this presentation we showcase ways to implement a high-performance, fault-tolerant, near-real-time streaming system using Apache Spark Streaming. Apache Spark Streaming provides a platform for building near-real-time streaming systems, but scaling such a system presents many challenges. When deploying Apache Spark Streaming in a production environment, one encounters various system bottlenecks and failures. This talk addresses ways to overcome network, memory, and processing bottlenecks to achieve optimal performance, illustrated through a case study of an ad network customer for whom we process 30 TB of data per day.

Profile

Arush works as a Technical Team Lead at Sigmoid Analytics. Sigmoid Analytics provides an end-to-end framework that uses Apache Spark to process large amounts of data in real time. The company is an active contributor to the Apache Pig community through its project Pig on Apache Spark. Arush has more than two years of experience with big data systems using Apache Spark.