Speaker "Alok Aggarwal" Details
-
Name
Alok Aggarwal
-
Company
Scry Analytics
-
Designation
CEO
Topic
Building End-To-End Solutions for Big Data Science Problems
Abstract
A substantial share of the time and money that startup companies spend in the Big Data Science area goes toward building databases (e.g., Tokutek), making programming faster (e.g., Splice Machine), or improving underlying algorithms (e.g., H2O or 0xdata). All of this is extremely important, since it provides the required “plumbing” for development platforms in this area and for growing the overall ecosystem. However, the eventual aim of Big Data Science and Analytics is to improve an organization's key performance indicators, which include cash flow, timeliness, quality, customer experience, and parameters related to compliance and risk.
In this technical session, we discuss how to build end-to-end solutions in Big Data Science. As an example, we take the problem of improving the overall efficiency of a typical paper mill (all the way from “taking in paper pulp” to “providing finished paper” to the end-clients) and discuss, step by step, how such a solution may be built. We also go into detail on the four steps required to design a typical solution. The first step involves creating a data cleansing and munging module (using Extract-Transform-Load and machine learning algorithms) for harmonizing data. The second step consists of building math, statistics and computer science algorithms for the descriptive, predictive and prescriptive analytics modules. The third step consists of connecting these modules together. The fourth and final step consists of ensuring that the entire solution runs on an appropriate operating system and that bugs can be fixed quickly. This session assumes a basic knowledge of open source algorithms and systems such as Linux, Java, Hadoop, Spark, Mahout, and Storm. An understanding of math, computer science and statistics algorithms will also be helpful.
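As a rough illustration of the first three steps, the sketch below wires a cleansing module to small descriptive, predictive and prescriptive modules in plain Python. It is only a toy under stated assumptions, not the solution presented in the session: the record fields (`pulp_kg`, `paper_kg`), the sample values, and every function name are hypothetical, and a least-squares line stands in for whatever algorithms a real paper-mill solution would use. The fourth step (deployment and maintenance) is not covered.

```python
from statistics import mean

# Hypothetical raw records from a paper mill. Field names and values are
# illustrative assumptions, not data from the talk.
RAW_RECORDS = [
    {"pulp_kg": "1000", "paper_kg": "900"},
    {"pulp_kg": "2000", "paper_kg": "1800"},
    {"pulp_kg": "n/a", "paper_kg": "500"},  # unparseable, dropped in step 1
    {"pulp_kg": "3000", "paper_kg": "2700"},
]


def cleanse(records):
    """Step 1: keep only rows whose fields parse as numbers."""
    clean = []
    for row in records:
        try:
            clean.append((float(row["pulp_kg"]), float(row["paper_kg"])))
        except (KeyError, ValueError):
            continue  # skip rows with missing or non-numeric fields
    return clean


def describe(rows):
    """Step 2 (descriptive): average yield, paper out per unit of pulp in."""
    return mean(paper / pulp for pulp, paper in rows)


def fit_line(rows):
    """Step 2 (predictive): least-squares fit of paper = slope * pulp + intercept."""
    xs = [pulp for pulp, _ in rows]
    ys = [paper for _, paper in rows]
    mx, my = mean(xs), mean(ys)
    slope = sum((x - mx) * (y - my) for x, y in rows) / sum((x - mx) ** 2 for x in xs)
    return slope, my - slope * mx


def prescribe(slope, intercept, target_paper_kg):
    """Step 2 (prescriptive): pulp needed to reach a target paper output."""
    return (target_paper_kg - intercept) / slope


def run_pipeline(records, target_paper_kg):
    """Step 3: connect the modules into one end-to-end flow."""
    rows = cleanse(records)
    slope, intercept = fit_line(rows)
    return {
        "rows_used": len(rows),
        "avg_yield": describe(rows),
        "pulp_needed_kg": prescribe(slope, intercept, target_paper_kg),
    }


if __name__ == "__main__":
    print(run_pipeline(RAW_RECORDS, target_paper_kg=4500))
```

Keeping each module a plain function with a narrow input and output makes the third step, connecting them, a short composition in `run_pipeline`, and lets any one module be swapped for a heavier implementation without touching the others.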