
Speaker "Paige Roberts" Details

 

Topic

1. Unifying Analytics - Making Production Data Accessible for Data Science at Scale

2. Workshop: In-Database Machine Learning with Jupyter

Abstract

The data warehouse has been an analytics workhorse for decades. Unprecedented volumes of data, new types of data, and the need for advanced analyses like machine learning brought on the age of the data lake. Now, many companies have a data lake, a data warehouse, or a mishmash of both, possibly combined with a mandate to go to the cloud. The end result can be a sprawling mess: duplicated effort, missed opportunities, data science projects that never make it into production, and financial investment without return.
 
Technical and spiritual unification of the two opposed camps can make a powerful impact on the effectiveness of analytics for the business overall. Over time, different organizations with massive workloads like IoT have found practical ways to bridge the artificial gap between these two data management strategies.
 
Look under the hood at how companies have gotten high-scale machine learning projects working, and how their data architectures have changed over time. Learn about new architectures that let data scientists use clean production data while still working with the many massive, highly variant data sets they need, at the scale today’s use cases require.
 
Learn:
 
How successful data architectures look at companies like Philips, Anritsu, Uber, …
How to eliminate duplication of effort between data science and BI data engineering teams
How to avoid some of the traps that have caused so many big data analytics implementations to fail
How to get AI and ML projects into production, where they have real impact, without bogging down essential BI
Which analytics architectures work, why and how they work, and where they’re going from here

Profile

In 23 years in the data management industry, Paige Roberts has worked as an engineer, a trainer, a support technician, a technical writer, a marketer, a product manager, and a consultant. She has built data engineering pipelines and architectures, documented and tested open source analytics implementations, spun up Hadoop clusters, picked the brains of stars in data analytics and engineering, worked across many industries, and questioned a lot of assumptions. Now, she promotes understanding of Vertica, MPP data processing, open source, high-scale data engineering, and how the analytics revolution is changing the world.