
Speaker "Stephen O'sullivan" Details

-
Name
Stephen O'sullivan
-
Company
Data Whisperers
-
Designation
Founder
Topic
How I Learned to Stop Worrying and Love the Data Scientists
Abstract
Brief Abstract: So how much data engineering should a Data Scientist know? To get to the fun part of their job, Data Scientists normally have to do a bit of data engineering first: onboarding data and doing a little bit of “wrangling” before they reach the fun part - the Data Science! In most cases this preparatory work takes 50%-80% of their time.
Detailed Abstract: Then comes handing the work over to the Data Engineering team to put it into production (via dev, test, and QA, of course). This is when a “little bit” of contention happens, because in most cases the Data Engineering team has to do “some” modification/rewriting/head shaking/hand wringing to make the code production ready and meet the SLAs defined by the business. There is a disconnect in how Data Scientists and Data Engineers develop code and models (I get a front row seat to this all the time). In this talk I’ll take the Data Scientist on a journey: onboarding data, and how different data/object stores can help; understanding and choosing the right data format for the data assets; exploring some different query engines, and some basic query tuning for each; explaining how a distributed streaming platform works, and how you can take advantage of it; and lastly, covering some good coding practices. This will give Data Scientists new skills to help them be more productive, so that they can get to the fun part faster! It should also reduce the contention with the Data Engineering team, and make them say - “How I Learned to Stop Worrying and Love the Data Scientists”!