
Speaker "Carol Mcdonald" Details

Name: Carol Mcdonald
Company: MapR
Designation: Software Architect
Topic
* Streaming Design Patterns, Revolutionizing Architectures using the Kafka API
* Build a Time Series Application With Apache APIs: Kafka, Spark Streaming and HBase
* Spark GraphX
* Spark Machine Learning
Abstract
Streaming Design Patterns, Revolutionizing Architectures using the Kafka API
Building a robust, responsive, secure data service for healthcare is tricky. For starters, healthcare data lends itself to multiple models:
• Document representation for patient profile view or update
• Graph representation to query relationships between patients, providers, and medications
• Search representation for advanced lookups
Keeping these different systems up to date requires an architecture that can synchronize them in real time as data is updated. Furthermore, meeting audit requirements in healthcare requires the ability to apply granular cross-datacenter replication policies to data and to provide detailed lineage information for each record. This session will describe how stream-first architectures can solve these challenges and look at how this has been implemented at a Health Information Network provider.
This talk will go over the Kafka API with these design patterns:
• Turning the database upside down
• Event Sourcing, Command Query Responsibility Segregation (CQRS), Polyglot Persistence
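The patterns listed above can be illustrated with a minimal sketch in plain Python. It is not taken from the talk: an in-memory list stands in for a Kafka topic, and the event types, field names, and view classes are all hypothetical. The idea it shows is that writes only append events to a log, while each read model (a document view and a graph view here) is rebuilt independently by replaying that log.

```python
from collections import defaultdict

# Append-only event log. Assumption: this list stands in for a Kafka topic.
event_log = []


def publish(event):
    """Command side: record what happened instead of updating state in place."""
    event_log.append(event)
    return len(event_log) - 1  # offset of the event just appended


class DocumentView:
    """Read model 1: one profile document per patient (hypothetical schema)."""

    def __init__(self):
        self.profiles = {}

    def apply(self, event):
        if event["type"] == "PatientUpdated":
            self.profiles.setdefault(event["patient"], {}).update(event["fields"])


class GraphView:
    """Read model 2: edges from a patient to providers and medications."""

    def __init__(self):
        self.edges = defaultdict(set)

    def apply(self, event):
        if event["type"] == "Prescribed":
            self.edges[event["patient"]].add(("provider", event["provider"]))
            self.edges[event["patient"]].add(("medication", event["medication"]))


def replay(view, log, from_offset=0):
    """Query side: build or catch up a view by consuming the log in order."""
    for event in log[from_offset:]:
        view.apply(event)
    return view


publish({"type": "PatientUpdated", "patient": "p1", "fields": {"name": "Ann"}})
publish({"type": "Prescribed", "patient": "p1", "provider": "dr7", "medication": "m3"})
publish({"type": "PatientUpdated", "patient": "p1", "fields": {"city": "Oslo"}})

docs = replay(DocumentView(), event_log)
graph = replay(GraphView(), event_log)
```

Because the log is the source of truth, a new view such as a search index could be added later and populated by replaying from offset 0, and the ordered log itself serves as a lineage record for each patient.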
Build a Time Series Application with Spark Streaming and HBase
More and more applications have to store and process time series data; Internet of Things (IoT) applications are a very good example.
This hands-on tutorial will help you get a jump-start on scaling distributed computing by taking an example time series application and coding through different aspects of working with such a dataset. We will cover building an end-to-end distributed processing pipeline using various distributed stream input sources, Apache Spark, and Apache HBase, to rapidly ingest, process, and store large volumes of high-speed data.
Participants will use Scala to work on exercises intended to teach them the features of Spark Streaming for processing live data streams ingested from sources like Apache Kafka, sockets or files, and storing the processed data in HBase.
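As a rough illustration of the ingest, process, and store steps described above, the following sketch uses plain Python rather than Scala and Spark Streaming, and a dictionary rather than HBase. The input format (`sensor,timestamp,value`), the 60-second window, and the `sensor_windowstart` row key are assumptions made for the example, not details from the tutorial.

```python
from collections import defaultdict

WINDOW_SECONDS = 60  # hypothetical window length


def parse(line):
    """Parse one 'sensor,timestamp,value' record (assumed input format)."""
    sensor, ts, value = line.split(",")
    return sensor, int(ts), float(value)


def window_start(ts):
    """Align a timestamp to the start of its fixed-size window."""
    return ts - ts % WINDOW_SECONDS


def aggregate(lines):
    """Group readings by (sensor, window) and compute summary statistics."""
    buckets = defaultdict(list)
    for line in lines:
        sensor, ts, value = parse(line)
        buckets[(sensor, window_start(ts))].append(value)

    # The dict stands in for an HBase table; a composite row key keeps one
    # sensor's windows adjacent and ordered by time within that sensor
    # (assuming fixed-width timestamps).
    table = {}
    for (sensor, start), values in buckets.items():
        table[f"{sensor}_{start}"] = {
            "min": min(values),
            "max": max(values),
            "avg": sum(values) / len(values),
            "count": len(values),
        }
    return table


readings = ["pump1,120,10.0", "pump1,150,14.0", "pump1,185,20.0", "pump2,130,5.0"]
stats = aggregate(readings)
```

In a streaming job the same grouping would run once per micro-batch, with each batch's rows written to the store as they are produced.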
Use Apache Spark GraphX to Analyze Flight Data
- Describe GraphX
- Define a property graph
- Perform operations on graphs
- Lab: Apply graph operations
Use Apache Spark MLlib to Predict Flight Delays
- Describe Spark MLlib
- Describe a generic classification workflow
- Describe common terms for supervised learning
- Use a decision tree for classification
- Lab: Create a DecisionTree model to predict flight delays on streaming data
Profile
Carol Mcdonald: Carol is an HBase and Hadoop instructor at MapR. She has extensive experience as a software developer and architect, building complex mission-critical applications in the banking, health insurance, and telecom industries. Carol has over 15 years of experience working with Java and Java Enterprise technologies in many roles of the software development life cycle, including design, development, and technology evangelism. As a Java Technology Evangelist at Sun Microsystems, Carol traveled worldwide, speaking at Sun Tech Days, JUGs, companies, and conferences. Previously in her career, Carol was a software developer for Shaw Systems, Hoffmann-La Roche, and Digital Equipment Corporation. Carol holds a BS in Geology from Vanderbilt University and an MS in Computer Science from the University of Tennessee-Knoxville.