Speaker: Zhenxiao Luo



Druid at Twitter: How Big Data Goes Real Time


An important characteristic of Twitter is its real-time nature. Consequently, many of Twitter’s projects need real-time analytics as a platform service. In recent years, a number of teams have adopted Druid as their real-time analytics engine.
In this talk, we will discuss Druid at Twitter, which provides sub-second query performance, real-time ingestion, and ease of use. We start with Twitter’s big data architecture, followed by a detailed introduction to Apache Druid. We will then focus on a number of Druid features developed by Twitter, including Native Indexing support for Hadoop data, LDAP authentication and authorization, data scrubbing, and the Presto Druid Connector. The connector provides complex SQL functionality on Druid data and enables joining data across Hadoop, Druid, Cassandra, Elasticsearch, and other data stores without copying data. We will also share our production experience.


Zhenxiao Luo is a Sr. Staff Engineer leading the Interactive Query Engines team at Twitter, where he focuses on Druid, Presto, Spark, and Hive. Before joining Twitter, Zhenxiao ran the Interactive Analytics team at Uber. He has big data experience at Netflix, Facebook, Cloudera, and Vertica. Zhenxiao is a PrestoDB committer. He holds a master’s degree from the University of Wisconsin-Madison and a bachelor’s degree from Fudan University.