Speaker "Sushanth Sowmyan" Details
-
Name
Sushanth Sowmyan
-
Company
Hortonworks Inc.
-
Designation
Software Engineer
Topic
Enabling cross-datastore replication with Hive
Abstract
Replication is an important feature of most mature database systems. It is used for disaster recovery and cluster load balancing, and sometimes to isolate clusters for security reasons. We go over the mechanism being developed: rather than baking replication into Hive itself, we have chosen to add an event-based replication capability to Hive, so that other tools such as Falcon can plug in to implement replication. This gives admin- and user-facing tools like Falcon fine control over what they replicate and how, as defined by their users, while leaving delta, data, and metadata management to Hive itself. The result is "loose, but powerful" replication of data and metadata. In theory, this approach can also enable third-party data management and movement solutions that want to integrate their own warehouse or replication systems with Hive through this mechanism.
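As a rough illustration of the event-based idea described in the abstract, the sketch below models a source that records an ordered event for each change and an external tool that pulls those events and decides which ones to apply. It is not Hive's or Falcon's actual API: every name in it (`SourceWarehouse`, `Replicator`, `events_since`, the event kinds) is hypothetical and chosen only for this example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Set, Tuple


@dataclass(frozen=True)
class Event:
    """One recorded change on the source (hypothetical shape)."""
    event_id: int
    kind: str                 # "CREATE_TABLE", "ADD_PARTITION" or "DROP_TABLE"
    table: str
    payload: Tuple[str, ...] = ()


class SourceWarehouse:
    """Hypothetical source side: keeps an ordered log of change events."""

    def __init__(self) -> None:
        self.events: List[Event] = []

    def record(self, kind: str, table: str, payload=()) -> Event:
        # Event ids increase monotonically, so consumers can track a position.
        ev = Event(len(self.events) + 1, kind, table, tuple(payload))
        self.events.append(ev)
        return ev

    def events_since(self, last_id: int) -> List[Event]:
        return [e for e in self.events if e.event_id > last_id]


class Replicator:
    """Hypothetical external tool: chooses what to replicate via `wants`."""

    def __init__(self, source: SourceWarehouse,
                 wants: Callable[[Event], bool]) -> None:
        self.source = source
        self.wants = wants
        self.target: Dict[str, Set[str]] = {}   # table name -> partitions
        self.last_id = 0                        # last event id already seen

    def sync(self) -> int:
        """Apply all wanted events newer than last_id; return how many."""
        applied = 0
        for ev in self.source.events_since(self.last_id):
            if self.wants(ev):
                self._apply(ev)
                applied += 1
            # Advance past skipped events too, so they are not re-read.
            self.last_id = ev.event_id
        return applied

    def _apply(self, ev: Event) -> None:
        if ev.kind == "CREATE_TABLE":
            self.target.setdefault(ev.table, set())
        elif ev.kind == "ADD_PARTITION":
            self.target.setdefault(ev.table, set()).update(ev.payload)
        elif ev.kind == "DROP_TABLE":
            self.target.pop(ev.table, None)


if __name__ == "__main__":
    src = SourceWarehouse()
    src.record("CREATE_TABLE", "sales")
    src.record("ADD_PARTITION", "sales", ["2015-01"])
    src.record("CREATE_TABLE", "scratch")

    # The policy (replicate only "sales") lives in the tool, not the source.
    rep = Replicator(src, wants=lambda e: e.table == "sales")
    rep.sync()
    print(rep.target)
```

The point of the split is that the source only has to record what changed, while the replication policy (which tables, when to sync) stays in the external tool.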