The Architecture of Decoupling Compute and Storage with Open Source Alluxio


As Spark, MapReduce, and many other frameworks are widely deployed in enterprise production, an efficient and flexible compute and storage architecture has become a hot topic of debate among both IT and LOB practitioners. Although there are good reasons to run compute in a traditional hyper-converged environment as part of a data lake implementation, decoupling storage from computation is becoming increasingly popular, as O’Reilly points out in its recent 2017 trend post. For example, teams from Alluxio, IBM, Huawei, EMC, and Red Hat have joined together to examine real-world application examples and provide joint solutions. In this presentation, we will share the decision factors and considerations, such as application workload patterns, data locality, infrastructure cost, network bandwidth, and cloud deployment. We will also share production best practices and solutions for making the best use of CPUs, memory, and the different tiers of disaggregated compute and storage systems to build a multi-tenant, high-performance platform that addresses real-world business demands.


Yupeng Fu is a founding member and Senior Architect at Alluxio Inc. He is also a PMC member of the Alluxio open source project. Prior to Alluxio, Yupeng worked at Google and Palantir, building data analytics platforms. Yupeng graduated from Tsinghua University with a BS and an MS, and did his PhD research in databases at UC San Diego.