Speaker "John Canny" Details



Large-Scale Machine Learning


Machine Learning at the Limit
John Canny, UC Berkeley

How fast can machine learning (ML) and graph algorithms be? In "roofline" design, every kernel is driven toward the limits imposed by CPU, memory, network, etc. "Codesign" pairs efficient algorithms with complementary hardware. These methods can lead to dramatic improvements in single-node performance: BIDMach is a toolkit for machine learning that uses rooflined design and GPUs to achieve improvements of one to three orders of magnitude over other toolkits on single machines. These speedups are typically larger than those reported for *cluster* systems running on hundreds of nodes for common ML tasks. An open challenge is to exploit rooflined single nodes in clusters. The optimal communication rates are right at the network limits, and communication design is itself a rooflined design problem. We describe two solutions that are optimal for small and large models, respectively. "Butterfly mixing" is an efficient, simple, and fault-tolerant approach to distributed ML with small models that are replicated on each node. "Kylix" is an optimal approach for large, sparse, and possibly distributed models. We can show that Kylix approaches the roofline limits for sparse Allreduce, and empirically it holds the record for distributed PageRank.
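As a rough illustration of the "roofline" idea mentioned in the abstract, the sketch below uses the standard roofline bound: a kernel's attainable throughput is the smaller of the machine's peak compute rate and its memory bandwidth multiplied by the kernel's arithmetic intensity (flops per byte moved). The function name, machine figures, and kernel intensities are all hypothetical values chosen for illustration. They are not taken from the talk or from BIDMach.

```python
def attainable_gflops(peak_gflops: float, bandwidth_gb_s: float,
                      flops_per_byte: float) -> float:
    """Standard roofline bound: the lower of the compute roof and the
    bandwidth roof (bandwidth times arithmetic intensity)."""
    return min(peak_gflops, bandwidth_gb_s * flops_per_byte)


# Hypothetical machine, for illustration only.
PEAK_GFLOPS = 1000.0     # assumed peak compute rate
BANDWIDTH_GB_S = 100.0   # assumed memory bandwidth

# Hypothetical kernels with made-up arithmetic intensities (flops per byte).
KERNELS = [
    ("low-intensity kernel (e.g. a sparse operation)", 0.25),
    ("high-intensity kernel (e.g. a dense operation)", 50.0),
]

for name, intensity in KERNELS:
    bound = attainable_gflops(PEAK_GFLOPS, BANDWIDTH_GB_S, intensity)
    regime = "memory-bound" if bound < PEAK_GFLOPS else "compute-bound"
    print(f"{name}: at most {bound:.1f} GFLOP/s ({regime})")
```

Under these assumed numbers, the low-intensity kernel is capped by memory bandwidth well below peak compute. This is why a rooflined design measures each kernel against the resource that actually limits it, rather than against peak flops alone.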


Bio

John Canny is a professor of computer science at UC Berkeley. He is an ACM dissertation award winner and a Packard Fellow. He is currently a Data Science Senior Fellow in Berkeley's new Institute for Data Science and holds an INRIA (France) International Chair. Since 2002, he has been developing and deploying large-scale behavioral modeling systems. He designed and prototyped production systems for Yahoo, eBay, and Quantcast. He currently works on several applications of data mining for human learning, health and well-being, and applications in the sciences.