Samsung Invests in Cray Supercomputer for Deep Learning Initiatives

Posted on: Nov 16, 2017

One of the reasons this year’s Supercomputing Conference (SC) is nearing attendance records has far less to do with traditional scientific HPC and much more to do with growing interest in deep learning and machine learning.

Since the supercomputing set has pioneered many of the hardware advances required for AI (and some software and programming techniques as well), it is no surprise new interest from outside HPC is filtering in.

On the subject of pioneering HPC efforts, one of the industry’s longest-standing companies, supercomputer maker Cray, is slowly but surely beginning to reap the benefits as demand for that HPC experience extends to AI.

Not long ago we talked to Cray’s CTO, interconnect pioneer Steve Scott, about how traditional supercomputing can bend to the needs of deep learning at scale in a way that is affordable and usable for those less familiar with HPC systems. Since GPUs are often at the center of these systems for training, Scott talked about integrating them into a dense, tightly coupled system like the company’s CS Storm line, a product set designed for HPC but without the Aries interconnect that powers performance in Cray’s top-of-the-line XC supercomputers.

It was this very integration of GPUs and overall dense performance that pushed Samsung to Cray, a notable move at a time when most OEMs have wares on the market with the same basic components (Pascal GPUs, a choice of CPU, etc.). In a statement, Samsung says the CS Storm supercomputer will be used for “running artificial intelligence and deep learning applications at scale with very large, complex datasets” at the Samsung Strategy and Innovation Center (SSIC), with a focus on connected devices and vehicles.

Samsung invested in three CS Storm 500NX cabinets, which support up to eight Nvidia P100 (Pascal) accelerators per node. The companies are not revealing how many GPUs per node Samsung selected; however, if we do the math on the top-end configuration of eight Pascal GPUs per node and 14 nodes per rack, the peak performance of this deep learning cluster comes to nearly 2 petaflops (even if that metric matters less for reduced- or mixed-precision workloads like these). Either way, this is a powerful test cluster, which indicates Samsung has done its homework against rival systems for AI, possibly including the Pascal-based DGX appliance, among other OEM creations with similar feeds and speeds.
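For readers who want to check that arithmetic, here is a minimal sketch, assuming the top-end configuration above and a per-GPU peak of roughly 5.3 double-precision teraflops (the rating of the NVLink/SXM2 P100; neither the actual GPU count nor the exact SKU has been disclosed by the companies):

```python
# Back-of-the-envelope peak-performance estimate for the cluster described
# above. All figures are assumptions drawn from the article and public
# P100 specs, not confirmed configuration details.

cabinets = 3            # CS Storm 500NX cabinets Samsung purchased
nodes_per_cabinet = 14  # top-end rack density cited in the article
gpus_per_node = 8       # maximum the 500NX supports; actual count undisclosed
tflops_per_gpu = 5.3    # assumed FP64 peak for an SXM2 (NVLink) P100

total_gpus = cabinets * nodes_per_cabinet * gpus_per_node
peak_petaflops = total_gpus * tflops_per_gpu / 1000.0

print(f"{total_gpus} GPUs -> ~{peak_petaflops:.2f} petaflops peak (FP64)")
# Output: 336 GPUs -> ~1.78 petaflops peak (FP64), i.e. "close to 2 petaflops"
```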