Speaker "Sanhita Sarkar" Details



Building Intelligent AI Data Pipelines for Improved Data Center Economics


Artificial intelligence (AI) requires processing power and adequate storage while executing various deep learning (DL) frameworks. The training and deployment stages of a DL system have different data and processing needs. On one hand, the large volumes of data during training demand systems that support massive storage capacity, multiple data formats and protocols for processing dispersed data sets, and sharing of data and models across applications. On the other hand, AI deployment for delivering inference on incoming data requires fast access to that data to meet the demand for AI responsiveness in applications.
The needs for processing and storage vary across the phases of an AI data pipeline, which comprises data ingestion, model training, and model serving for inference, followed by search and data discovery. Disaggregation of GPUs, flash, and high-capacity storage can enable the rapid response times and scaling that an AI data pipeline requires, without compromising on data persistence, quality, durability, and cost.
Who is this presentation for?
Business leaders, engineering managers, data center architects and managers, solution developers and consultants, VARs, OEMs, and system integrators who are seeking to get more value from existing IoT and AI deployments and operate more efficiently, or who are planning to implement and budget for a new AI project in their data centers.
Prerequisite knowledge:
Concepts of how data is managed end-to-end within enterprises.
What you'll learn:
Attendees will gain an understanding of the challenges involved in independently scaling the data infrastructure of AI systems to serve their changing needs.
They will learn about disaggregating storage and compute, and about sharing flash storage across multiple components of the AI data pipeline for a range of cost benefits.
They will learn how and when to leverage flash and high-capacity storage to meet the response time and scaling requirements of AI use cases, without compromising on long-term data persistence, durability, and cost.



Sanhita Sarkar is a Global Director, Software Development, Data Center Architectures & Ecosystems, at Western Digital, where she focuses on software design and development of analytics features and solutions spanning edge, data center, data lake, and cloud. She has expertise in key vertical markets such as the Industrial Internet of Things (IIoT), Defense and Intelligence, Financial Services, Genomics, and Healthcare. Sanhita previously worked at Teradata, SGI, Oracle, and a few startups. She was responsible for overseeing design, development, and delivery of optimized software and solutions involving large memory, scale-up, and scale-out systems. Sanhita has authored multiple patents, published several papers, and has spoken at several conferences and meetups. She received her Ph.D. in Electrical Engineering and Computer Science from the University of Minnesota, Minneapolis.