Open Data Hub: A Meta Project for AI/ML Work

Posted on: Jun 15, 2021

Open source software is a critical resource in data science today, but integrating the various open source products can be a complex task. This is what drove Red Hat to develop Open Data Hub, which brings over two dozen commonly used tools together into a single cohesive framework that simplifies access to AI and machine learning capabilities for data professionals.

Open Data Hub (ODH) originated about five years ago as an internal Red Hat project simply to store large amounts of data where data scientists could access it to build models, according to Will McGrath, a senior principal product marketing manager at Red Hat. For that storage layer, Red Hat's engineers chose Ceph, the S3-compatible object storage system.

After getting a handle on the storage aspect of the data, Red Hat's team then brought a handful of tools into the equation, starting with Jupyter, Apache Spark, and TensorFlow. The system supported internal Red Hat use cases, such as analyzing log files from customer complaints or searching the internal knowledge base, McGrath says.

Eventually, word of ODH's existence leaked out to a handful of Red Hat customers, who expressed an interest in trying out the software, he says. In 2018, the company decided to turn ODH into a full-fledged open source project that the general public could download and use, and that the open source community could contribute to. You can see a short history of the product in this Red Hat video.