Global Big Data Conference

Using Vertex AI for rapid model prototyping and deployment

Posted on: Mar 16, 2022

Bringing AI models into production is one of the biggest challenges AI practitioners face. Much of the discussion in the AI/ML space revolves around model development. Yet, as the diagram in the canonical Google paper “Hidden Technical Debt in Machine Learning Systems” shows, the bulk of the activity, time, and expense in building and managing ML systems lies not in model training but in the myriad ancillary tasks that make an ML system “work”.

Another valuable Google whitepaper is “Practitioners Guide to MLOps: A Framework for Continuous Delivery and Automation of Machine Learning”. Beyond thought leadership in the AI/ML space, Google Cloud provides a rich set of tools to assist data scientists and ML engineers on their journey from prototyping to production.

In this article, our goal is to take the best practices from the resources above and apply them to a Google Cloud project. We are going to discuss everything BUT model development, since the focus here is on MLOps and the tooling options for encapsulating your ML workflow in a pipeline. This notebook, available in our GitHub repository, walks you through building a Vertex Pipelines workflow that takes you from model ideation to model deployment.

We'll leave the actual model creation and optimization processes to the experts: BigQuery ML and AutoML Tables. Even better, we'll train two different models and select the one that performs better with our dataset. Before we dive into the pipeline, let’s take a quick look at the tools we’ll rely on for model development:

BigQuery ML (BQML) lets you create and execute machine learning models in BigQuery using standard SQL queries while leveraging BigQuery’s petabyte scale. BigQuery ML democratizes machine learning by letting SQL practitioners build models using existing SQL tools and skills.

AutoML Tables is even more hands-off. You don't need to write any model code to benefit from an advanced machine learning framework. AutoML automatically experiments with many different model architectures and produces a state-of-the-art model that addresses your needs.
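To make the BigQuery ML description above concrete, here is a minimal sketch of what "creating a model with standard SQL" can look like. The helper `build_create_model_sql` and the dataset, table, model, and column names are hypothetical placeholders, not taken from the article's notebook. The sketch only builds the statement text; it assumes you would submit it yourself through the BigQuery console or a client library.

```python
def build_create_model_sql(model, source_table, label_column, model_type="linear_reg"):
    """Return a BigQuery ML CREATE MODEL statement as a string.

    All identifiers passed in are illustrative; replace them with your own
    dataset, table, and label column.
    """
    return (
        f"CREATE OR REPLACE MODEL `{model}`\n"
        f"OPTIONS (model_type = '{model_type}', input_label_cols = ['{label_column}']) AS\n"
        f"SELECT * FROM `{source_table}`"
    )


# Hypothetical names, used only to show the shape of the statement.
sql = build_create_model_sql(
    model="my_dataset.my_model",
    source_table="my_dataset.training_data",
    label_column="label",
)
print(sql)
```

The point of the sketch is that training is expressed as a single SQL statement over an existing table, which is what lets SQL practitioners reuse their current tools.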

The pipeline

To piece together all the steps of this workflow, we are going to use Vertex Pipelines, which lets you orchestrate your ML workflows in a serverless manner. Before Vertex Pipelines can orchestrate your ML workflow, you must describe the workflow as a pipeline. In our example, we do this with the Kubeflow Pipelines v2 SDK.
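As a rough illustration of what "describing your workflow as a pipeline" means, the sketch below models the workflow from this article as a set of steps with dependencies and derives a valid execution order. It deliberately uses only the Python standard library rather than the Kubeflow Pipelines SDK, so it shows the idea, not the SDK's API. The step names, the `select_best_model` helper, and the assumption that models are compared on an error metric where lower is better are all hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical step names. Each step maps to the set of steps it depends on.
PIPELINE = {
    "train_bqml_model": {"prepare_data"},
    "train_automl_model": {"prepare_data"},
    "evaluate_bqml_model": {"train_bqml_model"},
    "evaluate_automl_model": {"train_automl_model"},
    "select_best_model": {"evaluate_bqml_model", "evaluate_automl_model"},
    "deploy_model": {"select_best_model"},
}


def execution_order(pipeline):
    """Return one valid order in which the steps could run."""
    return list(TopologicalSorter(pipeline).static_order())


def select_best_model(errors):
    """Return the model name with the lowest evaluation error.

    Assumes a lower-is-better metric such as mean absolute error.
    """
    return min(errors, key=errors.get)


order = execution_order(PIPELINE)
print(order)

# Made-up error values, only to exercise the selection step.
print(select_best_model({"bqml": 3.2, "automl": 2.9}))
```

In a real pipeline, each step would be a pipeline component and the orchestrator would work out the ordering from the data passed between components; the two training branches have no dependency on each other, so they could run in parallel.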

The beauty of using Vertex AI for your machine learning workloads is that the solutions you use throughout your pipeline workflow talk to one another. When you run a pipeline, Vertex AI automatically tracks metadata and lineage for all the artifacts generated during the pipeline execution. The pipeline metadata can help you answer questions like:
