Deploying machine learning in production with high predictability

“Hello, World!” I am Balu Nair, Account Director, Google Cloud Services at SpringML. My career began as a Mainframe Developer over two decades ago, where I worked across the front end, core programming, and databases, and even wrote the code that compiled and ran the main program. Then came the era of two-tier and three-tier architectures, where programmers concentrated on either the front end or the back end, and now AI / ML / DL, where machines are taught to think like humans.

Recently, I got a chance to attend a session, “Accelerating the deployment of predictable ML in production,” at the Google Cloud Applied ML Summit, which gave good insight into the challenges industries face when deploying ML models in production and how Google is working to overcome them. The session discusses the significance of AI / ML in digitization and the challenges data scientists face while adopting ML and AI.

Why is accelerating the deployment of ML in production with high predictability important?

To understand this, we need to take a step back and examine what Artificial Intelligence (AI), Machine Learning (ML), and Deep Learning (DL) are, and how they have revolutionized our lives in the last few years. DL is a subset of ML algorithms, and ML is a subset of Artificial Intelligence.

Let’s understand this using a simple example:

A young student in a math class learns the basic concepts of addition from a few worked examples – this is similar to deep learning. As days pass, the student can add any numbers using the concepts learned from those examples, without being explicitly taught each case – this is similar to machine learning.

Further, as the student is exposed to more numbers, he uses his intelligence to add numbers faster and more efficiently – this is similar to Artificial Intelligence.

I would go further and say that AI should be called ‘Acquired Intelligence’ rather than Artificial Intelligence, because the intelligence is acquired by analyzing large amounts of diverse data. Exposure to more varied data allows machines to refine their outcomes and better imitate human intelligence. This underscores the significance of data.

Further, according to Google Cloud, the following four areas should be considered:

  • Freedom of choice
  • Meeting users where they are
  • Data and AI
  • Manage and maintain

The guiding principle for Google Cloud is to help customers remove every barrier in the way of deploying useful, predictable machine learning at scale. Google Cloud launched the Vertex AI managed ML platform in May 2021 to accelerate the deployment and maintenance of ML models. The platform fast-tracked the development of ML models with 80% fewer lines of code, and the number of ML predictions increased 2.5x with Vertex AI and BigQuery. Customers see its value too: active users of Vertex AI Workbench have grown 25x in the last 6 months.

Let’s look into the areas of improvement and what Google Cloud has done to overcome the challenge.

  1. Freedom of choice: Data scientists should be free to mix and match different ML components and different kinds of deployment instances, without the cost considerations of ML frameworks dictating their choices. Google Cloud has partnered with NVIDIA on a one-click deployment of the NVIDIA AI software suite into Vertex AI Workbench; as a result, deploying code to Jupyter notebooks is simplified from 12 complicated steps to a single click. Google Cloud also introduced the Vertex AI training Reduction Server, which supports TensorFlow and PyTorch and provides bandwidth and latency reduction within multi-node NVIDIA GPU clusters, resulting in shorter training times. This means a training job can be iterated more often within the same deployment window, or completed in less time.
  2. Meeting users where they are – It is important to understand customers’ preferences when developing ML models. Dictating custom models over pre-trained APIs, or vice versa, should not be the gating criterion for users adopting ML in an organization. Google Cloud’s AutoML is the leading automated ML framework in the market, but customers need more control, so a preview of Vertex AI Tabular Workflows has been introduced. This is a fully managed AutoML pipeline with a variety of proprietary algorithms that lets users reach into multiple levels of AutoML to control the extent of automation of network structure, model selection, and more. It is scalable without losing accuracy or stability.
  3. Data and AI – Data and AI / ML are two sides of the same coin: you need data to build predictable ML models, and you need good, predictable ML models to refine the data. It is a Catch-22.

Let me explain using an example:

A couple of years ago, a robot vacuum cleaner was launched with built-in AI / ML. The robot circumvented its cleaning path when it detected a sock on the floor. The following day, even though the sock was gone, it still skipped that spot. It took a while to gather enough data to distinguish a sock from a piece of furniture, which contributed to lower sales of the product. This shows that Data and AI / ML are mutually complementary, and one cannot survive without the other.

Google Cloud wants to help developers be efficient no matter what form of data comes to the AI engine. Integrating serverless Spark in the Vertex AI workbench made it possible. Google Cloud has partnered with Neo4j to unlock the power of graph-based ML models.

The Vertex AI platform allows data scientists to work with raw data in the form of connected entities pointing to each other in a complex web and convert it into structured data with features required for ML models.
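To make the idea concrete, here is a minimal, stdlib-only sketch of turning connected entities into flat, per-node features. This is not the Neo4j or Vertex AI API; the edge list and feature names are hypothetical, purely for illustration.

```python
# Toy sketch (not Neo4j/Vertex AI): convert raw connected entities
# (a web of edges) into one structured feature row per entity, the
# shape a tabular ML model can consume.
from collections import defaultdict

# Hypothetical edge list: (follower, followed) pairs among users.
edges = [("alice", "bob"), ("carol", "bob"), ("bob", "dave"),
         ("alice", "carol"), ("dave", "alice")]

out_deg = defaultdict(int)
in_deg = defaultdict(int)
for src, dst in edges:
    out_deg[src] += 1   # edges leaving each node
    in_deg[dst] += 1    # edges arriving at each node

nodes = sorted(set(out_deg) | set(in_deg))
# One structured row per entity: simple graph-derived features.
rows = [{"node": n, "in_degree": in_deg[n], "out_degree": out_deg[n]}
        for n in nodes]
for row in rows:
    print(row)
```

A real graph-ML pipeline would derive far richer features (centrality, embeddings, community membership), but the flattening step is the same in spirit.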

For unstructured data such as images, audio, and video, Google Cloud has partnered with Labelbox. Data Scientists can now work efficiently on unstructured data labels and directly use them in ML algorithms. However, this integration is available only in Google Cloud.

  4. Manage and maintain – ML models should be easy to manage and maintain over time. Data scientists should not be spending time playing the role of infrastructure or operations engineers to maintain model accuracy, scaling, disaster resistance, or security. There should be technology to support subsequent deployments.

Explainable AI solutions are the best way to ensure that ML systems remain safe and healthy over time. Vertex AI’s example-based explanations are a powerful diagnostic tool that helps surface mislabeled examples and anomalies in training data.
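The intuition behind example-based explanations can be sketched in a few lines: explain a suspicious prediction by retrieving the most similar training examples, which often surfaces mislabeled data. This toy nearest-neighbor code is my own illustration of the idea, not the Vertex AI API, and the dataset is invented.

```python
# Toy sketch of example-based explanations: given a query point with
# a suspicious prediction, retrieve its closest training examples.
# A mislabeled neighbor showing up here is a strong data-quality clue.
import math

# Hypothetical labeled training set: (feature vector, label).
train = [([1.0, 1.0], "cat"), ([1.2, 0.9], "cat"),
         ([5.0, 5.2], "dog"), ([4.8, 5.1], "dog"),
         ([1.1, 1.0], "dog")]   # <- likely mislabeled

def nearest_examples(query, data, k=3):
    """Return the k training examples closest to `query` (Euclidean)."""
    return sorted(data, key=lambda ex: math.dist(ex[0], query))[:k]

# A "cat"-looking point was predicted as "dog": inspect its neighbors.
neighbors = nearest_examples([1.05, 1.0], train)
for vec, label in neighbors:
    print(vec, label)
```

Production systems do this over learned embeddings at scale, but the diagnostic workflow is the same: look at which training examples sit closest to the point you cannot explain.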

Back to the session: the game changer for accelerating the deployment of ML in production with high predictability is Vertex AI, which benefits both customers and Google Cloud. In the session, we also heard from leaders at Uber and Ford on how their organizations have been transformed by adopting ML / AI.

Uber makes billions of real-time decisions globally using its ML platform, Michelangelo. Various features within the application, including estimated time of arrival, mask verification, the search and discovery experience, and restaurant and dish recommendations, are powered by ML. Uber started with traditional ML techniques 7 to 8 years ago and, as deep learning technology matured, Uber’s AI team pushed deep learning across the company. Deep learning fits their use cases given their scale of operation and the enormous amount of data. Uber developed techniques in-house and partnered with Google Cloud to incorporate Vertex AI, AutoML, and tabular capabilities into the Uber platform.

How Uber overcomes its data challenges:

Depending on the use case, data can be unbalanced, and Uber has to impute missing data. With the goal of reducing the time to move to the next iteration of a model, Uber is collaborating with Google Cloud to incorporate Vertex AI and AutoML into the training phase of its platform. This helps improve model performance and is also used to benchmark Uber’s in-house development. The collaboration benefits Google as well, helping solve dataset-size limitations, the move from a single instance to multiple instances, feature implementations, and customization.
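Imputation itself is a standard step, and a minimal version fits in a few lines. Uber’s actual pipeline is not public; this is just a mean-imputation sketch over an invented feature column to show what “filling in missing data” means in practice.

```python
# Toy sketch of mean imputation: fill missing values in a feature
# column with the mean of the observed values, so an incomplete
# dataset can still be used for training.
from statistics import mean

# Hypothetical trip records; None marks a missing ETA feature.
eta_minutes = [12.0, None, 8.5, None, 10.0, 9.5]

observed = [v for v in eta_minutes if v is not None]
fill = mean(observed)                       # mean of observed values
imputed = [fill if v is None else v for v in eta_minutes]
print(imputed)
```

Real pipelines choose the imputation strategy per feature (median, mode, model-based), but the principle is the same: replace the gap with a statistically plausible value rather than dropping the record.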

How Ford is collaborating with Google Cloud:

Ford is using AI in every step from design to manufacturing, and across all its products: from real-time map updates that help drivers avoid construction work, to estimating battery range based on weather changes, to maintaining a constant relationship with customers. Ford uses deep learning to replace traditional physics-based computational fluid dynamics models in virtual wind-tunnel tests. This helps Ford produce more aerodynamic products, which is especially significant for electric vehicles.

Vertex AI is an integral part of Ford’s ML development platform, scaling AI to non-software experts. Vertex AI Pipelines enable reusable, modular ML workflows, which lets multiple people work simultaneously on the same model. AutoML is used for transcribing speech and basic object detection. The interesting part is that data scientists do not have to master software and infrastructure skills, such as Terraform, managing Kubernetes clusters, or building APIs, to productionize their models. This has helped grow Ford’s community of AI builders.

ML / AI is the future for business success

Today, everything from setting an alarm, to your robot vacuum cleaner, to airplanes and self-driving cars uses ML and AI. But are we getting the full benefit of these modern-day technologies? Gartner states that only 10% of organizations have at least 50% of their workforce trained in ML skills, and only 53% of developed ML models make it to production. It is fair to say that ML is still in its infancy, with a significant skills gap and enormous possibilities yet to be explored.
