Google today launched AI Platform Prediction in general availability, a service that lets developers prep, build, run, and share machine learning models in the cloud. It’s based on a Google Kubernetes Engine backend and features an architecture designed for high reliability, flexibility, and low overhead latency.
IDC predicts that worldwide spending on cognitive and AI systems will reach $77.6 billion in 2022, up from $24 billion last year. Gartner agrees: In a recent survey of executives from thousands of businesses worldwide, it found that AI implementation grew a whopping 270% in the past four years and 37% in the past year alone. With AI Platform Prediction, Google adds yet another managed AI service to its portfolio as it competes with rivals like Amazon, Microsoft, and IBM.
AI Platform Prediction ostensibly makes it easier to deploy models trained using frameworks like XGBoost and scikit-learn, courtesy of an engine that selects compatible cloud hardware (e.g., AI accelerator chips) automatically. On supported virtual machines, it shows metrics like graphics card, processor, RAM, and network usage, as well as model replica count over time. And on the security side, AI Platform Prediction ships with tools that allow users to define parameters and deploy models that only have access to resources and services within a defined network perimeter.
Beyond this, AI Platform Prediction provides information about model predictions and a visualization tool to help elucidate those predictions. Moreover, it continuously evaluates live models based on ground-truth labeling of requests sent to the model, providing an opportunity to improve performance through retraining.
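Google has not detailed here how that evaluation runs internally, but the general idea can be sketched in a few lines of Python. The example below is a minimal illustration under stated assumptions, not the service's API: the names `LoggedPrediction`, `rolling_accuracy`, and `needs_retraining`, the request IDs, and the 0.9 threshold are all hypothetical. It compares logged predictions with whatever ground-truth labels have arrived so far and flags the model for retraining when accuracy drops below the threshold.

```python
from dataclasses import dataclass
from typing import Dict, Hashable, Iterable, Optional


@dataclass
class LoggedPrediction:
    request_id: str      # hypothetical identifier for a served request
    predicted: Hashable  # label the live model returned for that request


def rolling_accuracy(
    predictions: Iterable[LoggedPrediction],
    ground_truth: Dict[str, Hashable],
) -> Optional[float]:
    """Accuracy over the logged predictions that already have a ground-truth label."""
    labeled = [p for p in predictions if p.request_id in ground_truth]
    if not labeled:
        return None  # nothing has been labeled yet, so there is nothing to score
    correct = sum(1 for p in labeled if p.predicted == ground_truth[p.request_id])
    return correct / len(labeled)


def needs_retraining(accuracy: Optional[float], threshold: float = 0.9) -> bool:
    """Flag the model when measured accuracy falls below an assumed threshold."""
    return accuracy is not None and accuracy < threshold


# Illustrative data only: four served requests, three of them labeled afterward.
logs = [
    LoggedPrediction("r1", "match"),
    LoggedPrediction("r2", "no_match"),
    LoggedPrediction("r3", "match"),
    LoggedPrediction("r4", "match"),
]
truth = {"r1": "match", "r2": "match", "r3": "match"}

accuracy = rolling_accuracy(logs, truth)
print(accuracy, needs_retraining(accuracy))
```

In this sketch, requests without a label are simply left out of the score, which reflects the fact that ground truth typically arrives after the prediction has been served.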
All of AI Platform Prediction’s features are available in a fully managed, cluster-less environment with dedicated enterprise support. Google optionally handles quota management to protect models from overload if clients send too much traffic.
Among other customers, Waze — which Google owns — is using AI Platform Prediction to power Waze Carpool, a ride-sharing service for commuters. Waze senior data scientist Philippe Adjiman says that in just a few weeks, Waze was able to deploy a model in production that matches drivers with riders heading in the same direction.
“AI Platform Prediction’s recent general availability release of support for GPUs and multiple high-memory and high-compute instance types will make it easy for us to deploy more sophisticated models in a cost-effective way,” Adjiman wrote in a blog post. “Multiple data science teams and projects (ads, future drive predictions, ETA modeling) at Waze are already using or started exploring other existing (or upcoming) components of the AI Platform. More on that in future posts.”