Google Sample Question 15 of 15
Your company manages an ecommerce website. You developed an ML model that recommends additional products to users in near real time based on items currently in the user’s cart. The workflow will include the following processes:
1. The website will send a Pub/Sub message with the relevant data, and then receive a message with the prediction from Pub/Sub.
2. Predictions will be stored in BigQuery.
3. The model will be stored in a Cloud Storage bucket and will be updated frequently.
You want to minimize prediction latency and the effort required to update the model. How should you reconfigure the architecture?
🦉 Explanation by WiseOwl Tutor™ — not endorsed by Google
Using the RunInference API in a Dataflow pipeline loads the model locally on the workers, which minimizes prediction latency. Pairing it with WatchFilePattern makes model updates seamless, because the pipeline watches the Cloud Storage bucket for new model files and picks them up without a redeploy. The alternatives fall short: Cloud Functions run into limits on request rate and model size. Exposing the model as a Vertex AI endpoint and calling it from Dataflow adds a remote call to every prediction, which increases total latency. Vertex AI Pipelines are slow to provision, which adds significant latency and makes them unsuitable for near-real-time cart predictions.
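The hot-swap idea behind this answer can be illustrated without any cloud dependencies. The sketch below is plain Python, not Apache Beam's API: it keeps a model loaded in the same process as the prediction code and reloads it only when a newer file matching a pattern appears. The names `ModelFile`, `latest_match`, `HotSwapPredictor`, the loader callback, and the example bucket path are all hypothetical and exist only for illustration. In a real pipeline, RunInference and WatchFilePattern would play these roles.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase
from typing import Callable, Iterable, List, Optional

# A "model" here is just a callable from cart items to recommendations
# (an assumption made to keep the sketch self-contained).
Model = Callable[[List[str]], List[str]]


@dataclass(frozen=True)
class ModelFile:
    """Hypothetical listing entry: object path plus last-updated timestamp."""
    path: str
    updated: float


def latest_match(files: Iterable[ModelFile], pattern: str) -> Optional[ModelFile]:
    """Return the most recently updated file matching the pattern, if any."""
    matches = [f for f in files if fnmatchcase(f.path, pattern)]
    return max(matches, key=lambda f: f.updated, default=None)


class HotSwapPredictor:
    """Keeps a model in process memory and swaps it when a newer file appears."""

    def __init__(self, pattern: str, loader: Callable[[str], Model]) -> None:
        self.pattern = pattern
        self.loader = loader
        self.current: Optional[ModelFile] = None
        self.model: Optional[Model] = None
        self.load_count = 0

    def refresh(self, files: Iterable[ModelFile]) -> bool:
        """Reload only if a strictly newer matching file exists."""
        newest = latest_match(files, self.pattern)
        if newest is None:
            return False
        if self.current is not None and newest.updated <= self.current.updated:
            return False
        self.model = self.loader(newest.path)
        self.current = newest
        self.load_count += 1
        return True

    def predict(self, cart: List[str]) -> List[str]:
        """Predict with the in-memory model; no remote call is involved."""
        if self.model is None:
            raise RuntimeError("no model loaded yet; call refresh() first")
        return self.model(cart)


if __name__ == "__main__":
    # Stand-in loader: a real one would deserialize the file at `path`.
    def demo_loader(path: str) -> Model:
        return lambda cart: [f"{item} accessory (model {path})" for item in cart]

    predictor = HotSwapPredictor("gs://example-bucket/models/*.pkl", demo_loader)
    predictor.refresh([ModelFile("gs://example-bucket/models/v1.pkl", 100.0)])
    print(predictor.predict(["phone"]))
```

Because the prediction path never leaves the process, each request avoids a network hop, and publishing a new model amounts to uploading a file that matches the watched pattern.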