Google Sample Question 14 of 15

You are developing a custom image classification model in Python. You plan to run your training application on Vertex AI. Your input dataset contains several hundred thousand small images. You need to determine how to store and access the images for training. You want to maximize data throughput and minimize training time while reducing the amount of additional code. What should you do?

Source: Google Cloud OFFICIAL

Official sample question published by Google Cloud. WiseOwlLearns is not affiliated with Google LLC.

All explanations and Option Analyzer™ content are generated by WiseOwlLearns and are not endorsed by Google Cloud.

A Store image files in Cloud Storage, and access them directly.
B Store image files in Cloud Storage, and access them by using serialized records.
C Store image files in Cloud Filestore, and access them by using serialized records. ✓ Correct
D Store image files in Cloud Filestore, and access them directly by using an NFS mount point.
🦉 Explanation by WiseOwl Tutor™ — not endorsed by Google

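Filestore generally offers lower-latency file access than Cloud Storage, and serialized records (such as TFRecord files or WebDataset-style tar shards) feed training pipelines faster than individual files because many images are read sequentially from a few large files. Reading hundreds of thousands of small images one by one from Cloud Storage incurs per-object request and metadata overhead on every read, which limits throughput. Serialized records on Cloud Storage (option B) already remove much of that overhead, but Filestore combined with serialized records (option C) is the choice that maximizes data throughput.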
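
To make "serialized records" concrete, here is a minimal sketch that packs many small image files into a few larger tar shards, in the spirit of the WebDataset format mentioned above. It uses only the Python standard library. The function names (`write_shards`, `read_shard`), the shard naming scheme, and the shard size are illustrative assumptions, not part of the question or of any Google API; a real pipeline would more likely use TFRecord or the WebDataset library and write the shards to the Filestore mount.

```python
import io
import tarfile
from pathlib import Path
from typing import Iterable, Iterator, List, Tuple


def write_shards(
    samples: Iterable[Tuple[str, bytes]],
    out_dir: str,
    samples_per_shard: int = 1000,  # assumed value; tune for your data
) -> List[Path]:
    """Pack (file_name, raw_bytes) samples into sequential tar shards."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    shard_paths: List[Path] = []
    tar = None
    for i, (name, data) in enumerate(samples):
        # Start a new shard every `samples_per_shard` samples.
        if i % samples_per_shard == 0:
            if tar is not None:
                tar.close()
            path = out / f"shard-{len(shard_paths):05d}.tar"
            shard_paths.append(path)
            tar = tarfile.open(path, "w")
        info = tarfile.TarInfo(name=name)
        info.size = len(data)
        tar.addfile(info, io.BytesIO(data))
    if tar is not None:
        tar.close()
    return shard_paths


def read_shard(path: Path) -> Iterator[Tuple[str, bytes]]:
    """Stream (file_name, raw_bytes) pairs back from one shard, in order."""
    with tarfile.open(path, "r") as tar:
        for member in tar:
            if member.isfile():
                yield member.name, tar.extractfile(member).read()
```

With this layout, a training job opens a handful of large files and reads them sequentially instead of issuing one request per image, which is the access pattern the explanation above favors.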
