Google Sample Question 14 of 15
You are developing a custom image classification model in Python. You plan to run your training application on Vertex AI. Your input dataset contains several hundred thousand small images. You need to determine how to store and access the images for training. You want to maximize data throughput and minimize training time while reducing the amount of additional code. What should you do?
🦉 Explanation by WiseOwl Tutor™ — not endorsed by Google
Filestore is faster than Cloud Storage for accessing files, and serialized records (like TFRecords or WebDataset format) are faster for feeding training pipelines than individual files. Cloud Storage has high metadata overhead when accessing hundreds of thousands of small files individually. Although serialized records on Cloud Storage improve speed, Filestore + serialized records maximizes data throughput.
Ready to practice?
These 15 official sample questions are free to practice on WiseOwlLearns — no account required. Get real-time tutoring from WiseOwl Tutor™ and step-by-step elimination reasoning from Option Analyzer™.