PDE Exam Guide (2026)
The Google Cloud Professional Data Engineer (PDE) exam has five sections: Designing Data Processing Systems (~22%), Ingesting and Processing the Data (~25%), Storing the Data (~20%), Preparing and Using Data for Analysis (~15%), and Maintaining and Automating Data Workloads (~18%). The exam lasts 2 hours and has 50–60 questions. BigQuery appears in every section.
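As a rough planning aid, the weights above can be turned into approximate question counts per section. This is a minimal sketch that assumes questions are spread in proportion to the stated weights, which Google does not guarantee; the names `WEIGHTS` and `questions_per_section` are illustrative only.

```python
# Section weights as listed in this guide (approximate percentages).
WEIGHTS = {
    "Designing Data Processing Systems": 22,
    "Ingesting and Processing the Data": 25,
    "Storing the Data": 20,
    "Preparing and Using Data for Analysis": 15,
    "Maintaining and Automating Data Workloads": 18,
}


def questions_per_section(total_questions: int) -> dict[str, float]:
    """Scale each weight to an approximate question count (assumes proportional split)."""
    return {name: total_questions * pct / 100 for name, pct in WEIGHTS.items()}


# The guide gives a range of 50 to 60 questions, so show both ends.
for total in (50, 60):
    print(f"--- {total} questions ---")
    for name, count in questions_per_section(total).items():
        print(f"{name}: ~{count:.1f}")
```

Under that assumption, the heaviest section (Ingesting and Processing the Data) works out to roughly 12.5 to 15 questions, and the lightest (Preparing and Using Data for Analysis) to roughly 7.5 to 9.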
Quick Facts
- Duration: 2 hours (full) / 1 hour (renewal)
- Questions: 50–60 (full) / 20 (renewal)
- Cost: $200 USD
- Passing score: not published by Google; commonly estimated at ~70–80%
- Case studies: Flowlogistic & MJTelco (full exam only — not on renewal)
The 5 Exam Sections
Section 1: Designing Data Processing Systems (~22%)
Security, compliance, reliability, flexibility, and data migrations. You'll need to know IAM, Cloud KMS (CMEK, HSM, EKM), VPC Service Controls, disaster recovery, and migration strategies.
- 1.1 Designing for security and compliance (IAM, encryption, PII, data sovereignty)
- 1.2 Designing for reliability and fidelity (data validation, DR, fault tolerance)
- 1.3 Designing for flexibility and portability (multi-cloud, data governance)
- 1.4 Designing data migrations (BigQuery Data Transfer Service, Database Migration Service, Transfer Appliance)
Section 2: Ingesting and Processing the Data (~25%)
The most heavily weighted section. It covers planning, building, and operationalizing pipelines. Expect Dataflow (Apache Beam), Dataproc, Cloud Data Fusion, Pub/Sub, and windowing/late-arriving data scenarios.
- 2.1 Planning the data pipelines (sources, sinks, transformation logic, encryption)
- 2.2 Building the pipelines (Dataflow, Dataproc, Spark, Kafka, batch vs streaming, AI enrichment)
- 2.3 Deploying and operationalizing (Cloud Composer, Workflows, CI/CD)
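The windowing and late-arriving data scenarios mentioned for this section can be illustrated without any pipeline framework. The sketch below is plain Python, not the Apache Beam API: it assigns events to fixed (tumbling) windows by event time and drops any event that arrives after the watermark has passed the window end plus an allowed-lateness grace period. The constants, the `assign` function, and the per-event watermark values are all hypothetical choices made for illustration.

```python
from collections import defaultdict

WINDOW_SIZE = 60       # fixed (tumbling) window length in seconds (illustrative)
ALLOWED_LATENESS = 30  # hypothetical grace period after a window closes


def window_start(event_time: int) -> int:
    """Return the start of the fixed window that contains event_time."""
    return event_time - (event_time % WINDOW_SIZE)


def assign(events):
    """Count events per window and collect events that arrive too late.

    events: iterable of (event_time, watermark) pairs in arrival order, where
    watermark is the pipeline's event-time progress when the event arrives
    (a simplification made for this sketch).
    """
    counts = defaultdict(int)
    dropped = []
    for event_time, watermark in events:
        start = window_start(event_time)
        window_end = start + WINDOW_SIZE
        if watermark > window_end + ALLOWED_LATENESS:
            dropped.append(event_time)  # window has expired, so the event is discarded
        else:
            counts[start] += 1          # on time, or late but within the grace period
    return dict(counts), dropped


events = [
    (5, 10),   # on time, window [0, 60)
    (59, 61),  # late, but within allowed lateness
    (30, 95),  # watermark 95 > 60 + 30, so dropped
    (70, 75),  # on time, window [60, 120)
]
counts, dropped = assign(events)
print(counts)   # {0: 2, 60: 1}
print(dropped)  # [30]
```

The trade-off worth recognizing in exam scenarios is that a longer allowed lateness captures more stragglers at the cost of holding window state longer and delaying final results.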
Section 3: Storing the Data (~20%)
Storage selection, data warehousing, data lakes, and data platforms. Know when to use BigQuery vs BigLake vs Bigtable vs Spanner vs Cloud SQL vs AlloyDB.
- 3.1 Selecting storage systems (access patterns, managed services, cost, lifecycle)
- 3.2 Planning for a data warehouse (data model, normalization, access patterns)
- 3.3 Using a data lake (discovery, processing, monitoring)
- 3.4 Designing for a data platform (Dataplex, Dataplex Catalog, federated governance)
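The "when to use which service" question for this section can be framed as a small decision procedure. The sketch below is a deliberately simplified rule of thumb restricted to the six services named above; the `Workload` fields and the `suggest_storage` function are hypothetical, and real scenarios weigh additional factors such as cost, latency targets, and existing tooling.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    """Hypothetical workload description used only for this sketch."""
    analytical: bool                 # large scans/aggregations rather than row-level transactions
    relational: bool                 # needs a SQL schema, joins, and ACID transactions
    global_scale: bool               # needs horizontal scale or multi-region writes
    open_format_files: bool          # data stays as files (e.g. Parquet) in object storage
    high_perf_postgres: bool = False # PostgreSQL compatibility with demanding performance needs


def suggest_storage(w: Workload) -> str:
    """Return a first-guess service name; a rule of thumb, not official guidance."""
    if w.analytical:
        # Warehouse-style analytics; BigLake when the data remains in open file formats.
        return "BigLake" if w.open_format_files else "BigQuery"
    if not w.relational:
        # High-throughput, low-latency key/wide-column access.
        return "Bigtable"
    if w.global_scale:
        # Relational with strong consistency at horizontal or global scale.
        return "Spanner"
    # Regional relational workloads.
    return "AlloyDB" if w.high_perf_postgres else "Cloud SQL"


iot = Workload(analytical=False, relational=False, global_scale=True, open_format_files=False)
print(suggest_storage(iot))  # Bigtable
```

Working through a few scenarios by hand in the same order (analytical first, then relational, then scale) is a useful way to practice eliminating wrong answers.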
Section 4: Preparing and Using Data for Analysis (~15%)
Visualization, AI/ML preparation, and data sharing. BigQuery ML, BI Engine, materialized views, Analytics Hub, and Cloud DLP masking.
- 4.1 Preparing data for visualization (BI Engine, materialized views, troubleshooting queries)
- 4.2 Preparing data for AI and ML (BigQuery ML, embeddings, RAG)
- 4.3 Sharing data (Analytics Hub, authorized datasets, reports)
Section 5: Maintaining and Automating Data Workloads (~18%)
Resource optimization, automation, monitoring, and fault tolerance. Cloud Monitoring, BigQuery reservations, Dataproc cluster management, and failover strategies.
- 5.1 Optimizing resources (cost minimization, persistent vs job-based clusters)
- 5.2 Designing automation and repeatability (Cloud Composer DAGs, scheduling)
- 5.3 Organizing workloads (BigQuery Editions, reservations, batch vs interactive)
- 5.4 Monitoring and troubleshooting (observability, billing issues, quotas)
- 5.5 Maintaining awareness of failures (fault tolerance, multi-region, data replication)
Key GCP Services to Know
| Service | Role | Sections |
|---|---|---|
| BigQuery | Serverless data warehouse | All |
| Dataflow | Unified stream + batch ETL (Apache Beam) | §2, §5 |
| Dataproc | Managed Spark/Hadoop | §2, §5 |
| Cloud Composer | Pipeline orchestration (Airflow) | §2, §5 |
| Pub/Sub | Event messaging | §2 |
| BigLake | Unified analytics on multi-cloud data | §3 |
| Dataplex | Data lake/mesh governance | §3 |
| Cloud KMS | Key management (CMEK) | §1 |
| Cloud DLP (now part of Sensitive Data Protection) | PII detection & masking | §1, §4 |
| Analytics Hub | Zero-copy data sharing | §4 |
How to Prepare
The PDE rewards hands-on experience with GCP's data services. Build real pipelines, query real datasets in BigQuery, and orchestrate real workflows with Cloud Composer before sitting the exam.
- Review the official GCP documentation for BigQuery, Dataflow, and Dataplex.
- Follow our 8-Week Study Plan for first-time candidates.
- Renewing? Use the 2-Week Refresher instead.
- Practice with questions verified against current Google Cloud documentation.