PDE Exam Guide (2026)

The Google Cloud Professional Data Engineer exam has 5 sections: Designing Data Processing Systems (~22%), Ingesting and Processing (~25%), Storing the Data (~20%), Preparing and Using Data for Analysis (~15%), and Maintaining and Automating Data Workloads (~18%). The exam is 2 hours with 50–60 questions. BigQuery is tested in every section.

Quick Facts

  • Duration: 2 hours (full) / 1 hour (renewal)
  • Questions: 50–60 (full) / 20 (renewal)
  • Cost: $200 USD
  • Passing score: ~70–80% (Google does not publish the exact threshold)
  • Case studies: Flowlogistic & MJTelco (full exam only — not on renewal)
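
To turn the approximate section weights into a study budget, you can multiply each weight by the exam length. The sketch below does that arithmetic for a 50-question and a 60-question exam. The weights are the approximate figures quoted in this guide, the function name is made up for illustration, and the results are rough expectations, not counts Google guarantees.

```python
# Approximate section weights quoted in this guide (percent of the exam).
SECTION_WEIGHTS = {
    "Designing Data Processing Systems": 22,
    "Ingesting and Processing the Data": 25,
    "Storing the Data": 20,
    "Preparing and Using Data for Analysis": 15,
    "Maintaining and Automating Data Workloads": 18,
}


def estimate_questions(total_questions: int) -> dict[str, float]:
    """Return a rough expected question count per section (hypothetical helper)."""
    return {
        name: round(total_questions * pct / 100, 1)
        for name, pct in SECTION_WEIGHTS.items()
    }


for total in (50, 60):
    print(f"{total}-question exam:")
    for name, count in estimate_questions(total).items():
        print(f"  {name}: about {count}")
```

On a 50-question exam, for example, the 25% weight for Ingesting and Processing works out to about 12.5 questions.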

The 5 Exam Sections

Section 1: Designing Data Processing Systems (~22%)

Security, compliance, reliability, flexibility, and data migrations. You'll need to know IAM, Cloud KMS (CMEK, HSM, EKM), VPC Service Controls, disaster recovery, and migration strategies.

  • 1.1 Designing for security and compliance (IAM, encryption, PII, data sovereignty)
  • 1.2 Designing for reliability and fidelity (data validation, DR, fault tolerance)
  • 1.3 Designing for flexibility and portability (multi-cloud, data governance)
  • 1.4 Designing data migrations (BigQuery Data Transfer Service, Database Migration Service, Transfer Appliance)

Section 2: Ingesting and Processing the Data (~25%)

The most heavily weighted section: pipeline planning, building, and operationalizing. Expect Dataflow (Apache Beam), Dataproc, Cloud Data Fusion, Pub/Sub, and windowing/late-arriving data scenarios.

  • 2.1 Planning the data pipelines (sources, sinks, transformation logic, encryption)
  • 2.2 Building the pipelines (Dataflow, Dataproc, Spark, Kafka, batch vs streaming, AI enrichment)
  • 2.3 Deploying and operationalizing (Cloud Composer, Workflows, CI/CD)
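
Windowing and late-arriving data are easier to reason about with a small model. The sketch below is plain Python, not the Apache Beam API. It assigns events to fixed (tumbling) windows by event time and drops an event once the watermark has passed the end of its window plus an allowed-lateness grace period. The window size, lateness value, and function names are illustrative assumptions, and real runners handle triggers and watermarks in more detail than this.

```python
from collections import defaultdict

WINDOW_SIZE = 60        # seconds per fixed window (illustrative value)
ALLOWED_LATENESS = 30   # grace period after the window ends (illustrative value)


def window_start(event_time: int) -> int:
    """Map an event timestamp to the start of its fixed window."""
    return event_time - (event_time % WINDOW_SIZE)


def assign_events(events):
    """Sum values per window and collect events that arrive too late.

    events: iterable of (event_time, watermark_at_arrival, value) tuples.
    Returns (sums keyed by window start, list of dropped (event_time, value)).
    """
    sums = defaultdict(int)
    dropped = []
    for event_time, watermark, value in events:
        start = window_start(event_time)
        window_end = start + WINDOW_SIZE
        if watermark > window_end + ALLOWED_LATENESS:
            # The window has already closed for good: the event is discarded.
            dropped.append((event_time, value))
        else:
            # On time, or late but still inside the allowed-lateness period.
            sums[start] += value
    return dict(sums), dropped


events = [
    (5, 10, 1),    # on time
    (50, 55, 2),   # on time
    (58, 80, 3),   # late (watermark past 60) but within allowed lateness
    (20, 100, 4),  # too late: watermark 100 is past 60 + 30
    (65, 100, 5),  # on time for the next window
]
sums, dropped = assign_events(events)
print(sums)     # {0: 6, 60: 5}
print(dropped)  # [(20, 4)]
```

The point to carry into scenario questions is that lateness is measured against the watermark, not against wall-clock arrival time.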

Section 3: Storing the Data (~20%)

Storage selection, data warehousing, data lakes, and data platforms. Know when to use BigQuery vs BigLake vs Bigtable vs Spanner vs Cloud SQL vs AlloyDB.

  • 3.1 Selecting storage systems (access patterns, managed services, cost, lifecycle)
  • 3.2 Planning for a data warehouse (data model, normalization, access patterns)
  • 3.3 Using a data lake (discovery, processing, monitoring)
  • 3.4 Designing for a data platform (Dataplex, Dataplex Catalog, federated governance)
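
One way to drill the storage comparison is to write your own rule of thumb down as code and then test it against practice questions. The sketch below is a deliberately coarse heuristic and not official Google guidance. The function name and its four yes/no inputs are assumptions made for illustration, and real decisions also weigh cost, latency targets, and data volume.

```python
def suggest_storage(
    *,
    analytical: bool,
    relational: bool = False,
    global_scale: bool = False,
    data_in_object_storage: bool = False,
) -> str:
    """Return a first-guess storage service for a workload (hypothetical helper)."""
    if analytical:
        # Analytics over files kept in object storage points to BigLake;
        # otherwise the warehouse itself is the usual starting point.
        return "BigLake" if data_in_object_storage else "BigQuery"
    if not relational:
        # High-throughput, low-latency key/wide-column access.
        return "Bigtable"
    if global_scale:
        # Relational with horizontal scale and strong consistency across regions.
        return "Spanner"
    # Regional relational workloads; AlloyDB when PostgreSQL compatibility
    # and higher performance are the priority.
    return "Cloud SQL or AlloyDB"


print(suggest_storage(analytical=True))                          # BigQuery
print(suggest_storage(analytical=False, relational=False))       # Bigtable
print(suggest_storage(analytical=False, relational=True,
                      global_scale=True))                        # Spanner
```

Whenever a practice question contradicts the heuristic, refine the rule. That process is more useful than memorizing the table of services.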

Section 4: Preparing and Using Data for Analysis (~15%)

Visualization, AI/ML preparation, and data sharing. Expect BigQuery ML, BI Engine, materialized views, Analytics Hub, and Cloud DLP masking.

  • 4.1 Preparing data for visualization (BI Engine, materialized views, troubleshooting queries)
  • 4.2 Preparing data for AI and ML (BigQuery ML, embeddings, RAG)
  • 4.3 Sharing data (Analytics Hub, authorized datasets, reports)

Section 5: Maintaining and Automating Data Workloads (~18%)

Resource optimization, automation, monitoring, and fault tolerance. Expect Cloud Monitoring, BigQuery reservations, Dataproc cluster management, and failover strategies.

  • 5.1 Optimizing resources (cost minimization, persistent vs job-based clusters)
  • 5.2 Designing automation and repeatability (Cloud Composer DAGs, scheduling)
  • 5.3 Organizing workloads (BigQuery Editions, reservations, batch vs interactive)
  • 5.4 Monitoring and troubleshooting (observability, billing issues, quotas)
  • 5.5 Maintaining awareness of failures (fault tolerance, multi-region, data replication)
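
The persistent vs job-based cluster trade-off in 5.1 comes down to simple arithmetic: a persistent cluster bills for every hour it exists, while a job-based (ephemeral) cluster bills only while jobs run, plus some startup overhead. The sketch below compares the two. The hourly rate, job counts, and function names are made-up inputs for illustration, not real Dataproc pricing.

```python
HOURS_PER_MONTH = 730.0  # average hours in a month, used as an approximation


def monthly_cost_persistent(hourly_rate: float) -> float:
    """Cost of a cluster that stays up for the whole month."""
    return hourly_rate * HOURS_PER_MONTH


def monthly_cost_job_based(
    hourly_rate: float,
    jobs_per_month: int,
    hours_per_job: float,
    startup_hours: float = 0.0,
) -> float:
    """Cost of creating a cluster per job and deleting it afterwards."""
    return hourly_rate * jobs_per_month * (hours_per_job + startup_hours)


# Hypothetical workload: one 2-hour job per day on a cluster costing $2/hour.
rate = 2.0
persistent = monthly_cost_persistent(rate)
job_based = monthly_cost_job_based(rate, jobs_per_month=30, hours_per_job=2.0)
print(f"persistent: ${persistent:.2f}")  # persistent: $1460.00
print(f"job-based:  ${job_based:.2f}")   # job-based:  $120.00
```

With these assumed numbers the job-based approach is far cheaper. The gap narrows as total job hours approach the hours in the month.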

Key GCP Services to Know

Service          Role                                       Sections
BigQuery         Serverless data warehouse                  All
Dataflow         Unified stream + batch ETL (Apache Beam)   §2, §5
Dataproc         Managed Spark/Hadoop                       §2, §5
Cloud Composer   Pipeline orchestration (Airflow)           §2, §5
Pub/Sub          Event messaging                            §2
BigLake          Unified analytics on multi-cloud data      §3
Dataplex         Data lake/mesh governance                  §3
Cloud KMS        Key management (CMEK)                      §1
Cloud DLP        PII detection & masking                    §1, §4
Analytics Hub    Zero-copy data sharing                     §4

How to Prepare

The PDE rewards hands-on experience with GCP's data services. Build real pipelines, query real datasets in BigQuery, and orchestrate real workflows with Cloud Composer before sitting the exam.

  1. Review the official Google Cloud documentation for BigQuery, Dataflow, and Dataplex.
  2. Follow our 8-Week Study Plan for first-time candidates.
  3. Renewing? Use the 2-Week Refresher instead.
  4. Practice with questions verified against current Google Cloud documentation.