What Changed on the PDE Exam in 2026 | WiseOwlLearns
The PDE exam added BigLake, AlloyDB, Dataplex, Dataform, RAG/embeddings, and expanded DLP masking. Here's what changed and how to prepare.
The Professional Data Engineer exam has evolved to reflect how GCP’s data platform has matured. The core sections haven’t changed — you still need to design, ingest, store, analyze, and maintain — but the services and patterns tested within each section have shifted toward modern, AI-ready data infrastructure.
Here’s what’s new, what’s different, and what it means for your preparation.
New services on the exam
AlloyDB
AlloyDB is a fully managed, PostgreSQL-compatible database that Google positions as delivering up to 4x the transactional throughput of standard PostgreSQL. It fills the gap between Cloud SQL (basic relational) and Spanner (globally distributed).
When the exam tests it: Scenarios requiring high-performance relational workloads with PostgreSQL compatibility but without Spanner’s global distribution needs. The key differentiator is performance — if the scenario mentions “high-throughput OLTP with PostgreSQL compatibility,” AlloyDB is usually the answer.
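The selection logic above can be sketched as a small decision rule. This is a study aid under stated assumptions, not official guidance: the `Workload` fields and the function name are hypothetical, and real exam scenarios carry more constraints (cost, region, existing tooling) than three booleans.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    """Hypothetical summary of an exam scenario's requirements."""
    needs_postgres_compat: bool
    high_throughput_oltp: bool
    needs_global_distribution: bool


def pick_relational_service(w: Workload) -> str:
    """Toy rule mirroring the paragraph above; an assumption, not a Google API."""
    # Global distribution is the signal that points to Spanner.
    if w.needs_global_distribution:
        return "Spanner"
    # High-throughput OLTP plus PostgreSQL compatibility points to AlloyDB.
    if w.needs_postgres_compat and w.high_throughput_oltp:
        return "AlloyDB"
    # Everything else falls back to the basic relational option.
    return "Cloud SQL"


scenario = Workload(
    needs_postgres_compat=True,
    high_throughput_oltp=True,
    needs_global_distribution=False,
)
choice = pick_relational_service(scenario)
print(choice)
```

Checking global distribution first reflects the framing in this section: AlloyDB is the answer only when Spanner's distribution is not required.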
BigLake
BigLake provides a unified storage API for querying data in Cloud Storage, Amazon S3, and Azure Blob Storage from BigQuery (the S3 and Azure cases run through BigQuery Omni), with fine-grained access control at the table, row, and column level.
When the exam tests it: Multi-cloud analytics scenarios or cases requiring column-level security on external data. BigLake replaces the older pattern of creating external tables without access control.
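To make the difference from a plain external table concrete, here is a minimal sketch that builds the kind of DDL a BigLake table over Cloud Storage typically uses. The `WITH CONNECTION` clause is the part that delegates storage access to a connection resource instead of to each querying user. The project, dataset, connection, and bucket names are hypothetical, and the helper only produces a string; it does not call BigQuery.

```python
def biglake_table_ddl(table: str, connection: str, uris: list[str],
                      fmt: str = "PARQUET") -> str:
    """Build a CREATE EXTERNAL TABLE statement with a connection clause.

    All identifiers passed in are illustrative placeholders.
    """
    uri_list = ", ".join(f"'{u}'" for u in uris)
    return (
        f"CREATE EXTERNAL TABLE `{table}`\n"
        f"WITH CONNECTION `{connection}`\n"
        f"OPTIONS (format = '{fmt}', uris = [{uri_list}]);"
    )


ddl = biglake_table_ddl(
    table="my-project.sales.orders",        # hypothetical table
    connection="my-project.us.lake-conn",   # hypothetical connection
    uris=["gs://my-bucket/orders/*.parquet"],
)
print(ddl)
```

Once the table exists, column-level restrictions would be applied with the usual BigQuery mechanisms rather than with bucket-level permissions, which is the governance point the exam scenarios hinge on.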
Dataplex
Dataplex is GCP’s data mesh governance platform. It manages metadata, data quality, and access policies across distributed storage systems.
When the exam tests it: Scenarios requiring governance across multiple teams, projects, or storage systems. The exam specifically tests Dataplex vs custom governance tools (always prefer Dataplex) and Dataplex Catalog for data discovery.
Dataform
Dataform provides SQL-based transformation pipelines natively integrated with BigQuery. It replaces custom ETL scripts for in-warehouse transformations.
When the exam tests it: In-BigQuery transformation scenarios requiring data quality assertions (uniqueness, null checks). The exam trap: Dataplex data quality tasks are for cross-system validation, while Dataform assertions are for in-pipeline validation within BigQuery.
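Conceptually, an assertion is a query that selects the rows violating a rule, and it passes only when that query returns nothing. The sketch below illustrates the idea with Python's built-in `sqlite3` as a stand-in for BigQuery; it is not Dataform syntax, and the `orders` table and its columns are made up for the example.

```python
import sqlite3

# In-memory database standing in for a warehouse table (hypothetical data).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (order_id INTEGER, customer_id INTEGER);
    INSERT INTO orders VALUES (1, 10), (2, 11), (2, 12), (3, NULL);
""")

# Each assertion is a query that returns the offending rows.
assertions = {
    "order_id is unique":
        "SELECT order_id FROM orders GROUP BY order_id HAVING COUNT(*) > 1",
    "customer_id is not null":
        "SELECT order_id FROM orders WHERE customer_id IS NULL",
}

# An assertion fails if its query returns at least one row.
failures = {}
for name, sql in assertions.items():
    rows = conn.execute(sql).fetchall()
    if rows:
        failures[name] = rows

print(failures)
```

Here both checks fail: `order_id` 2 appears twice and order 3 has no customer. In a real pipeline a failing assertion would typically stop downstream steps from running on bad data.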
Expanded topics
RAG and embeddings
The exam now tests preparing unstructured data for retrieval-augmented generation. This includes generating embeddings with BigQuery ML (the ML.GENERATE_EMBEDDING function) and running nearest-neighbor queries with BigQuery vector search (the VECTOR_SEARCH function).
What you need to know: How to generate embeddings from text data in BigQuery, store them as vectors, and query them using vector similarity search. This falls under Section 4, in the Preparing data for AI and ML subsection.
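The retrieval step is easier to remember once you have seen what a similarity search does. The following is a minimal pure-Python sketch of nearest-neighbor lookup by cosine similarity; it is a conceptual illustration, not how BigQuery implements vector search, and the three-dimensional vectors and document IDs are invented (real embeddings have hundreds of dimensions and come from a model).

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query, corpus, k=2):
    """Return the k (doc_id, score) pairs most similar to the query."""
    scored = [(doc_id, cosine_similarity(query, vec))
              for doc_id, vec in corpus.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical stored embeddings, keyed by document ID.
corpus = {
    "refund_policy": [0.9, 0.1, 0.0],
    "shipping_times": [0.1, 0.9, 0.1],
    "returns_howto": [0.8, 0.2, 0.1],
}
query = [1.0, 0.0, 0.0]  # hypothetical embedding of the user's question

results = top_k(query, corpus, k=2)
print([doc_id for doc_id, _ in results])
```

In a RAG pipeline, the documents returned by this step are what gets passed to the language model as context.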
Cloud DLP masking
Cloud DLP (now part of Sensitive Data Protection) was already on the exam, but its scope has expanded. The exam now specifically tests masking (preserving data utility for analytics) vs removal (data loss) vs encryption (not scalable for organization-wide datasets).
What you need to know: The distinction between these three approaches and when masking is the correct answer (analytics use cases where data utility must be preserved).
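A small sketch makes the masking versus removal distinction tangible. It uses a regular expression as a stand-in for sensitive-data detection and does not call the Cloud DLP API; the SSN-style pattern, function names, and sample record are assumptions chosen for illustration.

```python
import re

# Simplified pattern for US SSN-style identifiers (illustrative only).
SSN_PATTERN = re.compile(r"\b(\d{3})-(\d{2})-(\d{4})\b")


def mask_ssn(text: str) -> str:
    """Masking: hide most digits but keep the last four for analytics or joins."""
    return SSN_PATTERN.sub(lambda m: f"***-**-{m.group(3)}", text)


def remove_ssn(text: str) -> str:
    """Removal: drop the value entirely, losing any analytic utility."""
    return SSN_PATTERN.sub("", text)


record = "Customer 123-45-6789 opened a ticket"
masked = mask_ssn(record)
removed = remove_ssn(record)

print(masked)
print(removed)
```

The masked record still supports tasks like matching on the last four digits, while the removed record has nothing left to analyze. That difference is what the exam scenarios are probing.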
Analytics Hub
Analytics Hub enables zero-copy data sharing: subscribing to a listing creates a read-only linked dataset in the subscriber’s project that points at the publisher’s shared dataset. It replaces the older pattern of authorized views (which require manual reauthorization when underlying tables change).
What you need to know: Analytics Hub creates the linked dataset automatically at subscription time. Authorized views require manual reauthorization. Scheduled query copies increase cost. The exam tests this distinction in Section 4, in the Sharing data subsection.
What hasn’t changed
The exam’s core structure and section weights remain the same:
- Section 1: Designing Data Processing Systems (~22%)
- Section 2: Ingesting and Processing the Data (~25%)
- Section 3: Storing the Data (~20%)
- Section 4: Preparing and Using Data for Analysis (~15%)
- Section 5: Maintaining and Automating Data Workloads (~18%)
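One way to use these weights is to split a study budget proportionally. The sketch below checks that the approximate weights sum to 100% and allocates a hypothetical 40-hour budget; the budget figure is an assumption, not a recommendation from the exam guide.

```python
# Approximate section weights from the list above, in percent.
weights = {
    "Designing Data Processing Systems": 22,
    "Ingesting and Processing the Data": 25,
    "Storing the Data": 20,
    "Preparing and Using Data for Analysis": 15,
    "Maintaining and Automating Data Workloads": 18,
}

total = sum(weights.values())

budget_hours = 40  # hypothetical total study time
hours = {name: budget_hours * pct / 100 for name, pct in weights.items()}

for name, h in hours.items():
    print(f"{name}: {h:.1f} h")
```

Treat the result as a baseline and shift hours toward whichever sections cover services you have not used in practice.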
BigQuery, Dataflow, Dataproc, Cloud Composer, and Pub/Sub are still the most heavily tested services. The new services are additions, not replacements.
The case studies (Flowlogistic and MJTelco) remain on the full exam. The renewal exam, however, eliminates case studies entirely.
How to prepare
If you’re taking the full exam for the first time, the new services add depth to Sections 3 and 4. Budget extra time for BigLake, Dataplex, and AlloyDB. Our 8-Week PDE Study Plan maps these to Weeks 4–5.
If you’re renewing, these new services are the entire point. The renewal exam shifts weight toward Sections 3 and 4, where the new services live. Our 2-Week Refresher focuses exclusively on the delta.
For a detailed service comparison of BigQuery, BigLake, and Dataplex, read our deep-dive.