A company wants to build a dimension table in an Amazon S3 bucket.

Amazon Web Services Data-Engineer-Associate Full Course Access

Amazon Web Services Data-Engineer-Associate View All Questions

Amazon Web Services Data-Engineer-Associate Question Answer

A company wants to build a dimension table in an Amazon S3 bucket. The bucket contains historical data that includes 10 million records. The historical data is 1 TB in size.

A data engineer needs a solution to update changes for up to 10,000 records in the base table every day.

Which solution will meet this requirement with the LOWEST runtime?

Develop an Apache Spark job in Amazon EMR to read the historical data and the new changes into two Spark DataFrames. Use the Spark update method to update the base table.

Develop an AWS Glue Python job to read the historical data and new changes into two Pandas DataFrames. Use the Pandas update method to update the base table.

Develop an AWS Glue Apache Spark job to read the historical data and new changes into two Spark DataFrames. Use the Spark update method to update the base table.

Develop an Amazon EMR job to read new changes into Apache Spark DataFrames. Use the Apache Hudi framework to create the base table in Amazon S3. Use the Spark update method to update the base table.

Data-Engineer-Associate PDF/Engine

Printable Format
Value of Money
100% Pass Assurance
Verified Answers
Researched by Industry Experts
Based on Real Exams Scenarios
100% Real Questions

Get 65% Discount on All Products, Use Coupon: "ac4s65"

A company has a frontend ReactJS website that uses Amazon API Gateway to invoke REST...

A data engineer needs to schedule a workflow that runs a set of AWS Glue...

Summer Sale Special Limited Time 65% Discount Offer - Ends in 0d 00h 00m 00s - Coupon code: ac4s65

A company wants to build a dimension table in an Amazon S3 bucket.

The Answer Is:

Explanation:

Quick Links