A company wants to use Apache Spark jobs that run on an Amazon EMR cluster...

A company wants to use Apache Spark jobs that run on an Amazon EMR cluster to process streaming data. The Spark jobs will transform and store the data in an Amazon S3 bucket. The company will use Amazon Athena to perform analysis.

The company needs to optimize the data format for analytical queries.

Which solutions will meet these requirements with the SHORTEST query times? (Select TWO.)

Use Avro format. Use AWS Glue Data Catalog to track schema changes.

Use ORC format. Use AWS Glue Data Catalog to track schema changes.

Use Apache Parquet format. Use an external Amazon DynamoDB table to track schema changes.

Use Apache Parquet format. Use AWS Glue Data Catalog to track schema changes.

Use ORC format. Store schema definitions in separate files in Amazon S3.
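As background for comparing the format options above, the following is a minimal sketch of why storage layout affects how much data an analytical query has to scan. It is a toy model in plain Python, not Spark, Athena, or any real file format: the record fields, the helper names (`row_layout`, `column_layout`), and the use of JSON as the on-disk encoding are all assumptions made for illustration.

```python
import json

# Hypothetical sample records; the field names are illustrative only.
records = [
    {"event_id": i, "user": f"user-{i % 7}", "amount": i * 1.5, "region": "eu-west-1"}
    for i in range(1000)
]


def row_layout(rows):
    """Row-oriented layout: each record is serialized whole, one after another."""
    return [json.dumps(r).encode() for r in rows]


def column_layout(rows):
    """Column-oriented layout: all values of one field are stored together."""
    return {name: json.dumps([r[name] for r in rows]).encode() for name in rows[0]}


row_blob = row_layout(records)
col_blob = column_layout(records)

# Query: SUM(amount). In the row layout every record must be read in full
# to extract one field; in the column layout only that column is read.
row_bytes = sum(len(b) for b in row_blob)
col_bytes = len(col_blob["amount"])

total_from_rows = sum(json.loads(b)["amount"] for b in row_blob)
total_from_cols = sum(json.loads(col_blob["amount"]))

print(f"row layout scanned:    {row_bytes} bytes")
print(f"column layout scanned: {col_bytes} bytes")
```

Both layouts return the same total, but the single-column aggregate touches far fewer bytes in the column-oriented layout. Real columnar formats add compression and per-column statistics on top of this, which the sketch does not model.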

The Answer Is:

Explanation:
