An upstream source writes Parquet data as hourly batches to directories named with the current date. A nightly batch job runs the following code to ingest all data from the previous day, as indicated by the date variable:

Assume that the fields customer_id and order_id serve as a composite key to uniquely identify each order.

If the upstream system is known to occasionally produce duplicate entries for a single order hours apart, which statement is correct?

A.

Each write to the orders table will only contain unique records, and only those records without duplicates in the target table will be written.

B.

Each write to the orders table will only contain unique records, but newly written records may have duplicates already present in the target table.

C.

Each write to the orders table will only contain unique records; if existing records with the same key are present in the target table, these records will be overwritten.

D.

Each write to the orders table will only contain unique records; if existing records with the same key are present in the target table, the operation will fail.

E.

Each write to the orders table will run deduplication over the union of new and existing records, ensuring no duplicate records are present.
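
The ingestion code the question refers to is not reproduced in this copy. The answer choices suggest a job that deduplicates each day's batch on the composite key and then appends the result to the orders table. The pure-Python sketch below models that pattern under this assumption; it does not use Spark. The helper names drop_duplicates and append_batch, the hour field, and the sample records are all hypothetical. Keeping the first record per key is a simplification of how a real engine chooses which duplicate to retain.

```python
from typing import Dict, List

Record = Dict[str, int]
KEY = ("customer_id", "order_id")  # composite key named in the question


def drop_duplicates(batch: List[Record]) -> List[Record]:
    """Keep one record per composite key within a single batch (hypothetical helper)."""
    seen = set()
    unique = []
    for rec in batch:
        k = tuple(rec[field] for field in KEY)
        if k not in seen:
            seen.add(k)
            unique.append(rec)
    return unique


def append_batch(target: List[Record], batch: List[Record]) -> None:
    """Append the deduplicated batch; rows already in the target are never inspected."""
    target.extend(drop_duplicates(batch))


orders: List[Record] = []  # stands in for the orders table (assumption)

# Day 1: order (1, 100) arrives twice, hours apart, inside the same daily batch.
day1 = [
    {"customer_id": 1, "order_id": 100, "hour": 9},
    {"customer_id": 1, "order_id": 100, "hour": 14},
    {"customer_id": 2, "order_id": 200, "hour": 23},
]
# Day 2: order (2, 200) is sent again shortly after midnight, so it lands in the next batch.
day2 = [
    {"customer_id": 2, "order_id": 200, "hour": 1},
    {"customer_id": 3, "order_id": 300, "hour": 5},
]

append_batch(orders, day1)
append_batch(orders, day2)

keys = [(r["customer_id"], r["order_id"]) for r in orders]
print(keys)  # [(1, 100), (2, 200), (2, 200), (3, 300)]
```

In this sketch the duplicate of order (1, 100) is removed because both copies fall in the same batch, while order (2, 200) appears twice in the target because its copies fall in different batches and the append step does not compare against existing rows.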
