A data engineer writes the following code to join two DataFrames, df1 and df2:

df1 = spark.read.csv("sales_data.csv")    # ~10 GB
df2 = spark.read.csv("product_data.csv")  # ~8 MB
result = df1.join(df2, df1.product_id == df2.product_id)

Which join strategy will Spark use?

A. Shuffle join, because AQE is not enabled and Spark uses a static query plan

B. Broadcast join, as df2 is smaller than the default broadcast threshold

C. Shuffle join, as the size difference between df1 and df2 is too large for a broadcast join to work efficiently

D. Shuffle join, because no broadcast hints were provided
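
The options turn on how the size of each DataFrame compares with Spark's automatic broadcast threshold, set by spark.sql.autoBroadcastJoinThreshold (10 MB by default; -1 disables automatic broadcasting). The sketch below is a simplified model of that size check only, not Spark's planner. The helper name fits_broadcast_threshold is hypothetical, and the byte counts are taken from the file-size comments in the question on the assumption that Spark's size estimate for each CSV is close to its size on disk.

```python
# Default value of spark.sql.autoBroadcastJoinThreshold: 10 MB, in bytes.
DEFAULT_BROADCAST_THRESHOLD = 10 * 1024 * 1024


def fits_broadcast_threshold(estimated_bytes, threshold=DEFAULT_BROADCAST_THRESHOLD):
    """Hypothetical helper: True if a join side is small enough to auto-broadcast.

    Simplified model of the size check only. A negative threshold
    (for example -1) means automatic broadcasting is disabled.
    """
    if threshold < 0:
        return False
    return estimated_bytes <= threshold


# Sizes assumed from the comments in the question.
df1_bytes = 10 * 1024**3  # sales_data.csv, ~10 GB
df2_bytes = 8 * 1024**2   # product_data.csv, ~8 MB

df1_fits = fits_broadcast_threshold(df1_bytes)
df2_fits = fits_broadcast_threshold(df2_bytes)
print("df1 fits threshold:", df1_fits)
print("df2 fits threshold:", df2_fits)
```

In a live session, result.explain() prints the physical plan, where a BroadcastHashJoin or SortMergeJoin node shows which strategy Spark actually selected, and spark.conf.get("spark.sql.autoBroadcastJoinThreshold") returns the threshold in effect.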
