Databricks Databricks-Certified-Associate-Developer-for-Apache-Spark-3.5 Question Answer

A data engineer writes the following code to join two DataFrames, df1 and df2:

df1 = spark.read.csv("sales_data.csv")  # ~10 GB
df2 = spark.read.csv("product_data.csv")  # ~8 MB
result = df1.join(df2, df1.product_id == df2.product_id)

Which join strategy will Spark use?

Shuffle join, because AQE is not enabled, and Spark uses a static query plan

Broadcast join, as df2 is smaller than the default broadcast threshold

Shuffle join, as the size difference between df1 and df2 is too large for a broadcast join to work efficiently

Shuffle join, because no broadcast hints were provided
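The size comparison that the broadcast option refers to can be sketched in plain Python. This is only an illustration, not Spark's own code: `can_auto_broadcast` is a hypothetical helper, and the byte sizes are rough figures taken from the comments in the question. The one Spark-specific fact assumed here is that `spark.sql.autoBroadcastJoinThreshold` defaults to 10 MB (10485760 bytes) and that setting it to -1 disables automatic broadcasting.

```python
# Default value of spark.sql.autoBroadcastJoinThreshold: 10 MB.
DEFAULT_BROADCAST_THRESHOLD_BYTES = 10 * 1024 * 1024  # 10485760

MB = 1024 * 1024
GB = 1024 * MB


def can_auto_broadcast(estimated_size_bytes,
                       threshold_bytes=DEFAULT_BROADCAST_THRESHOLD_BYTES):
    """Hypothetical helper: model the planner's size check for one join side.

    Returns True when the estimated size is at or under the threshold.
    A negative threshold (for example -1) means auto-broadcast is disabled.
    """
    if threshold_bytes < 0:
        return False
    return 0 <= estimated_size_bytes <= threshold_bytes


# Approximate sizes from the question (assumed, not measured).
estimated_sizes = {"df1": 10 * GB, "df2": 8 * MB}

for name, size in estimated_sizes.items():
    print(name, "eligible for auto-broadcast:", can_auto_broadcast(size))
```

In a real session, the plan Spark actually chose can be inspected with `result.explain()`. The physical plan names the join operator, so a broadcast shows up as `BroadcastHashJoin` and a shuffle-based join typically as `SortMergeJoin`. Spark's size estimate for a CSV source comes from its own statistics, so it may differ somewhat from the on-disk figures quoted in the question.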