
Databricks Certified Associate Developer for Apache Spark 3.5 (Databricks-Certified-Associate-Developer-for-Apache-Spark-3.5): Question and Answer

An engineer wants to join two DataFrames df1 and df2 on the respective employee_id and emp_id columns:

df1: employee_id INT, name STRING

df2: emp_id INT, department STRING

The engineer uses:

result = df1.join(df2, df1.employee_id == df2.emp_id, how='inner')

What is the behaviour of the code snippet?

The code fails to execute because the column names employee_id and emp_id do not match automatically

The code fails to execute because it must use on='employee_id' to specify the join column explicitly

The code fails to execute because PySpark does not support joining DataFrames with a different structure

The code works as expected because the join condition explicitly matches employee_id from df1 with emp_id from df2
