A Spark developer wants to improve the performance of an existing PySpark UDF that runs a hash function that is not available in the standard Spark functions library. The existing UDF code is:

import hashlib

import pyspark.sql.functions as sf
from pyspark.sql.types import StringType

def shake_256(raw):
    return hashlib.shake_256(raw.encode()).hexdigest(20)

shake_256_udf = sf.udf(shake_256, StringType())

The developer wants to replace this existing UDF with a Pandas UDF to improve performance. The developer changes the definition of shake_256_udf to this:

shake_256_udf = sf.pandas_udf(shake_256, StringType())

However, the developer then receives an error.

What should the signature of the shake_256() function be changed to in order to fix this error?

A. def shake_256(df: pd.Series) -> str:

B. def shake_256(df: Iterator[pd.Series]) -> Iterator[pd.Series]:

C. def shake_256(raw: str) -> str:

D. def shake_256(df: pd.Series) -> pd.Series:
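
For reference, here is a minimal sketch of what a Series-to-Series function (the shape listed in option D) might look like for this hash. It is an illustration under stated assumptions, not the exam's official solution: the parameter keeps the name raw from the original function, the input is assumed to contain no null values, and the sample data is made up. The sketch uses only pandas and hashlib so it can be checked without a Spark session.

```python
import hashlib

import pandas as pd


def shake_256(raw: pd.Series) -> pd.Series:
    # A Series-to-Series Pandas UDF receives a whole batch of values at once,
    # so the hash is applied to each element of the batch.
    # Assumption: every element is a non-null string.
    return raw.map(lambda s: hashlib.shake_256(s.encode()).hexdigest(20))


# Local check with plain pandas (hypothetical sample values).
sample = pd.Series(["spark", "pandas"])
hashed = shake_256(sample)
```

A function with this signature can then be passed to sf.pandas_udf(shake_256, StringType()) exactly as in the question. Note that the body also has to change, because raw is now a pandas Series and has no .encode() method of its own.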
