Databricks Databricks-Certified-Professional-Data-Engineer Question Answer
A data engineer is creating a data ingestion pipeline to understand where customers take their rented bicycles during use. The engineer noticed that, over time, the data transmitted from the bicycle sensors fails to include key details such as latitude and longitude. Downstream analysts need both the clean records and the quarantined records available for separate processing.
The data engineer already has this code:
import dlt
from pyspark.sql.functions import expr

rules = {
    "valid_lat": "(lat IS NOT NULL)",
    "valid_long": "(long IS NOT NULL)"
}

quarantine_rules = "NOT({})".format(" AND ".join(rules.values()))

@dlt.view
def raw_trips_data():
    return spark.readStream.table("ride_and_go.telemetry.trips")
How should the data engineer meet the requirements so that both the good and the bad data are captured?
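One approach that satisfies both requirements is the quarantine pattern from the Databricks Delta Live Tables documentation: materialize a single table that evaluates quarantine_rules into an is_quarantined flag (this is why expr is imported), then expose the clean and quarantined records through separate views. The sketch below assumes the names trips_data_quarantine, valid_trips_data, and invalid_trips_data; only rules, quarantine_rules, and raw_trips_data come from the code above.

@dlt.table
@dlt.expect_all(rules)  # records rule violations in pipeline metrics without dropping any rows
def trips_data_quarantine():
    # Flag each record: is_quarantined is true when any rule fails
    return (
        dlt.read_stream("raw_trips_data")
            .withColumn("is_quarantined", expr(quarantine_rules))
    )

@dlt.view
def valid_trips_data():
    # Clean records for downstream analysts
    return dlt.read("trips_data_quarantine").filter("is_quarantined = false")

@dlt.view
def invalid_trips_data():
    # Quarantined records kept available for separate processing
    return dlt.read("trips_data_quarantine").filter("is_quarantined = true")

Because @dlt.expect_all only tracks violations and never drops rows, every sensor reading lands in trips_data_quarantine exactly once. An equivalent alternative is to write two tables from raw_trips_data, one decorated with @dlt.expect_all_or_drop(rules) for the clean records and one with @dlt.expect_all_or_drop({"quarantined": quarantine_rules}) for the bad ones, at the cost of reading the source twice.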

