Practice Databricks Databricks-Certified-Professional-Data-Engineer Exam Questions
Page: 1/24 Total 120 Questions
Question No 1
An upstream system has been configured to pass the date for a given batch of data to the Databricks
Jobs API as a parameter. The notebook to be scheduled will use this parameter to load data with the
following code:
df = spark.read.format("parquet").load(f"/mnt/source/{date}")
Which code block should be used to create the date Python variable used in the above code block?
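For context, notebook parameters passed through the Jobs API are usually read with the `dbutils.widgets` utilities. The sketch below is only an illustration under stated assumptions: the job parameter is assumed to be named `date`, the helper names `read_date_parameter` and `source_path` are hypothetical, and `FakeDbutils` is a local stand-in so the snippet runs outside a notebook. It is not presented as the answer key for this question.

```python
def read_date_parameter(dbutils, name="date", default=""):
    """Return the value of a notebook parameter (hypothetical helper)."""
    # Declaring the widget gives the notebook a default for interactive runs;
    # a job run that passes a parameter with the same name supplies the value.
    dbutils.widgets.text(name, default)
    return dbutils.widgets.get(name)


def source_path(date):
    """Build the load path, assuming the placeholder in the question is {date}."""
    return f"/mnt/source/{date}"


class FakeWidgets:
    """Local stand-in for dbutils.widgets (assumption, for local runs only)."""

    def __init__(self, job_params=None):
        self._values = dict(job_params or {})

    def text(self, name, default_value, label=None):
        # Keep a value supplied by the job; otherwise fall back to the default.
        self._values.setdefault(name, default_value)

    def get(self, name):
        return self._values[name]


class FakeDbutils:
    """Local stand-in for the notebook-provided dbutils object."""

    def __init__(self, job_params=None):
        self.widgets = FakeWidgets(job_params)


if __name__ == "__main__":
    date = read_date_parameter(FakeDbutils({"date": "2024-01-31"}))
    print(source_path(date))
```

Inside a Databricks notebook the real `dbutils` object would be passed in place of `FakeDbutils`, for example `date = read_date_parameter(dbutils)`.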
Question No 2
The Databricks workspace administrator has configured interactive clusters for each of the data
engineering groups. To control costs, clusters are set to terminate after 30 minutes of inactivity. Each
user should be able to execute workloads against their assigned clusters at any time of the day.
Assuming users have been added to a workspace but not granted any permissions, which of the
following describes the minimal permissions a user would need to start and attach to an already
configured cluster?
Question No 3
When scheduling Structured Streaming jobs for production, which configuration automatically
recovers from query failures and keeps costs low?
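To make the configuration options concrete, the sketch below shows how such settings might look in a Jobs API 2.1 style payload. It is a hedged illustration, not the official answer: the job name, notebook path, and the angle-bracket cluster values are placeholders, and the combination shown (a job cluster, unlimited retries, a single concurrent run) is one commonly recommended setup for production streaming jobs.

```python
import json

# Illustrative job settings; names and paths below are hypothetical.
streaming_job_settings = {
    "name": "example-streaming-job",
    # Only one active run at a time, so a retry never overlaps a live query.
    "max_concurrent_runs": 1,
    "tasks": [
        {
            "task_key": "run_stream",
            "notebook_task": {
                "notebook_path": "/Repos/example/streaming_notebook"
            },
            # A job cluster is created for the run and released afterwards,
            # instead of keeping an all-purpose cluster running.
            "new_cluster": {
                "spark_version": "<runtime-version>",
                "node_type_id": "<node-type>",
                "num_workers": 2,
            },
            # -1 asks the scheduler to retry indefinitely after a failure.
            "max_retries": -1,
        }
    ],
}

if __name__ == "__main__":
    print(json.dumps(streaming_job_settings, indent=2))
```

The dictionary would be sent as the request body when creating the job; the exact cluster fields depend on the workspace and cloud provider.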
Question No 4
The data engineering team has configured a Databricks SQL query and alert to monitor the values in
a Delta Lake table. The recent_sensor_recordings table contains an identifying sensor_id alongside
the timestamp and temperature for the most recent 5 minutes of recordings.
The below query is used to create the alert
The query is set to refresh each minute and always completes in less than 10 seconds. The alert is set
to trigger when mean(temperature) > 120. Notifications are triggered to be sent at most every 1
minute.
If this alert raises notifications for 3 consecutive minutes and then stops, which statement must be
true?
Question No 5
A junior developer complains that the code in their notebook isn't producing the correct results in
the development environment. A shared screenshot reveals that while they're using a notebook
versioned with Databricks Repos, they're using a personal branch that contains old logic. The desired
branch named dev-2.3.9 is not available from the branch selection dropdown.
Which approach will allow this developer to review the current logic for this notebook?
Databricks-Certified-Professional-Data-Engineer Exam Features