
DP-203 PDF

$42

$139.99

3 Months Free Update

  • Printable Format
  • Value for Money
  • 100% Pass Assurance
  • Verified Answers
  • Researched by Industry Experts
  • Based on Real Exam Scenarios
  • 100% Real Questions

DP-203 PDF + Testing Engine

$57

$189.99

3 Months Free Update

  • Exam Name: Data Engineering on Microsoft Azure
  • Last Update: Dec 8, 2024
  • Questions and Answers: 341
  • Free Real Questions Demo
  • Recommended by Industry Experts
  • Best Economical Package
  • Immediate Access

DP-203 Engine

$48

$159.99

3 Months Free Update

  • Best Testing Engine
  • One-Click Installation
  • Recommended by Teachers
  • Easy to Use
  • 3 Modes of Learning
  • State-of-the-Art Technology
  • 100% Real Questions Included

DP-203 Practice Exam Questions with Answers: Data Engineering on Microsoft Azure Certification

Question # 6

You are creating an Apache Spark job in Azure Databricks that will ingest JSON-formatted data.

You need to convert a nested JSON string into a DataFrame that will contain multiple rows.

Which Spark SQL function should you use?

A.

explode

B.

filter

C.

coalesce

D.

extract

Full Access
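
For background on the question above: in Spark SQL, explode takes a column that holds an array and emits one output row per element, which is how a single nested JSON record becomes multiple DataFrame rows. A minimal PySpark sketch; the sample record and field names are illustrative assumptions, not from the exam:

    from pyspark.sql import SparkSession
    from pyspark.sql.functions import col, explode

    spark = SparkSession.builder.appName("explode-demo").getOrCreate()

    # Hypothetical nested JSON: one record holding an array of order items.
    data = ['{"customer": "c1", "orders": [{"id": 1}, {"id": 2}]}']
    df = spark.read.json(spark.sparkContext.parallelize(data))

    # explode() emits one row per element of the "orders" array, turning
    # the single nested record into multiple flat rows.
    flat = df.select(col("customer"), explode(col("orders")).alias("order"))
    flat.show()
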
Question # 7

You have an Azure Data Lake Storage Gen2 account named account1 that stores logs as shown in the following table.

DP-203 question answer

You do not expect that the logs will be accessed during the retention periods.

You need to recommend a solution for account1 that meets the following requirements:

    Automatically deletes the logs at the end of each retention period

    Minimizes storage costs

What should you include in the recommendation? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

DP-203 question answer

Full Access
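
For context on the question above: automatically deleting blobs at the end of a retention period is the job of an Azure Storage lifecycle management policy. A hedged sketch of such a policy as a Python dict mirroring the policy JSON; the rule name, prefix, and 360-day period are assumptions, since the exhibit with the actual retention periods is not shown:

    # Hypothetical lifecycle management policy for account1. The delete
    # action removes blobs once they outlive their retention period, so
    # expired data stops accruing storage cost.
    policy = {
        "rules": [
            {
                "name": "expire-logs",
                "enabled": True,
                "type": "Lifecycle",
                "definition": {
                    "filters": {
                        "blobTypes": ["blockBlob"],
                        "prefixMatch": ["logs/"],  # illustrative prefix
                    },
                    "actions": {
                        "baseBlob": {
                            # e.g. delete ~1 year after last modification
                            "delete": {"daysAfterModificationGreaterThan": 360}
                        }
                    },
                },
            }
        ]
    }
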
Question # 8

You have an Azure Data Factory pipeline that performs an incremental load of source data to an Azure Data Lake Storage Gen2 account.

Data to be loaded is identified by a column named LastUpdatedDate in the source table.

You plan to execute the pipeline every four hours.

You need to ensure that the pipeline execution meets the following requirements:

    Automatically retries the execution when the pipeline run fails due to concurrency or throttling limits.

    Supports backfilling existing data in the table.

Which type of trigger should you use?

A.

Storage event

B.

on-demand

C.

schedule

D.

tumbling window

Full Access
Question # 9

You are designing a data mart for the human resources (HR) department at your company. The data mart will contain employee information and employee transactions. From a source system, you have a flat extract that has the following fields:

• EmployeeID

• FirstName

• LastName

• Recipient

• GrossAmount

• TransactionID

• GovernmentID

• NetAmountPaid

• TransactionDate

You need to design a star schema data model in an Azure Synapse Analytics dedicated SQL pool for the data mart.

Which two tables should you create? Each correct answer presents part of the solution.

A.

a dimension table for Employee

B.

a fact table for Employee

C.

a dimension table for EmployeeTransaction

D.

a dimension table for Transaction

E.

a fact table for Transaction

Full Access
Question # 10

You have an Azure Synapse Analytics Apache Spark pool named Pool1.

You plan to load JSON files from an Azure Data Lake Storage Gen2 container into the tables in Pool1. The structure and data types vary by file.

You need to load the files into the tables. The solution must maintain the source data types.

What should you do?

A.

Use a Get Metadata activity in Azure Data Factory.

B.

Use a Conditional Split transformation in an Azure Synapse data flow.

C.

Load the data by using the OPENROWSET Transact-SQL command in an Azure Synapse Analytics serverless SQL pool.

D.

Load the data by using PySpark.

Full Access
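
As background for the question above: a PySpark read can infer each JSON file's schema at load time and persist it when saving to a table, which keeps the source data types even when structure varies by file. A minimal sketch; the storage URL and table name are assumptions:

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()

    # Hypothetical ADLS Gen2 path. Schema is inferred per read, so files
    # with differing structures and data types need no fixed mapping.
    path = "abfss://data@contosolake.dfs.core.windows.net/json/"
    df = spark.read.option("multiLine", True).json(path)

    # Persist to a table in Pool1 with the inferred schema intact.
    df.write.mode("overwrite").saveAsTable("raw_json")
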
Question # 11

You are monitoring an Azure Stream Analytics job.

The Backlogged Input Events count has been 20 for the last hour.

You need to reduce the Backlogged Input Events count.

What should you do?

A.

Drop late arriving events from the job.

B.

Add an Azure Storage account to the job.

C.

Increase the streaming units for the job.

D.

Stop the job.

Full Access
Question # 12

You have an Azure Synapse Analytics serverless SQL pool.

You have an Azure Data Lake Storage account named adls1 that contains a public container named container1. The container1 container contains a folder named folder1.

You need to query the top 100 rows of all the CSV files in folder1.

How should you complete the query? To answer, drag the appropriate values to the correct targets. Each value may be used once, more than once, or not at all. You may need to drag the split bar between panes or scroll to view content.

NOTE: Each correct selection is worth one point.

DP-203 question answer

Full Access
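
For reference on the question above: ad hoc CSV queries in a serverless SQL pool are typically written with OPENROWSET and a wildcard path. Below is a sketch of such a query submitted from Python via pyodbc; the connection string, storage URL, and parser options are illustrative assumptions:

    import pyodbc

    # Hypothetical connection to the serverless SQL pool endpoint.
    conn = pyodbc.connect(
        "DRIVER={ODBC Driver 18 for SQL Server};"
        "SERVER=myworkspace-ondemand.sql.azuresynapse.net;"
        "DATABASE=master;UID=user;PWD=password"
    )

    # Top 100 rows across every CSV under folder1, via a wildcard path.
    sql = """
    SELECT TOP 100 *
    FROM OPENROWSET(
        BULK 'https://adls1.dfs.core.windows.net/container1/folder1/*.csv',
        FORMAT = 'CSV',
        PARSER_VERSION = '2.0',
        HEADER_ROW = TRUE
    ) AS rows
    """
    for row in conn.cursor().execute(sql):
        print(row)
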
Question # 13

You are performing exploratory analysis of the bus fare data in an Azure Data Lake Storage Gen2 account by using an Azure Synapse Analytics serverless SQL pool.

You execute the Transact-SQL query shown in the following exhibit.

DP-203 question answer

What do the query results include?

A.

Only CSV files in the tripdata_2020 subfolder.

B.

All files that have file names that begin with "tripdata_2020".

C.

All CSV files that have file names that contain "tripdata_2020".

D.

Only CSV files that have file names that begin with "tripdata_2020".

Full Access
Question # 14

You have an Azure Synapse Analytics dedicated SQL pool named SQL1 that contains a hash-distributed fact table named Table1.

You need to recreate Table1 and add a new distribution column. The solution must maximize the availability of data.

Which four actions should you perform in sequence? To answer, move the appropriate actions from the list of actions to the answer area and arrange them in the correct order.

DP-203 question answer

Full Access
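
As background for the question above: the usual way to change a distribution column in a dedicated SQL pool is CREATE TABLE AS SELECT into a new table, then a name swap, which keeps the original table available until the cutover. A hedged sketch of that sequence, held as T-SQL strings; the new distribution column and table names are assumptions:

    # T-SQL sequence for submission to the dedicated SQL pool (e.g. with
    # pyodbc, as in the earlier sketch). Names are illustrative.
    steps = [
        # 1. Rebuild with the new distribution column via CTAS; the
        #    original Table1 stays queryable while this runs.
        """CREATE TABLE dbo.Table1_new
           WITH (DISTRIBUTION = HASH(NewDistCol))
           AS SELECT * FROM dbo.Table1;""",
        # 2. Swap names so readers cut over to the rebuilt table.
        "RENAME OBJECT dbo.Table1 TO Table1_old;",
        "RENAME OBJECT dbo.Table1_new TO Table1;",
        # 3. Drop the old copy once the swap is verified.
        "DROP TABLE dbo.Table1_old;",
    ]
    for statement in steps:
        print(statement)
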
Question # 15

You plan to create a real-time monitoring app that alerts users when a device travels more than 200 meters away from a designated location.

You need to design an Azure Stream Analytics job to process the data for the planned app. The solution must minimize the amount of code developed and the number of technologies used.

What should you include in the Stream Analytics job? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

DP-203 question answer

Full Access
Question # 16

You have an Azure data factory.

You need to examine the pipeline failures from the last 60 days.

What should you use?

A.

the Activity log blade for the Data Factory resource

B.

the Monitor & Manage app in Data Factory

C.

the Resource health blade for the Data Factory resource

D.

Azure Monitor

Full Access
Question # 17

Note: This question is part of a series of questions that present the same scenario. Each question in the series contains a unique solution that might meet the stated goals. Some question sets might have more than one correct solution, while others might not have a correct solution.

After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not appear in the review screen.

You are designing an Azure Stream Analytics solution that will analyze Twitter data.

You need to count the tweets in each 10-second window. The solution must ensure that each tweet is counted only once.

Solution: You use a hopping window that uses a hop size of 10 seconds and a window size of 10 seconds.

Does this meet the goal?

A.

Yes

B.

No

Full Access
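
A fact worth knowing for this series: a hopping window whose hop size equals its window size never overlaps itself, so it behaves like a tumbling window and each event lands in exactly one window. A tiny Python sketch of that bucketing, with made-up arrival times:

    from collections import Counter

    # Hypothetical tweet arrival times in seconds. With hop == window
    # size, window index = ts // 10 assigns each event to exactly one
    # 10-second bucket, so no tweet is counted twice.
    events = [1, 4, 9, 12, 15, 23, 29, 31]
    counts = Counter(ts // 10 for ts in events)

    for window, n in sorted(counts.items()):
        print(f"[{window * 10}s, {window * 10 + 10}s): {n} tweets")
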
Question # 18

In Azure Data Factory, you have a schedule trigger that is scheduled in Pacific Time.

Pacific Time observes daylight saving time.

The trigger has the following JSON file.

DP-203 question answer

Use the drop-down menus to select the answer choice that completes each statement based on the information presented.

NOTE: Each correct selection is worth one point.

DP-203 question answer

Full Access
Question # 19

You build an Azure Data Factory pipeline to move data from an Azure Data Lake Storage Gen2 container to a database in an Azure Synapse Analytics dedicated SQL pool.

Data in the container is stored in the following folder structure.

/in/{YYYY}/{MM}/{DD}/{HH}/{mm}

The earliest folder is /in/2021/01/01/00/00. The latest folder is /in/2021/01/15/01/45.

You need to configure a pipeline trigger to meet the following requirements:

    Existing data must be loaded.

    Data must be loaded every 30 minutes.

    Late-arriving data of up to two minutes must be included in the load for the time at which the data should have arrived.

How should you configure the pipeline trigger? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

DP-203 question answer

Full Access
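
For orientation on the question above: a Data Factory tumbling window trigger can cover all three requirements at once, since a startTime in the past backfills existing folders, the interval sets the cadence, and a delay holds each window open for late data. A sketch of such a definition as a Python dict mirroring the trigger JSON; the trigger and pipeline names are assumptions:

    # Hypothetical tumbling window trigger, in the dict/JSON shape that
    # Data Factory uses.
    trigger = {
        "name": "Trigger30Min",
        "properties": {
            "type": "TumblingWindowTrigger",
            "typeProperties": {
                "frequency": "Minute",
                "interval": 30,                       # run every 30 minutes
                "startTime": "2021-01-01T00:00:00Z",  # past start => backfill
                "delay": "00:02:00",                  # wait for late data
                "maxConcurrency": 1,
            },
            "pipeline": {
                "pipelineReference": {
                    "referenceName": "LoadFromAdls",
                    "type": "PipelineReference",
                }
            },
        },
    }
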
Question # 20

Which Azure Data Factory components should you recommend using together to import the daily inventory data from the SQL server to Azure Data Lake Storage? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

DP-203 question answer

Full Access
Question # 21

What should you do to improve high availability of the real-time data processing solution?

A.

Deploy identical Azure Stream Analytics jobs to paired regions in Azure.

B.

Deploy a High Concurrency Databricks cluster.

C.

Deploy an Azure Stream Analytics job and use an Azure Automation runbook to check the status of the job and to start the job if it stops.

D.

Set Data Lake Storage to use geo-redundant storage (GRS).

Full Access
Question # 22

What should you recommend using to secure sensitive customer contact information?

A.

data labels

B.

column-level security

C.

row-level security

D.

Transparent Data Encryption (TDE)

Full Access
Question # 23

What should you recommend to prevent users outside the Litware on-premises network from accessing the analytical data store?

A.

a server-level virtual network rule

B.

a database-level virtual network rule

C.

a database-level firewall IP rule

D.

a server-level firewall IP rule

Full Access
Question # 24

You need to implement the surrogate key for the retail store table. The solution must meet the sales transaction dataset requirements.

What should you create?

A.

a table that has an IDENTITY property

B.

a system-versioned temporal table

C.

a user-defined SEQUENCE object

D.

a table that has a FOREIGN KEY constraint

Full Access
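
For background on the question above: in a dedicated SQL pool, surrogate keys are generated by declaring an IDENTITY property on the table itself. A hedged T-SQL sketch, again held as a string for submission from Python; the table and column names are illustrative:

    # Illustrative dimension table with an IDENTITY surrogate key. In a
    # dedicated SQL pool, IDENTITY values are unique but not guaranteed
    # to be sequential across distributions.
    ddl = """
    CREATE TABLE dbo.DimRetailStore
    (
        StoreKey  INT IDENTITY(1, 1) NOT NULL,  -- surrogate key
        StoreID   INT NOT NULL,                 -- business key
        StoreName NVARCHAR(100) NOT NULL
    )
    WITH (DISTRIBUTION = REPLICATE, CLUSTERED COLUMNSTORE INDEX);
    """
    print(ddl)
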
Question # 25

You need to ensure that the Twitter feed data can be analyzed in the dedicated SQL pool. The solution must meet the customer sentiment analytics requirements.

Which three Transact-SQL DDL commands should you run in sequence? To answer, move the appropriate commands from the list of commands to the answer area and arrange them in the correct order.

NOTE: More than one order of answer choices is correct. You will receive credit for any of the correct orders you select.

DP-203 question answer

Full Access
Question # 26

You need to integrate the on-premises data sources and Azure Synapse Analytics. The solution must meet the data integration requirements.

Which type of integration runtime should you use?

A.

Azure-SSIS integration runtime

B.

self-hosted integration runtime

C.

Azure integration runtime

Full Access
Question # 27

You need to implement an Azure Synapse Analytics database object for storing the sales transactions data. The solution must meet the sales transaction dataset requirements.

What should you do? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

DP-203 question answer

Full Access
Question # 28

You need to design a data storage structure for the product sales transactions. The solution must meet the sales transaction dataset requirements.

What should you include in the solution? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

DP-203 question answer

Full Access
Question # 29

You need to design a data ingestion and storage solution for the Twitter feeds. The solution must meet the customer sentiment analytics requirements.

What should you include in the solution? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

DP-203 question answer

Full Access
Question # 30

You need to implement versioned changes to the integration pipelines. The solution must meet the data integration requirements.

In which order should you perform the actions? To answer, move all actions from the list of actions to the answer area and arrange them in the correct order.

DP-203 question answer

Full Access
Question # 31

You need to design a data retention solution for the Twitter feed data records. The solution must meet the customer sentiment analytics requirements.

Which Azure Storage functionality should you include in the solution?

A.

change feed

B.

soft delete

C.

time-based retention

D.

lifecycle management

Full Access
Question # 32

You need to design a data retention solution for the Twitter feed data records. The solution must meet the customer sentiment analytics requirements.

Which Azure Storage functionality should you include in the solution?

A.

time-based retention

B.

change feed

C.

soft delete

D.

lifecycle management

Full Access
Question # 33

You need to design an analytical storage solution for the transactional data. The solution must meet the sales transaction dataset requirements.

What should you include in the solution? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

DP-203 question answer

Full Access
Question # 34

You need to design the partitions for the product sales transactions. The solution must meet the sales transaction dataset requirements.

What should you include in the solution? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

DP-203 question answer

Full Access