Get Real DP-203 Questions Pass Microsoft Certification Exams Easily [Q73-Q96]

DP-203 Dumps are Available for Instant Access

Q73. You plan to develop a dataset named Purchases by using Azure Databricks. Purchases will contain the following columns:
* ProductID
* ItemPrice
* LineTotal
* Quantity
* StoreID
* Minute
* Month
* Hour
* Year
* Day
You need to store the data to support hourly incremental load pipelines that will vary for each StoreID. The solution must minimize storage costs.
How should you complete the code? To answer, select the appropriate options in the answer area.
NOTE: Each correct selection is worth one point.
Explanation
Box 1: partitionBy
We should overwrite at the partition level. Example:
df.write.partitionBy("y", "m", "d").mode(SaveMode.Append).parquet("/data/hive/warehouse/db_name.db/" + tableName)
Box 2: ("StoreID", "Year", "Month", "Day", "Hour")
Box 3: parquet("/Purchases")
Reference:
https://intellipaat.com/community/11744/how-to-partition-and-write-dataframe-in-spark-without-deleting-partitio

Q74. You are designing an application that will store petabytes of medical imaging data. When the data is first created, the data will be accessed frequently during the first week. After one month, the data must be accessible within 30 seconds, but files will be accessed infrequently. After one year, the data will be accessed infrequently but must be accessible within five minutes.
You need to select a storage strategy for the data. The solution must minimize costs.
Which storage tier should you use for each time frame? To answer, select the appropriate options in the answer area.
NOTE: Each correct selection is worth one point.
Reference:
https://docs.microsoft.com/en-us/azure/storage/blobs/storage-blob-storage-tiers

Q75. You develop a dataset named DBTBL1 by using Azure Databricks. DBTBL1 contains the following columns:
* SensorTypeID
* GeographyRegionID
* Year
* Month
* Day
* Hour
* Minute
* Temperature
* WindSpeed
* Other
You need to store the data to support daily incremental load pipelines that vary for each GeographyRegionID. The solution must minimize storage costs.
How should you complete the code? To answer, select the appropriate options in the answer area.
NOTE: Each correct selection is worth one point.
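
The answer-area code for Q75 is not reproduced in this export, but the intended write pattern mirrors Q73: partition the Parquet output so that a daily load for one GeographyRegionID only touches its own folders. The following is a minimal PySpark sketch of that pattern; the sample row, the output path /DBTBL1, and the append save mode are illustrative assumptions, not part of the original question.

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# One illustrative row only; the real dataset has the columns listed in Q75.
df = spark.createDataFrame(
    [(1, "EU-West", 2022, 8, 17, 10, 30, 21.5, 12.0, "n/a")],
    ["SensorTypeID", "GeographyRegionID", "Year", "Month", "Day",
     "Hour", "Minute", "Temperature", "WindSpeed", "Other"],
)

# Partitioning by GeographyRegionID, Year, Month, and Day means each daily,
# per-region incremental load lands in its own folder, and Parquet's columnar
# compression keeps storage costs down.
(df.write
   .partitionBy("GeographyRegionID", "Year", "Month", "Day")
   .mode("append")
   .parquet("/DBTBL1"))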

Q76. You have two fact tables named Flight and Weather. Queries targeting the tables will be based on the join between the following columns.
You need to recommend a solution that maximizes query performance.
What should you include in the recommendation?
In the tables use a hash distribution of ArrivalDateTime and ReportDateTime.
In the tables use a hash distribution of ArrivalAirportID and AirportID.
In each table, create an identity column.
In each table, create a column as a composite of the other two columns in the table.
Explanation
Hash distribution improves query performance on large fact tables.

Q77. You have a table named SalesFact in an enterprise data warehouse in Azure Synapse Analytics. SalesFact contains sales data from the past 36 months and has the following characteristics:
* Is partitioned by month
* Contains one billion rows
* Has clustered columnstore indexes
At the beginning of each month, you need to remove data from SalesFact that is older than 36 months as quickly as possible.
Which three actions should you perform in sequence in a stored procedure? To answer, move the appropriate actions from the list of actions to the answer area and arrange them in the correct order.
Reference:
https://docs.microsoft.com/en-us/azure/sql-data-warehouse/sql-data-warehouse-tables-partition

Q78. Note: This question is part of a series of questions that present the same scenario. Each question in the series contains a unique solution that might meet the stated goals. Some question sets might have more than one correct solution, while others might not have a correct solution.
After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not appear in the review screen.
You have an Azure Synapse Analytics dedicated SQL pool that contains a table named Table1.
You have files that are ingested and loaded into an Azure Data Lake Storage Gen2 container named container1.
You plan to insert data from the files in container1 into Table1 and transform the data. Each row of data in the files will produce one row in the serving layer of Table1.
You need to ensure that when the source data files are loaded to container1, the DateTime is stored as an additional column in Table1.
Solution: You use a dedicated SQL pool to create an external table that has an additional DateTime column.
Does this meet the goal?
Yes
No
Instead use the derived column transformation to generate new columns in your data flow or to modify existing fields.
Reference:
https://docs.microsoft.com/en-us/azure/data-factory/data-flow-derived-column
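
The documented answer to this series of questions is Azure Data Factory's derived column transformation, not Spark code. Purely as an illustration of the same idea, adding a load DateTime column while ingesting files from container1, here is a hedged PySpark sketch; the storage account placeholder, file pattern, and new column names are assumptions rather than part of the question.

from pyspark.sql import SparkSession
from pyspark.sql.functions import current_timestamp, input_file_name

spark = SparkSession.builder.getOrCreate()

# Hypothetical ADLS Gen2 path; replace <storageaccount> with a real account.
source_path = "abfss://container1@<storageaccount>.dfs.core.windows.net/*.csv"

# Read the ingested files and derive the extra columns, which is the same idea
# the ADF derived column transformation expresses inside a mapping data flow.
df = (spark.read.option("header", "true").csv(source_path)
          .withColumn("LoadDateTime", current_timestamp())
          .withColumn("SourceFile", input_file_name()))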

Q79. You have an Azure subscription that contains the following resources:
* An Azure Active Directory (Azure AD) tenant that contains a security group named Group1.
* An Azure Synapse Analytics SQL pool named Pool1.
You need to control the access of Group1 to specific columns and rows in a table in Pool1.
Which Transact-SQL commands should you use? To answer, select the appropriate options in the answer area.
NOTE: Each correct selection is worth one point.

Q80. A company purchases IoT devices to monitor manufacturing machinery. The company uses an IoT appliance to communicate with the IoT devices.
The company must be able to monitor the devices in real-time.
You need to design the solution.
What should you recommend?
Azure Stream Analytics cloud job using Azure PowerShell
Azure Analysis Services using Azure Portal
Azure Data Factory instance using Azure Portal
Azure Analysis Services using Azure PowerShell
Stream Analytics is a cost-effective event processing engine that helps uncover real-time insights from devices, sensors, infrastructure, applications and data quickly and easily.
Monitor and manage Stream Analytics resources with Azure PowerShell cmdlets and PowerShell scripting that execute basic Stream Analytics tasks.
Reference:
https://cloudblogs.microsoft.com/sqlserver/2014/10/29/microsoft-adds-iot-streaming-analytics-data-production-and-workflow-services-to-azure/

Q81. What should you do to improve high availability of the real-time data processing solution?
Deploy identical Azure Stream Analytics jobs to paired regions in Azure.
Deploy a High Concurrency Databricks cluster.
Deploy an Azure Stream Analytics job and use an Azure Automation runbook to check the status of the job and to start the job if it stops.
Set Data Lake Storage to use geo-redundant storage (GRS).
Explanation
Guarantee Stream Analytics job reliability during service updates.
Part of being a fully managed service is the capability to introduce new service functionality and improvements at a rapid pace. As a result, Stream Analytics can have a service update deploy on a weekly (or more frequent) basis. No matter how much testing is done, there is still a risk that an existing, running job may break due to the introduction of a bug. If you are running mission-critical jobs, these risks need to be avoided. You can reduce this risk by following Azure's paired region model.
Scenario: The application development team will create an Azure event hub to receive real-time sales data, including store number, date, time, product ID, customer loyalty number, price, and discount amount, from the point of sale (POS) system and output the data to data storage in Azure.
Reference:
https://docs.microsoft.com/en-us/azure/stream-analytics/stream-analytics-job-reliability

Q82. The following code segment is used to create an Azure Databricks cluster.
For each of the following statements, select Yes if the statement is true. Otherwise, select No.
NOTE: Each correct selection is worth one point.
Explanation
Box 1: Yes
A cluster mode of 'High Concurrency' is selected, unlike all the others, which are 'Standard'. This results in a worker type of Standard_DS13_v2.
Box 2: No
When you run a job on a new cluster, the job is treated as a data engineering (job) workload subject to the job workload pricing. When you run a job on an existing cluster, the job is treated as a data analytics (all-purpose) workload subject to all-purpose workload pricing.
Box 3: Yes
Delta Lake on Databricks allows you to configure Delta Lake based on your workload patterns.
Reference:
https://adatis.co.uk/databricks-cluster-sizing/
https://docs.microsoft.com/en-us/azure/databricks/jobs
https://docs.databricks.com/administration-guide/capacity-planning/cmbp.html
https://docs.databricks.com/delta/index.html
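
The cluster-creation code segment that Q82 refers to is an image and is not reproduced in this export. For orientation only, the sketch below shows what a cluster specification of the kind being discussed might look like when expressed as JSON for the Databricks Clusters API (POST /api/2.0/clusters/create); every field value here is an illustrative assumption, not the content of the exam's code segment.

import json

# Illustrative cluster spec only; values are assumptions, not the exam item.
cluster_spec = {
    "cluster_name": "dp203-demo",
    "spark_version": "7.3.x-scala2.12",
    "node_type_id": "Standard_DS13_v2",                # worker VM size
    "autoscale": {"min_workers": 2, "max_workers": 8},
    "autotermination_minutes": 90,                     # shut down when idle
    "spark_conf": {
        # This profile setting is what commonly marks a High Concurrency
        # cluster (as opposed to Standard) in the cluster JSON.
        "spark.databricks.cluster.profile": "serverless"
    },
}

# The same payload could be POSTed to /api/2.0/clusters/create or pasted into
# the cluster UI's JSON editor.
print(json.dumps(cluster_spec, indent=2))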

Q83. Note: This question is part of a series of questions that present the same scenario. Each question in the series contains a unique solution that might meet the stated goals. Some question sets might have more than one correct solution, while others might not have a correct solution.
After you answer a question in this scenario, you will NOT be able to return to it. As a result, these questions will not appear in the review screen.
You have an Azure Storage account that contains 100 GB of files. The files contain text and numerical values. 75% of the rows contain description data that has an average length of 1.1 MB.
You plan to copy the data from the storage account to an Azure SQL data warehouse.
You need to prepare the files to ensure that the data copies quickly.
Solution: You modify the files to ensure that each row is more than 1 MB.
Does this meet the goal?
Yes
No
Explanation
Instead modify the files to ensure that each row is less than 1 MB.
References:
https://docs.microsoft.com/en-us/azure/sql-data-warehouse/guidance-for-loading-data

Q84. What should you recommend using to secure sensitive customer contact information?
data labels
column-level security
row-level security
Transparent Data Encryption (TDE)
Explanation
Scenario: All cloud data must be encrypted at rest and in transit.
Always Encrypted is a feature designed to protect sensitive data stored in specific database columns from access (for example, credit card numbers, national identification numbers, or data on a need to know basis). This includes database administrators or other privileged users who are authorized to access the database to perform management tasks, but have no business need to access the particular data in the encrypted columns. The data is always encrypted, which means the encrypted data is decrypted only for processing by client applications with access to the encryption key.
Reference:
https://docs.microsoft.com/en-us/azure/sql-database/sql-database-security-overview

Q85. You have an Azure subscription that contains a logical Microsoft SQL server named Server1. Server1 hosts an Azure Synapse Analytics SQL dedicated pool named Pool1.
You need to recommend a Transparent Data Encryption (TDE) solution for Server1. The solution must meet the following requirements:
* Track the usage of encryption keys.
* Maintain the access of client apps to Pool1 in the event of an Azure datacenter outage that affects the availability of the encryption keys.
What should you include in the recommendation? To answer, select the appropriate options in the answer area.
NOTE: Each correct selection is worth one point.
Explanation
Box 1: TDE with customer-managed keys
Customer-managed keys are stored in the Azure Key Vault. You can monitor how and when your key vaults are accessed, and by whom. You can do this by enabling logging for Azure Key Vault, which saves information in an Azure storage account that you provide.
Box 2: Create and configure Azure key vaults in two Azure regions
The contents of your key vault are replicated within the region and to a secondary region at least 150 miles away, but within the same geography, to maintain high durability of your keys and secrets.
Reference:
https://docs.microsoft.com/en-us/azure/synapse-analytics/security/workspaces-encryption
https://docs.microsoft.com/en-us/azure/key-vault/general/logging

Q86. Note: This question is part of a series of questions that present the same scenario. Each question in the series contains a unique solution that might meet the stated goals. Some question sets might have more than one correct solution, while others might not have a correct solution.
After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not appear in the review screen.
You have an Azure Synapse Analytics dedicated SQL pool that contains a table named Table1.
You have files that are ingested and loaded into an Azure Data Lake Storage Gen2 container named container1.
You plan to insert data from the files in container1 into Table1 and transform the data. Each row of data in the files will produce one row in the serving layer of Table1.
You need to ensure that when the source data files are loaded to container1, the DateTime is stored as an additional column in Table1.
Solution: You use an Azure Synapse Analytics serverless SQL pool to create an external table that has an additional DateTime column.
Does this meet the goal?
Yes
No
Explanation
Instead use the derived column transformation to generate new columns in your data flow or to modify existing fields.
Reference:
https://docs.microsoft.com/en-us/azure/data-factory/data-flow-derived-column

Q87. From a website analytics system, you receive data extracts about user interactions such as downloads, link clicks, form submissions, and video plays.
The data contains the following columns.
You need to design a star schema to support analytical queries of the data. The star schema will contain four tables including a date dimension.
To which table should you add each column? To answer, select the appropriate options in the answer area.
NOTE: Each correct selection is worth one point.
Explanation
Box 1: DimEvent
Box 2: DimChannel
Box 3: FactEvents
Fact tables store observations or events, and can be sales orders, stock balances, exchange rates, temperatures, etc.
Reference:
https://docs.microsoft.com/en-us/power-bi/guidance/star-schema

Q88. You have a table in an Azure Synapse Analytics dedicated SQL pool. The table was created by using the following Transact-SQL statement.
You need to alter the table to meet the following requirements:
* Ensure that users can identify the current manager of employees.
* Support creating an employee reporting hierarchy for your entire company.
* Provide fast lookup of the managers' attributes such as name and job title.
Which column should you add to the table?
[ManagerEmployeeID] [int] NULL
[ManagerEmployeeID] [smallint] NULL
[ManagerEmployeeKey] [int] NULL
[ManagerName] [varchar](200) NULL
Explanation
Use the same definition as the EmployeeID column.
Reference:
https://docs.microsoft.com/en-us/analysis-services/tabular-models/hierarchies-ssas-tabular

Q89. You have an enterprise data warehouse in Azure Synapse Analytics that contains a table named FactOnlineSales. The table contains data from the start of 2009 to the end of 2012.
You need to improve the performance of queries against FactOnlineSales by using table partitions. The solution must meet the following requirements:
* Create four partitions based on the order date.
* Ensure that each partition contains all the orders placed during a given calendar year.
How should you complete the T-SQL command? To answer, select the appropriate options in the answer area.
NOTE: Each correct selection is worth one point.
Reference:
https://docs.microsoft.com/en-us/sql/t-sql/statements/create-partition-function-transact-sql?view=sql-server-ver15

Q90. Which Azure Data Factory components should you recommend using together to import the daily inventory data from the SQL server to Azure Data Lake Storage? To answer, select the appropriate options in the answer area.
NOTE: Each correct selection is worth one point.
Explanation
Box 1: Self-hosted integration runtime
A self-hosted IR is capable of running copy activity between a cloud data store and a data store in a private network.
Box 2: Schedule trigger
Schedule every 8 hours.
Box 3: Copy activity
Scenario:
* Customer data, including name, contact information, and loyalty number, comes from Salesforce and can be imported into Azure once every eight hours. Row modified dates are not trusted in the source table.
* Product data, including product ID, name, and category, comes from Salesforce and can be imported into Azure once every eight hours. Row modified dates are not trusted in the source table.

Q91. You have an Azure Data Lake Storage Gen2 account that contains a JSON file for customers. The file contains two attributes named FirstName and LastName.
You need to copy the data from the JSON file to an Azure Synapse Analytics table by using Azure Databricks.
A new column must be created that concatenates the FirstName and LastName values.
You create the following components:
* A destination table in Azure Synapse
* An Azure Blob storage container
* A service principal
Which five actions should you perform in sequence next in the Databricks notebook? To answer, move the appropriate actions from the list of actions to the answer area and arrange them in the correct order.
Explanation
1) Mount the storage onto DBFS.
2) Read the file into a data frame.
3) Transform the data frame.
4) Specify a temporary folder.
5) Write the results to a table in Azure Synapse.
Reference:
https://docs.databricks.com/data/data-sources/azure/azure-datalake-gen2.html
https://docs.microsoft.com/en-us/azure/databricks/scenarios/databricks-extract-load-sql-data-warehouse
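
To make the five-step sequence concrete, here is a hedged PySpark sketch of how those steps might look in a Databricks notebook. Every account name, path, credential, and table name is a placeholder assumption; dbutils and spark are the notebook's built-in objects, and the write uses the Azure Synapse connector (format com.databricks.spark.sqldw) described in the linked documentation.

from pyspark.sql.functions import concat_ws

# 1) Mount the ADLS Gen2 file system onto DBFS using the service principal.
#    All <...> values below are placeholders, not values from the question.
configs = {
    "fs.azure.account.auth.type": "OAuth",
    "fs.azure.account.oauth.provider.type":
        "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
    "fs.azure.account.oauth2.client.id": "<application-id>",
    "fs.azure.account.oauth2.client.secret": "<client-secret>",
    "fs.azure.account.oauth2.client.endpoint":
        "https://login.microsoftonline.com/<tenant-id>/oauth2/token",
}
dbutils.fs.mount(
    source="abfss://<filesystem>@<storageaccount>.dfs.core.windows.net/",
    mount_point="/mnt/customers",
    extra_configs=configs,
)

# 2) Read the JSON file into a DataFrame.
df = spark.read.json("/mnt/customers/customers.json")

# 3) Transform the DataFrame: add the concatenated full-name column.
df = df.withColumn("FullName", concat_ws(" ", "FirstName", "LastName"))

# 4) and 5) Specify a temporary staging folder in Blob storage and write the
# results to the destination table in Azure Synapse.
(df.write
   .format("com.databricks.spark.sqldw")
   .option("url", "jdbc:sqlserver://<server>.database.windows.net:1433;"
                  "database=<db>;user=<user>;password=<password>")
   .option("forwardSparkAzureStorageCredentials", "true")
   .option("dbTable", "dbo.Customers")
   .option("tempDir", "wasbs://<container>@<blobaccount>.blob.core.windows.net/tmp")
   .mode("append")
   .save())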

Q92. You have files and folders in Azure Data Lake Storage Gen2 for an Azure Synapse workspace as shown in the following exhibit.
You create an external table named ExtTable that has LOCATION='/topfolder/'.
When you query ExtTable by using an Azure Synapse Analytics serverless SQL pool, which files are returned?
File2.csv and File3.csv only
File1.csv and File4.csv only
File1.csv, File2.csv, File3.csv, and File4.csv
File1.csv only
To run a T-SQL query over a set of files within a folder or set of folders while treating them as a single entity or rowset, provide a path to a folder or a pattern (using wildcards) over a set of files or folders.
Reference:
https://docs.microsoft.com/en-us/azure/synapse-analytics/sql/query-data-storage#query-multiple-files-or-folders

Q93. You use PySpark in Azure Databricks to parse the following JSON input.
You need to output the data in the following tabular format.
How should you complete the PySpark code? To answer, drag the appropriate values to the correct targets. Each value may be used once, more than once, or not at all. You may need to drag the split bar between panes or scroll to view content.
NOTE: Each correct selection is worth one point.
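
The JSON input and the target table for Q93 are shown only as images in the original item, so they are not reproduced here. Flattening nested JSON into a tabular result in PySpark commonly relies on explode plus alias; the sketch below illustrates that pattern on an invented JSON shape, so the field names (dept, employees, name) are assumptions rather than the question's actual data.

from pyspark.sql import SparkSession
from pyspark.sql.functions import col, explode

spark = SparkSession.builder.getOrCreate()

# Invented single-document JSON input for illustration only.
json_input = '{"dept": "Sales", "employees": [{"name": "Ana"}, {"name": "Ben"}]}'
df = spark.read.json(spark.sparkContext.parallelize([json_input]))

# explode() turns each element of the nested array into its own row, which is
# the usual route from nested JSON to a flat, tabular layout in PySpark.
flat = (df.select(col("dept"), explode(col("employees")).alias("employee"))
          .select("dept", col("employee.name").alias("employee_name")))

flat.show()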

Q94. You are building an Azure Stream Analytics job to retrieve game data.
You need to ensure that the job returns the highest scoring record for each five-minute time interval of each game.
How should you complete the Stream Analytics query? To answer, select the appropriate options in the answer area.
NOTE: Each correct selection is worth one point.
Explanation
Box 1: TopOne OVER(PARTITION BY Game ORDER BY Score Desc)
TopOne returns the top-rank record, where rank defines the ranking position of the event in the window according to the specified ordering. Ordering/ranking is based on event columns and can be specified in the ORDER BY clause.
Box 2: Hopping(minute,5)
Hopping window functions hop forward in time by a fixed period. It may be easy to think of them as Tumbling windows that can overlap and be emitted more often than the window size. Events can belong to more than one Hopping window result set. To make a Hopping window the same as a Tumbling window, specify the hop size to be the same as the window size.
Reference:
https://docs.microsoft.com/en-us/stream-analytics-query/topone-azure-stream-analytics
https://docs.microsoft.com/en-us/azure/stream-analytics/stream-analytics-window-functions

Q95. You have a SQL pool in Azure Synapse.
You plan to load data from Azure Blob storage to a staging table. Approximately 1 million rows of data will be loaded daily. The table will be truncated before each daily load.
You need to create the staging table. The solution must minimize how long it takes to load the data to the staging table.
How should you configure the table? To answer, select the appropriate options in the answer area.
NOTE: Each correct selection is worth one point.
Reference:
https://docs.microsoft.com/en-us/azure/synapse-analytics/sql-data-warehouse/sql-data-warehouse-tables-partition
https://docs.microsoft.com/en-us/azure/synapse-analytics/sql-data-warehouse/sql-data-warehouse-tables-distribute

Q96. You are designing an Azure Synapse Analytics dedicated SQL pool.
You need to ensure that you can audit access to Personally Identifiable Information (PII).
What should you include in the solution?
dynamic data masking
row-level security (RLS)
sensitivity classifications
column-level security

Get Instant Access REAL DP-203 DUMP Pass Your Exam Easily: https://www.actualtests4sure.com/DP-203-test-questions.html