DBCC DROPRESULTSETCACHE – CREATE DATABASE dbName; GO

To enable caching for a session on an Azure Synapse Analytics SQL pool, you would execute the following command. Caching is OFF by default.

SET RESULT_SET_CACHING ON

The first time a query is executed, the results are stored in cache. The next time the same query is run, instead of parsing through all the data […]
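The caching behavior described above can be pictured with a small plain-Python sketch. It is only an analogy, not how Azure Synapse Analytics implements result set caching: the `ResultSetCache` class, its `execute` callable, and the hit/miss counters are hypothetical names introduced for illustration.

```python
from typing import Callable, Dict, List, Tuple

Rows = List[Tuple]


class ResultSetCache:
    """Toy model of a result set cache keyed by query text (hypothetical)."""

    def __init__(self, execute: Callable[[str], Rows]) -> None:
        self._execute = execute          # stands in for the real query engine
        self._cache: Dict[str, Rows] = {}
        self.hits = 0
        self.misses = 0

    def run(self, query: str) -> Rows:
        key = query.strip()
        if key in self._cache:
            # Same query text seen before: return the stored rows
            # without calling the engine again.
            self.hits += 1
            return self._cache[key]
        # First execution: compute the rows and remember them.
        self.misses += 1
        rows = self._execute(key)
        self._cache[key] = rows
        return rows

    def drop(self) -> None:
        # Loosely analogous to clearing the cache with DBCC DROPRESULTSETCACHE.
        self._cache.clear()
```

In this sketch a second call to `run()` with the same query text is served from the dictionary, and `drop()` forces the next call to execute the query again.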

Querying Data – CREATE DATABASE dbName; GO

Data is not very useful without some way to look at it, search through it, and manipulate it—in other words, querying. You have seen many examples of managing and manipulating data from both structured and semi‐structured data sources. In this section, you’ll learn many ways to analyze the data in your data lake, data warehouse, […]

Spark Streaming – CREATE DATABASE dbName; GO

The previous chapter introduced you to both Azure Stream Analytics/Event Hubs and Apache Spark/Apache Kafka. Those products are what you use to implement a data streaming solution, as illustrated in Figure 2.20. Notice the various kinds of data producers that can feed into Kafka. Any device that has permission and that can send correctly formatted […]

CREATEGLOBALTEMPVIEW() – CREATE DATABASE dbName; GO

This method creates a temporary view, which has a lifetime of the Spark application. If a view with the same name already exists, then an exception is thrown.

df.createGlobalTempView('Brainwaves')
df2 = spark.sql('SELECT Session.POWReading.AF3[0].THETA FROM global_temp.Brainwaves')

Notice that the argument following FROM is the name of the view created in the previous line of code, qualified with global_temp, the system database in which Spark registers global temporary views.

CREATEORREPLACEGLOBALTEMPVIEW()

This […]

DataFrame – CREATE DATABASE dbName; GO

Up to this point you have seen examples that created a DataFrame, typically identified as df, from a spark.read.* method:

df = spark.read.csv('/tmp/output/brainjammer/reading.csv')

Instead of passing the data to load into a DataFrame as a path via the read.* method, you could load the data into an object, named data, for example:

data = 'abfss://<uid>@<accountName>.dfs.core.windows.net/reading.csv'

Once […]

GROUPBY() – CREATE DATABASE dbName; GO

This method provides the ability to run aggregation, which is the gathering, summary, and presentation of data in an easily consumable format. The groupBy() method provides several aggregate functions; here are the most common:

avg() Returns the average of grouped columns
count() Returns the number of rows in that identified group
max() Returns the largest value in […]
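To make the aggregation idea concrete, here is a minimal plain-Python sketch of what grouping followed by avg(), count(), and max() computes. It deliberately does not use Spark, so it runs anywhere; the sample rows, the electrode names used as grouping keys, and the `group_aggregate` helper are assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical sample data: (electrode, reading value) pairs.
readings = [
    ("AF3", 4.0),
    ("AF3", 6.0),
    ("T7", 1.5),
    ("T7", 2.5),
    ("T7", 5.0),
]


def group_aggregate(rows):
    """Group values by key, then compute avg, count, and max per group."""
    groups = defaultdict(list)
    for key, value in rows:
        groups[key].append(value)
    return {
        key: {
            "avg": sum(values) / len(values),
            "count": len(values),
            "max": max(values),
        }
        for key, values in groups.items()
    }


summary = group_aggregate(readings)
for electrode, stats in sorted(summary.items()):
    print(electrode, stats)
```

Each distinct key becomes one output row, and every aggregate is computed only over the values that share that key, which is the same shape of result a grouped DataFrame aggregation returns.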

TO_DATE() AND TO_TIMESTAMP() – CREATE DATABASE dbName; GO

There can be many challenges when working with dates and datetimes. In many scenarios a date is stored as a string. That means if you want to perform any calculation with it, the date value stored in the string needs to be converted to the date data type. Additionally, the date format is often specific […]
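The string-to-date problem described above can be sketched in plain Python with the standard `datetime` module. This is an analogy for the conversion step rather than the Spark functions themselves; the helper names `parse_date` and `parse_timestamp`, the sample strings, and the format patterns are all assumptions chosen for illustration.

```python
from datetime import date, datetime


def parse_date(text: str, fmt: str) -> date:
    """Convert a date stored as a string into a real date value."""
    return datetime.strptime(text, fmt).date()


def parse_timestamp(text: str, fmt: str) -> datetime:
    """Convert a datetime stored as a string into a real datetime value."""
    return datetime.strptime(text, fmt)


# A day-first string; without the explicit format it would be ambiguous.
d = parse_date("12-07-2021", "%d-%m-%Y")
ts = parse_timestamp("2021-07-12 09:15:30", "%Y-%m-%d %H:%M:%S")

# Once converted, date arithmetic works; on the raw strings it would not.
elapsed = date(2021, 7, 31) - d
print(d.isoformat(), ts.isoformat(), elapsed.days)
```

Passing the format explicitly is the important design choice here: the same string can mean different dates under day-first and month-first conventions, so the conversion should never guess.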

SDKs – CREATE DATABASE dbName; GO

Coding, especially with C#, is not a big part of the DP‐203 exam, but knowing about the available SDKs might come up. Table 2.7 provides an overview of the most relevant SDKs in the scope of the DP‐203 exam. A complete list of all Azure SDKs for .NET can be found at https://docs.microsoft.com/dotnet/azure/sdk/packages. TABLE 2.7 […]

Data Programming and Querying for Data Engineers – CREATE DATABASE dbName; GO

To perform the duties of an Azure data engineer, you will need to write some code. Perhaps you will not need to have a great understanding of encapsulation, asynchronous patterns, or parallel LINQ queries, but some coding skill is necessary. Up to this point you have been exposed primarily to SQL syntax and PySpark, which […]

Data Skew – CREATE DATABASE dbName; GO

When data is skewed, it means that one category is represented more often when compared to the other data categories in a given dataset. Take Figure 2.19, which represents a right/positive skew, no skew, and a left/negative skew for the BCI electrodes. You might notice that the graph in the middle, with no skew, is […]
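One common way to put a number on right, left, or no skew is the moment-based skewness coefficient (third central moment divided by the standard deviation cubed). The text does not name a specific measure, so treating this coefficient as the measure is an assumption, and the sample values below are made up for illustration.

```python
def skewness(values):
    """Moment-based skewness: positive = right skew, negative = left skew."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in values) / n  # third central moment
    return m3 / m2 ** 1.5


# Hypothetical samples: a long tail to the right, a symmetric set,
# and the mirror image of the first sample (long tail to the left).
right_skewed = [1, 1, 1, 2, 2, 3, 10]
symmetric = [1, 2, 3, 4, 5]
left_skewed = [-x for x in right_skewed]

for name, sample in [
    ("right", right_skewed),
    ("none", symmetric),
    ("left", left_skewed),
]:
    print(name, round(skewness(sample), 3))
```

A value near zero indicates a roughly symmetric distribution, while the sign of a larger value shows which side the long tail is on.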