Caching is OFF by default. To enable it for a session on an Azure Synapse Analytics SQL pool, execute the command SET RESULT_SET_CACHING ON. The first time a query is executed, the results are stored in cache. The next time the same query is run, instead of parsing through all the data […]
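The caching behavior described above can be illustrated with a small toy model. This is only a sketch: the `CachingSession` class, its `result_set_caching` flag, and the `reading` table are hypothetical names, and SQLite stands in for the SQL pool. It does not model how Azure Synapse Analytics actually stores or invalidates cached results.

```python
import sqlite3


class CachingSession:
    """Toy session that caches result sets keyed by the exact query text."""

    def __init__(self, conn):
        self.conn = conn
        self.result_set_caching = False  # OFF by default, mirroring the text
        self._cache = {}
        self.executions = 0  # number of times the database was really queried

    def query(self, sql):
        # Serve from the cache when caching is on and the same text was seen.
        if self.result_set_caching and sql in self._cache:
            return self._cache[sql]
        self.executions += 1
        rows = self.conn.execute(sql).fetchall()
        if self.result_set_caching:
            self._cache[sql] = rows
        return rows


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE reading (electrode_id INTEGER, value REAL)")
conn.executemany("INSERT INTO reading VALUES (?, ?)", [(1, 4.2), (2, 5.1)])

session = CachingSession(conn)
session.result_set_caching = True  # analogous to SET RESULT_SET_CACHING ON

first = session.query("SELECT COUNT(*) FROM reading")
second = session.query("SELECT COUNT(*) FROM reading")  # answered from cache
print(first, second, session.executions)
```

The second call returns the stored rows without touching the database, which is the effect the setting is meant to achieve for repeated, identical queries.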
CREATEGLOBALTEMPVIEW() – CREATE DATABASE dbName; GO
This method creates a global temporary view whose lifetime is tied to the Spark application. If a view with the same name already exists, then an exception is thrown. df.createGlobalTempView('Brainwaves'); df2 = spark.sql('SELECT Session.POWReading.AF3[0].THETA FROM global_temp.Brainwaves') Notice that the argument following FROM is the name of the view created in the previous line of code, qualified with global_temp, the database in which Spark registers global temporary views. CREATEORREPLACEGLOBALTEMPVIEW() This […]
Data Programming and Querying for Data Engineers
To perform the duties of an Azure data engineer, you will need to write some code. Perhaps you will not need to have a great understanding of encapsulation, asynchronous patterns, or parallel LINQ queries, but some coding skill is necessary. Up to this point you have been exposed primarily to SQL syntax and PySpark, which […]
Static Schema
The word static has numerous meanings, and the one that applies is dependent on the context in which it is used. In the database context, the meaning is that once a schema is defined and created, it will not change. You find static schemas in relational (aka structured) databases. If you recall from the previous […]
Temporary Table
A temporary table is one that is intended to be used only for a given session. For example, if you create a normal table, you expect the table to remain persisted on the database until you purposefully remove it. Each time you log in, you expect that the table is available and queryable. This isn’t […]
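The session-scoped nature of a temporary table can be sketched with Python's built-in `sqlite3` module. SQLite is used here purely as a stand-in, not as the engine discussed in the text, and the table names `readings` and `scratch` as well as the `table_visible` helper are hypothetical.

```python
import os
import sqlite3
import tempfile

# Two connections to the same database file play the role of two sessions.
db_path = os.path.join(tempfile.mkdtemp(), "demo.db")
session_a = sqlite3.connect(db_path)
session_b = sqlite3.connect(db_path)

# A normal table persists in the database; a TEMP table belongs to one session.
session_a.execute("CREATE TABLE readings (id INTEGER)")
session_a.execute("CREATE TEMP TABLE scratch (id INTEGER)")
session_a.commit()


def table_visible(conn, name):
    """Return True if the given connection can query the named table."""
    try:
        conn.execute(f"SELECT COUNT(*) FROM {name}")
        return True
    except sqlite3.OperationalError:
        return False


print(table_visible(session_a, "scratch"))   # creating session sees it
print(table_visible(session_b, "readings"))  # normal table is shared
print(table_visible(session_b, "scratch"))   # other session does not see it
```

The temporary table also disappears once `session_a` is closed, whereas `readings` stays in the file until it is dropped explicitly.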
Unsupported PolyBase Data Types
When you’re working with external tables, the following data types are not supported: Unsupported Table Features Here is a list of unsupported Azure Synapse Analytics dedicated SQL pool features: Schema A schema is an organization feature found in a database. Imagine a large relational database where you have over a thousand tables. You would hope […]
Pruning
If you already know what the term projection means, then you can use that as a basis for the meaning of pruning. You can also use the literal meaning of the word, which involves trimming branches of a tree or a bush. Also, many times there are some stems that simply come out of nowhere […]
HASH
This distribution model uses a function to make the distribution, as shown in Figure 2.10. For large table sizes, this distribution model delivers the highest query performance. Consider the following snippet, which can be added to the script that creates the READING table: DISTRIBUTION = HASH([ELECTRODE_ID]) This results in the data being deterministically distributed across […]
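The idea of deterministic, hash-based placement can be sketched in a few lines of Python. This is an illustration only: the hash function that a dedicated SQL pool applies is internal, so SHA-256 is a stand-in, and both `DISTRIBUTION_COUNT` (assumed to be 60 here) and the `distribution_for` helper are assumptions made for the example.

```python
import hashlib
from collections import Counter

DISTRIBUTION_COUNT = 60  # assumption for this sketch


def distribution_for(electrode_id: int) -> int:
    """Map a distribution-column value to a distribution number.

    SHA-256 is a stand-in for the engine's internal hash function.
    """
    digest = hashlib.sha256(str(electrode_id).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % DISTRIBUTION_COUNT


# Hypothetical READING rows with five distinct ELECTRODE_ID values.
rows = [{"ELECTRODE_ID": i % 5 + 1, "VALUE": i * 0.1} for i in range(20)]

# Count how many rows land on each distribution.
placement = Counter(distribution_for(r["ELECTRODE_ID"]) for r in rows)
print(placement)
```

Because the mapping depends only on the column value, every row with the same `ELECTRODE_ID` lands on the same distribution. This is also why a column with few distinct values can leave most distributions empty.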
Data Concepts
There are many concepts you must be aware of, comfortable with, and competent in to manage data efficiently. This section covers many data concepts that will not only help you pass the Data Engineering on Microsoft Azure exam, but also help you do the job in the real world. Keep in mind that when discussing relational structure […]
Index
In its most common use, an index is the place you look for a key term to find a page number for the detailed explanation of that term. You will find an index at the end of this book. If you look for the term Index, you will find this page referencing it. An index […]