Menu

Unstructured– CREATE DATABASE dbName; GO

This kind of data is typically media files like audio, video, or images. There is no available interface for developers to use to query the contents of media files. There are some advancements happening in the Azure Cognitive Services area, where some artificial intelligence (AI) algorithms are able the identify visual or sound patterns. Those patterns can be stored as metadata and used to compare with known patterns to determine the contents of the media file. This is a new area you could pursue if you find it interesting.

Data Types and Management

It is important to know in depth the different kinds of data you will work with and store for analysis. Is the piece of data you use in a WHERE clause an integer, GUID, or string? It makes a difference in not only the column definitions, if used, but also the size of the value and the way in which you make reference to them. Notice the different ways in which you reference those different types in the following snippet:

WHERE count = 1
WHERE name = “Benjamin Perkins”
WHERE id = ‘081EBD8A-6032-4B99-976B-D59215B2BB2A’

You can also see how different the lengths of each are. The sizes of the values are significant when you are working with huge amounts of data. A GUID takes up a lot more space than the integer. Storage space is required to store the data on disk and then in memory when loaded from disk for manipulation or analysis.

Data Types

Table 2.2 contains a list of the most common data types.

TABLE 2.2 Common data types

TypeExampleDescription
binary1–8,000Values between 1 and 8,000
bit1 or 0Either a 1 or 0; nulls can also be stored
blobgoodvibes.mp3An un/semi/structured file
booltrue or falseUsed to compare values
array{‘1’, ‘2’, ‘3’}A group of elements
charaA single number, letter, symbol, or space
datetime2022‐05‐02 08:02:00A date and time value; YYYY‐MM‐DD HH:MM:SS
decimal1.5~ ±1.0 × 10−28 to ± 7.9 × 1028 with 28–29 digits
double1.5~ ±5.0 × 10−324 to ± 1.7 × 1038 with 15–16 digits
float3.14~ ±1.5 × 10−45 to ± 3.4 × 1038 with 7 digits
imagebrainmap.pngAn image of some kind, such as PNG, BMP, GIF
int100A whole number, ±2,147,483,648
json{ “brain” : “ben” }Used to store JSON data
money€100Currency values
ncharNow is the time for…Fixed length up to 4,000 characters
nvarcharTo no one will we…Variable length up to 4,000 characters
nvarchar (max)In the beginning…Variable length up to maximum
stringcsharpguitarAn array of characters
uniqueidentifier081EBD8A‐6032‐…16‐byte GUID
varchar30‐65‐EC‐6F‐C4‐58Use for fixed‐length data, e.g., SSN or MAC
xml<brain>ben</brain>Store XML data

There are a few more than this, but the table represents the most common data types for a database. There are three important items you should note. First, you might notice that the value is the same for a decimal and a double. The difference between the two, however, is in the description. A decimal can contain ~29 digits to the right of the decimal, which makes it more precise than a double, which can contain ~16. Second, VARCHAR and NVARCHAR let you specify a specific length, such as VARCHAR(100) or NVARCHAR(125). You should do this if you know the maximum length of the data that can be added to that column. Specifying the length value will save space compared to using VARCHAR(MAX) or NVARCHAR(MAX) to save space on your database. Lastly, use VARCHAR unless there is a specific need for NVARCHAR that you need to store or work with foreign language special characters.

Leave a Reply

Your email address will not be published. Required fields are marked *