What are null values in DBMS?

A NULL value is a special marker used in SQL to indicate that a data value does not exist in the database. In other words, it is just a placeholder to denote values that are missing or that we do not know. NULL can be confusing and cumbersome at first. But it is imperative for any analyst to know how to handle them. For this reason, I have decided to write some tips and tricks you should know about NULL values.

What are null values in DBMS?

Comparisons and arithmetic operations with a NULL, produce NULL results. 

Is NULL bigger than 10? What is 100 times NULL? Can you divide NULL by 1000? The answer to all of these questions is NULL! If one of your values is NULL, the result will also be NULL… regardless of the other values. Think about it this way: If you don’t know one of the values, you cannot know the answer either.

What are null values in DBMS?


LEN(0) will catch both NULL and "". 

In practice, some of the most common ways to encode unknown text values is to use NULL or as empty strings. In some cases, you might even find data sources that use a mix of both. This is problematic if you want to count how many  NULL values are there in your database. Commands like IS NULL will ignore the empty spaces. Luckily, LEN(0) is here to help. 


"NULL", "NA", "" might be cast into strings.

Unfortunately, it is very common that when you export data from one platform to the other, NULL values and other special markers (e.g. NaN, NA, Inf) might be cast into strings and treated like any other text field. This is particularly problematic because certain commands like IS NULL or LEN(0) won't work anymore, meaning you should always be skeptical and inspect your data. 


Always check your data with a histogram.

Some people have a nasty tendency to impute the mean or an arbitrary value (e.g. 99, 100, 0, -1) directly into the database. If a value appears suspiciously often or seems impossible; it might be an attempt to input NULL values. 

Just to name a couple of examples: I once saw a database where all of the customers who made a purchase before 2005 had an age of 27. In another dataset, I saw time spans of -1 second. 

What are null values in DBMS?

NULL is not always random. 

Removing all of the rows containing a NULL value might not be a wise decision. Especially if there is a systematic reason for which they are NULL. In statistics, there is a distinction between data:

  • Missing completely at random (MCAR).
  • Missing at random (MAR).
  • Missing not at random (MNAR).

The implications of believing that your NULL values are missing completely at random can catastrophic for the validity of your analysis. Just to illustrate this, I once saw a dataset where everybody under 18 had a salary NULL. Ignoring those rows would massively increase the mean age of the whole dataset. 

Also noteworthy: In some cases, especially when data is missing not a random, a boolean column indicating if something is NULL might be a good feature for a statistical model.


COUNT(*) vs COUNT(1)

Sometimes you might care about the number of rows and sometimes you might care about the number of rows that are not NULL. For those cases, you can add COUNT(*) and COUNT(1) respectively to your SQL queries. In Tableau, COUNT([Column]) and SUM([Number of Records]) will do the same trick.


Missing rows are not NULL

This might sound obvious to some and confusing to others, but it is imperative to remember that missing rows (i.e. rows that are not present in the data) behave differently than a missing value (i.e. NULL). 

Imagine you have a shop that never opens on Sundays. When you analyze your daily revenue, there won’t be transaction data for any Sunday (i.e. rows that are not present in the data). By the end of a year, you will have data for roughly 313 days (instead of 365 days). 

If you were analyzing data for this shop, there are two things you should know:

  1. The distinct count of days in your fact table (i.e. table that registers the quantitative information regarding the events that occur in your business) will be equal to the number of days with purchases. NOT the amount of days that have passed since your business opened.
  2. Using commands like IFNULL or ISNULL won’t have any effect on missing rows (unless you start from a business calendar and left join your fact data in). Those only affect existing rows with NULL values.

The only way to circumvent these limitations is to left join your fact table with a table containing all of the possible dates. That way you will have a row even for the days without transactions.

Thanks for reading this far, I hope this tip was helpful. If you have any questions, thoughts, or remarks about NULL values, feel free to contact me or any of our colleagues. We are always happy to discuss your use cases.

A field with a NULL value is a field with no value.

If a field in a table is optional, it is possible to insert a new record or update a record without adding a value to this field. Then, the field will be saved with a NULL value.

Note: A NULL value is different from a zero value or a field that contains spaces. A field with a NULL value is one that has been left blank during record creation!

How to Test for NULL Values?

It is not possible to test for NULL values with comparison operators, such as =, <, or <>.

We will have to use the IS NULL and IS NOT NULL operators instead.

IS NULL Syntax

SELECT column_names
FROM table_name
WHERE column_name IS NULL;

IS NOT NULL Syntax

SELECT column_names
FROM table_name
WHERE column_name IS NOT NULL;

Demo Database

Below is a selection from the "Customers" table in the Northwind sample database:

CustomerID CustomerName ContactName Address City PostalCode Country
1 Alfreds Futterkiste Maria Anders Obere Str. 57 Berlin 12209 Germany
2 Ana Trujillo Emparedados y helados Ana Trujillo Avda. de la Constitución 2222 México D.F. 05021 Mexico
3 Antonio Moreno Taquería Antonio Moreno Mataderos 2312 México D.F. 05023 Mexico
4 Around the Horn Thomas Hardy 120 Hanover Sq. London WA1 1DP UK
5 Berglunds snabbköp Christina Berglund Berguvsvägen 8 Luleå S-958 22 Sweden

The IS NULL Operator

The IS NULL operator is used to test for empty values (NULL values).

The following SQL lists all customers with a NULL value in the "Address" field:

SELECT CustomerName, ContactName, AddressFROM Customers

WHERE Address IS NULL;

Try it Yourself »

Tip: Always use IS NULL to look for NULL values.

The IS NOT NULL Operator

The IS NOT NULL operator is used to test for non-empty values (NOT NULL values).

The following SQL lists all customers with a value in the "Address" field:

SELECT CustomerName, ContactName, AddressFROM Customers

WHERE Address IS NOT NULL;

Try it Yourself »