Introduction

Constraints are additional requirements for acceptable values in addition to those provided by data types. They allow you to define narrower conditions for your data than those found in the general purpose data types.

These are often reflections on the specific characteristics of a field based on additional context provided by your applications. For example, an age field might use the int data type to store whole numbers. However, certain ranges of acceptable integers do not make sense as valid ages. For instance, negative integers would not be reasonable in this scenario. We can express this logical requirement in PostgreSQL using constraints.

Where constraints are defined: column vs table constraints

Postgres allows you to create constraints associated with a specific column or with a table in general.

Almost all constraints can be used in both forms without modification:

ConstraintColumnTable
CHECKYesYes
NOT NULLYesNo*
UNIQUEYesYes
PRIMARY KEYYesYes
FOREIGN KEYYesYes

*: NOT NULL cannot be used as a table constraint. However, you can approximate the results by using IS NOT NULL as the statement within a CHECK table constraint.

Let's look at how column and table constraints differ.

Column constraints

Column constraints are constraints attached to a single column. They are used to determine whether a proposed value for a column is valid or not. Column constraints are evaluated after the input is validated against basic type requirements (like making sure a value is a whole number for int columns).

Column constraints are great for expressing requirements that are limited to a single field. They attach the constraint condition directly to the column involved. For instance, we could model the age restriction in a person table by adding a constraint after the column name and data type:

1CREATE TABLE person (
2 . . .
3 age int CHECK (age >= 0),
4 . . .
5);

This snippet defines a person table with one of the columns being an int called age. The age must be greater than or equal to zero. Column constraints are easy to understand because they are added as additional requirements onto the column they affect.

Table constraints

The other type of constraint is called a table constraint. Table constraints can express any restrictions that a column constraint can, but can additionally express restrictions that involve more than one column. Instead of being attached to a specific column, table constraints are defined as a separate component of the table and can reference any of the table's columns.

The column constraint we saw earlier could be expressed as a table constraint like this:

1CREATE TABLE person (
2 . . .
3 age int,
4 . . .
5 CHECK (age >= 0)
6);

The same basic syntax is used, but the constraint is listed separately. To take advantage of the ability for table constraints to introduce compound restrictions, we can use the logical AND operator to join multiple conditions from different columns.

For example, in a banking database, a table called qualified_borrowers might need to check whether individuals have an existing account and the ability to offer collateral in order to qualify for a loan. It might make sense to include both of these in the same check:

1CREATE TABLE qualified_borrowers (
2 . . .
3 account_number int,
4 acceptable_collateral boolean,
5 . . .
6 CHECK (account_number IS NOT NULL AND acceptable_collateral = 't')
7);

Here, we use the CHECK constraint again to check that the account_number is not null and that the loan officer has marked the client as having acceptable collateral by checking the acceptable_collateral column. A table constraint is necessary since multiple columns are being checked.

Creating names for constraints

When you create constraints using the syntax above, PostgreSQL automatically chooses a reasonable, but vague, name. In the case of the qualified_borrowers table above, Postgres would name the constraint qualified_borrowers_check:

1INSERT INTO qualified_borrowers VALUES (123, false);
1ERROR: new row for relation "qualified_borrowers" violates check constraint "qualified_borrowers_check"
2DETAIL: Failing row contains (123, f).

This name gives you information about the table and type of constraint when a constraint is violated. In cases where multiple constraints are present on a table, however, more descriptive names are helpful to help troubleshooting.

You can optionally specify the name for your constraints by preceding the constraint definition with the CONSTRAINT keyword followed by the name.

For example, if you wanted to name the constraint in the qualified_borrowers table loan_worthiness, you could instead define the table like this:

1CREATE TABLE qualified_borrowers (
2 . . .
3 account_number int,
4 acceptable_collateral boolean,
5 . . .
6 CONSTRAINT loan_worthiness CHECK (account_number IS NOT NULL AND acceptable_collateral = 't')
7);

Now, when we violate a constraint, we get our more descriptive label:

1INSERT INTO qualified_borrowers VALUES (123, false);
1ERROR: new row for relation "qualified_borrowers" violates check constraint "loan_worthiness"
2DETAIL: Failing row contains (123, f).

You can name column constraints in the same way:

1CREATE TABLE teenagers (
2 . . .
3 age int CONSTRAINT is_teenager CHECK (age >= 13 AND age <= 19),
4 . . .
5);

PostgreSQL's list of available constraints

Now that we've covered some of the basics of how constraints work, we can take a deeper look at what constraints are available and how they may be used.

Check constraints

Check constraints are a general purpose constraint that allows you to specify an expression involving column or table values that evaluates to a boolean.

You've already seen a few examples of check constraints earlier. Check constraints begin with the keyword CHECK and then provide an expression enclosed in parentheses. For column constraints, this is placed after the data type declaration. For table constraints, these can be placed anywhere after the columns that they interact with are defined.

For example, we can create a film_nominations table that contains films that have been nominated and are eligible for a feature length award for 2019:

1CREATE TABLE film_nominations (
2 title text,
3 director text,
4 release_date date CHECK ('01-01-2019' <= release_date AND release_date <= '12-31-2019'),
5 length int,
6 votes int,
7 CHECK (votes >= 10 AND length >= 40)
8);

We have one column check restraint that checks that the release_date is within 2019. Afterwards, we have a table check constraint ensuring that the film has received enough votes to be nominated and that the length qualifies it for the "feature length" category.

When evaluating check constraints, acceptable values return true. If the new record's values satisfy all type requirements and constraints, the record will be added to the table:

1INSERT INTO film_nominations VALUES (
2 'A great film',
3 'Talented director',
4 '07-16-2019',
5 117,
6 45
7);
1INSERT 0 1

Values that yield false produce an error indicating that the constraint was not satisfied:

1INSERT INTO film_nominations VALUES (
2 'A poor film',
3 'Misguided director',
4 '10-24-2019',
5 128,
6 1
7);
1ERROR: new row for relation "film_nominations" violates check constraint "film_nominations_check"
2DETAIL: Failing row contains (A poor film, Misguided director, 2019-07-16, 128, 1).

In this case, the film has satisfied every condition except for the number of votes required. PostgreSQL rejects the submission since it does not pass the final table check constraint.

Not null constraints

The NOT NULL constraint is much more focused. It guarantees that values within a column are not null. While this is a simple constraint, it is used very frequently.

To mark a column as requiring a non-null value, add NOT NULL after the type declaration:

1CREATE TABLE national_capitals (
2 country text NOT NULL,
3 capital text NOT NULL,
4);

In the above example, we have a simple two column table mapping countries to their national capitals. Since both of these are required fields that would not make sense to leave blank, we add the NOT NULL constraint.

Inserting a null value now results in an error:

1INSERT INTO national_capitals VALUES (
2 NULL,
3 'London',
4);
1ERROR: null value in column "country" violates not-null constraint
2DETAIL: Failing row contains (null, London).

The NOT NULL constraint functions only as a column constraint (it cannot be used as a table constraint). However, you can easily work around this by using IS NOT NULL within a table CHECK constraint.

For example, this offers equivalent guarantees using a table constraint:

1CREATE TABLE national_capitals (
2 country text,
3 capital text,
4 CHECK (country IS NOT NULL AND capital IS NOT NULL)
5);

Unique constraints

The UNIQUE constraint tells PostgreSQL that each value within a column must not be repeated. This is useful in many different scenarios where having the same value in multiple records should be impossible.

For example, columns that deals with IDs of any kind should, by definition, have unique values. A social security number, a student or customer ID, or a product UPC (barcode number) would be useless if they were not able to differentiate between specific people or items.

A UNIQUE constraint can be specified at the column level:

1CREATE TABLE supplies (
2 supply_id integer UNIQUE,
3 name text,
4 inventory integer
5);

They can also be specified as table constraints:

1CREATE TABLE supplies (
2 supply_id integer,
3 name text,
4 inventory integer,
5 UNIQUE (supply_id)
6);

One of the advantages of using UNIQUE table constraints is that it allows you to perform uniqueness checks on a combination of columns. This works by specifying two or more columns that PostgreSQL should evaluate together. The values in individual columns may repeat but the combination of values specified must be unique.

As an example, let's look back at the national_capitals table we used before:

1CREATE TABLE national_capitals (
2 country text NOT NULL,
3 capital text NOT NULL,
4);

If we wanted to make sure that we don't add multiple records for the same pair, we could add UNIQUE constraints to the columns here:

1CREATE TABLE national_capitals (
2 country text NOT NULL UNIQUE,
3 capital text NOT NULL UNIQUE,
4);

This would ensure that both the countries and capitals are only present once in each table. However, some countries have multiple capitals. This would mean we may have multiple entries with the same country value. These wouldn't work with the current design:

1INSERT INTO national_capitals VALUES (
2 'Bolivia',
3 'Sucre'
4);
5INSERT INTO national_capitals VALUES (
6 'Bolivia',
7 'La Paz'
8);
1INSERT 0 1
2ERROR: duplicate key value violates unique constraint "national_capitals_country_key"
3DETAIL: Key (country)=(Bolivia) already exists.

If we still want to make sure we don't end up with duplicate entries while allowing for repeated values in individual columns, a unique check on the combination of country and capital would suffice:

1CREATE TABLE national_capitals (
2 country text,
3 captial text,
4 UNIQUE (country, capital)
5);

Now, we can add both of Bolivia's capitals to the table without an error:

1INSERT INTO national_capitals VALUES (
2 'Bolivia',
3 'Sucre'
4);
5INSERT INTO national_capitals VALUES (
6 'Bolivia',
7 'La Paz'
8);
1INSERT 0 1
2INSERT 0 1

However, attempting to add the same combination twice is still caught by the constraint:

1INSERT INTO national_capitals VALUES (
2 'Bolivia',
3 'Sucre'
4);
5INSERT INTO national_capitals VALUES (
6 'Bolivia',
7 'Sucre'
8);
1INSERT 0 1
2ERROR: duplicate key value violates unique constraint "national_capitals_country_capital_key"
3DETAIL: Key (country, capital)=(Bolivia, Sucre) already exists.

Primary key constraints

The PRIMARY KEY constraint serves a special purpose. It indicates that the column can be used to uniquely identify a record within the table. This means that it must be reliably unique and that every record must have a value in that column.

Primary keys are recommended for every table not required, and every table may only have one primary key. Primary keys are mainly used to identify, retrieve, modify, or delete individual records within a table. They allow users and administrators to target the operation using an identifier that is guaranteed by PostgreSQL to match exactly one record.

Let's use the supplies table we saw before as an example:

1CREATE TABLE supplies (
2 supply_id integer UNIQUE,
3 name text,
4 inventory integer
5);

Here we've identified that the supply_id should be unique. If we wanted to use this column as our primary key (guaranteeing uniqueness and a non-null value), we could simply change the UNIQUE constraint to PRIMARY KEY:

1CREATE TABLE supplies (
2 supply_id integer PRIMARY KEY,
3 name text,
4 inventory integer
5);

This way, if we needed to update the inventory amounts for a specific supply, we could target it using the primary key:

1INSERT INTO supplies VALUES (
2 38,
3 'nails',
4 5
5);
6UPDATE supplies set inventory = 10 WHERE supply_id = 38;
1INSERT 0 1
2UPDATE 1

While many tables use a single column as the primary key, it is also possible to create a primary key using a set of columns, as a table constraint.

The national_capitals table is a good candidate to demonstrate this. If we wanted to create a primary key using the existing columns, we could replace the UNIQUE table constraint with PRIMARY KEY:

1CREATE TABLE national_capitals (
2 country text,
3 captial text,
4 PRIMARY KEY (country, capital)
5);

Foreign keys constraints

Foreign keys are columns within one table that reference column values within another table. This is desirable and often necessary in a variety of scenarios where tables contain related data. This ability for the database to easily connect and reference data stored in separate tables is one of the primary features of relational databases.

For example, you may have a orders table to track individual orders and a customers table to track contact info and information about your customers. It makes sense to put this information separately since customers may have many orders. However, it also makes sense to be able to easily link the records in these two tables to allow more complex operations.

Let's start by trying to model the customers table:

1CREATE TABLE customers (
2 customer_id serial PRIMARY KEY,
3 first_name text,
4 last_name text,
5 phone_number bigint,
6);

This table is pretty simple. It includes columns to store the parent's first name, last name, and phone number. It also specifies an ID column that uses the PRIMARY KEY constraint. The serial data type is used to automatically generate the next ID in the sequence if an ID is not specified.

For the orders table, we want to be able to specify information about individual orders. One essential piece of data is what customer placed the order. We can use a foreign key to link the order to the customer without duplicating information. We do this with the REFERENCES constraint, which defines a foreign key relationship to a column in another table:

1CREATE TABLE orders (
2 order_id serial PRIMARY KEY,
3 order_date date,
4 customer integer REFERENCES customers
5);

Here, we are indicating that the customer column in the orders table has a foreign key relationship with the customers table. Since we do not specify a specific column within the customers table, PostgreSQL assumes that we want to link to the primary key in the customers table: customer_id.

If we try to insert a value into the orders table that doesn't reference a valid customer, PostgreSQL will reject it:

1INSERT INTO orders VALUES (
2 100,
3 '11-19-2019',
4 300
5);
1ERROR: insert or update on table "orders" violates foreign key constraint "orders_customer_fkey"
2DETAIL: Key (customer)=(300) is not present in table "customers".

If we add the customer first, our order will then be accepted by the system:

1INSERT INTO customers VALUES (
2 300,
3 'Jill',
4 'Smith',
5 '5551235677'
6);
7INSERT INTO orders VALUES (
8 100,
9 '11-19-2019',
10 300
11);
1INSERT 0 1
2INSERT 0 1

While the primary key is a great candidate for foreign keys because it guarantees to match only one record, you can also use other columns as long as they're unique. To do so you just have to specify the column in parentheses after the table name in the REFERENCES definition:

1CREATE TABLE example (
2 . . .
3 column type REFERENCES other_table (column)
4);

You can also use sets of columns that are guaranteed unique. To do so, you need to use a table constraint that begins with FOREIGN KEY and refers to columns you've defined earlier in the table description:

1CREATE TABLE example (
2 . . .
3 FOREIGN KEY (column1, column2) REFERENCES other_table (column1, column2)
4);

Deciding what to do with foreign keys when deleting or updating

One consideration you'll need to think about when defining foreign key constraints is what to do when a referenced table is deleted or updated.

As an example, let's look at the customers and orders tables again. We need to specify how we want the system to respond when we delete a customer from the customers table when the customer has an associated order in the orders table.

We can choose between the following options:

  • RESTRICT: Choosing to restrict deletions means that PostgreSQL will refuse to delete the customer record if it's referenced by a record in the orders table. To delete a customer, you will first have to remove any associated records from the orders table. Only then will you be able to remove the value from the customer table.
  • CASCADE: Selecting the cascade option means that when we delete the customer record, the records that reference it in the orders table are also deleted. This is useful in many cases but must be used with care to avoid deleting data by mistake.
  • NO ACTION: The no action option tells PostgreSQL to simply remove the customer and not do anything with the associated orders records. If the constraint is checked later, it will still cause an error, but this won't happen during the initial deletion. This is the default action if no other is specified.
  • SET NULL: This option tells PostgreSQL to set the referencing columns to null when the referenced records are removed. So if we delete a customer from the customers table, the customer column in the orders table will be set to NULL.
  • Set DEFAULT: If this option is chosen, PostgreSQL will change the referencing column to the default value if the referenced record is deleted. So if the customer column in the orders table had a default value and we remove a customer from the customers table, the record in the orders value would be assigned the default value.

These actions can be specified when defining a foreign key constraint by adding ON DELETE followed by the action. So if we want to remove associated orders from our system when a customer is deleted, we could specify that like this:

1CREATE TABLE orders (
2 order_id serial PRIMARY KEY,
3 order_date date,
4 customer integer REFERENCES customers ON DELETE CASCADE
5);

These type of actions can also be applied when updating a referenced column instead of deleting one by using ON UPDATE instead of ON DELETE.

Conclusion

In this guide, we learned about how PostgreSQL's constraints can be used to hone in on what specific values are valid for our tables. We talked about the differences between column and table constraints. Afterwards, we walked through the various types of constraints and demonstrated how to use them to restrict what types of input your tables accept.

Constraints are one of many features that help you define your expectations in your data structures. Once you've provided constraints, you can then allow PostgreSQL to validate that any input matches the requirements. This is one small way to use your database system to enforce guarantees so that your data remains consistent and meaningful.