What is Transitive Dependency in a Database

Avoid Transitive Dependencies to Help Ensure Normalization

Man using a laptop
Richard Saville/Getty Images

A transitive dependency in a database is an indirect relationship between values in the same table that causes a functional dependency. To achieve the normalization standard of Third Normal Form (3NF), you must eliminate any transitive dependency.

By its nature, a transitive dependency requires three or more attributes (or database columns) that have a functional dependency between them, meaning that Column A in a table relies on Column B through an intermediate Column C. Let's see how this might work.

Transitive Dependency Example

AUTHORS

Author_ID Author Book Author_Nationality
Auth_001 Orson Scott Card Ender's Game United States
Auth_001 Orson Scott Card Ender's Game United States
Auth_002 Margaret Atwood The Handmaid's Tale Canada

In the AUTHORS example above:

  • Book Author: Here, the Book attribute determines the Author attribute. If you know the book name, you can learn the author's name. However, Author does not determine Book, because an author can write multiple books. For example, just because we know the author's name Orson Scott Card, we still don't know the book name.
  • Author  Author_Nationality: Likewise, the Author attribute determines the Author_Nationality, but not the other way around; just because we know the nationality does not mean we can determine the author.

But this table introduces a transitive dependency:

  • Book Author_Nationality: If we know the book name, we can determine the nationality via the Author column.

Avoiding Transitive Dependencies

To ensure Third Normal Form, let's remove the transitive dependency.

We can start by removing the Book column from the Authors table and creating a separate Books table:

BOOKS

Book_ID Book Author_ID
Book_001 Ender's Game Auth_001
Book_001 Children of the Mind Auth_001
Book_002 The Handmaid's Tale Auth_002

AUTHORS

Author_ID Author Author_Nationality
Auth_001 Orson Scott Card United States
Auth_002 Margaret Atwood Canada

Did this fix it? Let's examine our dependencies now:

BOOKS table:

  • Book_ID  → Book: The Book depends on the Book_ID.
  • No other dependencies in this table exist, so we are okay. Note that the foreign key Author_ID links this table to the AUTHORS table through its primary key Author_ID. We have created a relationship to avoid a transitive dependency, a key design of relational databases.

AUTHORS table:

  • Author_ID   Author: The Author depends on the Author_ID.
  • Author Author_Nationality: The nationality can be determined by the author.
  • Author_ID  Author_Nationality: The nationality can be determined from the Author_ID through the Author attribute. We still have a transitive dependency. 

We need to add a third table to normalize this data:

COUNTRIES

Country_ID Country
Coun_001 United States
Coun_002 Canada

AUTHORS

Author_ID Author Country_ID
Auth_001 Orson Scott Card Coun_001
Auth_002 Margaret Atwood Coun_002

Now we have three tables, making use of foreign keys to link between the tables:

  • The BOOK table's foreign key Author_ID links a book to an author in the AUTHORS table.
  • The AUTHORS table's foreign key Country_ID links an author to a country in the COUNTRIES table.
  • The COUNTRIES table has no foreign key because it has no need to link to another table in this design.

Why Transitive Dependencies Are Bad Database Design

What is the value of avoiding transitive dependencies to help ensure 3NF?  Let's consider our first table again and see the issues it creates:

AUTHORS

Author_ID Author Book Author_Nationality
Auth_001 Orson Scott Card Ender's Game United States
Auth_001 Orson Scott Card Children of the Mind United States
Auth_002 Margaret Atwood The Handmaid's Tale Canada

This kind of design can contribute to data anomalies and inconsistencies, for example:

  • If you deleted the two books "Children of the Mind" and "Ender's Game," you would delete the author "Orson Scott Card" and his nationality completely from the database.
  • You cannot add a new author to the database unless you also add a book; what if the author is yet unpublished or you don't know the name of a book she has authored?
  • If "Orson Scott Card" changed his citizenship, you would have to change it in all records in which he appears. Having multiple records with the same author can result in inaccurate data: what if the data entry person doesn't realize there are multiple records for him and changes the data in only one record?
  • You cannot delete a book like "The Handmaid's Tale" without also deleting the author completely.

These are just some reasons why normalization, and avoiding transitive dependencies, protect data and ensure consistency.