What is Transitive Dependency in a Database

Avoid Transitive Dependencies to Help Ensure Normalization

Man using a laptop
Richard Saville/Getty Images

A transitive dependency in a database is an indirect relationship between values in the same table that causes a functional dependency. To achieve the normalization standard of Third Normal Form (3NF), you must eliminate any transitive dependency.

By its nature, a transitive dependency requires three or more attributes (or database columns) that have a functional dependency between them, meaning that Column A in a table relies on Column B through an intermediate Column C.

Let's see how this might work.

Transitive Dependency Example

AUTHORS

Author_IDAuthorBookAuthor_Nationality
Auth_001Orson Scott CardEnder's GameUnited States
Auth_001Orson Scott CardEnder's GameUnited States
Auth_002Margaret AtwoodThe Handmaid's TaleCanada

In the AUTHORS example above:

  • Book Author: Here, the Book attribute determines the Author attribute. If you know the book name, you can learn the author's name. However, Author does not determine Book, because an author can write multiple books. For example, just because we know the author's name Orson Scott Card, we still don't know the book name.
  • Author  Author_Nationality: Likewise, the Author attribute determines the Author_Nationality, but not the other way around; just because we know the nationality does not mean we can determine the author.

But this table introduces a transitive dependency:

  • Book Author_Nationality: If we know the book name, we can determine the nationality via the Author column.

    Avoiding Transitive Dependencies

    To ensure Third Normal Form, let's remove the transitive dependency.

    We can start by removing the Book column from the Authors table and creating a separate Books table:

    BOOKS

    Book_IDBookAuthor_ID
    Book_001Ender's GameAuth_001
    Book_001Children of the MindAuth_001
    Book_002The Handmaid's TaleAuth_002

    AUTHORS

    Author_IDAuthorAuthor_Nationality
    Auth_001Orson Scott CardUnited States
    Auth_002Margaret AtwoodCanada

    Did this fix it? Let's examine our dependencies now:

    BOOKS table:

    • Book_ID  → Book: The Book depends on the Book_ID.
    • No other dependencies in this table exist, so we are okay. Note that the foreign key Author_ID links this table to the AUTHORS table through its primary key Author_ID. We have created a relationship to avoid a transitive dependency, a key design of relational databases.

    AUTHORS table:

    • Author_ID   Author: The Author depends on the Author_ID.
    • Author Author_Nationality: The nationality can be determined by the author.
    • Author_ID  Author_Nationality: The nationality can be determined from the Author_ID through the Author attribute. We still have a transitive dependency. 

    We need to add a third table to normalize this data:

    COUNTRIES

    Country_IDCountry
    Coun_001United States
    Coun_002Canada
     

    AUTHORS

    Author_IDAuthorCountry_ID
    Auth_001Orson Scott CardCoun_001
    Auth_002Margaret AtwoodCoun_002

    Now we have three tables, making use of foreign keys to link between the tables:

    • The BOOK table's foreign key Author_ID links a book to an author in the AUTHORS table.
    • The AUTHORS table's foreign key Country_ID links an author to a country in the COUNTRIES table.
    • The COUNTRIES table has no foreign key because it has no need to link to another table in this design.

    Why Transitive Dependencies Are Bad Database Design

    What is the value of avoiding transitive dependencies to help ensure 3NF?  Let's consider our first table again and see the issues it creates:

    AUTHORS

    Author_IDAuthorBookAuthor_Nationality
    Auth_001Orson Scott CardEnder's GameUnited States
    Auth_001Orson Scott CardChildren of the MindUnited States
    Auth_002Margaret AtwoodThe Handmaid's TaleCanada

    This kind of design can contribute to data anomalies and inconsistencies, for example:

    • If you deleted the two books "Children of the Mind" and "Ender's Game," you would delete the author "Orson Scott Card" and his nationality completely from the database.
    • You cannot add a new author to the database unless you also add a book; what if the author is yet unpublished or you don't know the name of a book she has authored?
    • If "Orson Scott Card" changed his citizenship, you would have to change it in all records in which he appears. Having multiple records with the same author can result in inaccurate data: what if the data entry person doesn't realize there are multiple records for him and changes the data in only one record?
    • You cannot delete a book like "The Handmaid's Tale" without also deleting the author completely.

    These are just some reasons why normalization, and avoiding transitive dependencies, protect data and ensure consistency.

    Was this page helpful?