Choosing a Primary Key

Don't use a ZIP code or Social Security number

A business man working at office
Kohei Hara/Digital Vision/Getty Images

Databases depend upon keys to store, sort, and compare or create relationships between records. If you’ve been around databases for a while, you’ve probably heard about different types of keys: primary keys, candidate keys, and foreign keys. When you create a new database table, you’re asked to select one primary key that will uniquely identify each record stored in that table.

Why a Primary Key Is Important

The selection of a primary key is one of the most critical decisions you’ll make in the design of a new database. The most important constraint is that you must ensure that the selected key is unique. If it’s possible that two records—past, present, or future—may share the same value for an attribute, it’s a poor choice for a primary key.

Another important aspect of a primary key is its use by other tables that link to it in a relational database. In this aspect, a primary key acts like the target of a pointer. Because of these interdependencies, a primary key must exist when a record is created, and it can never change. 

Poor Choices for Primary Keys

What some people might consider an obvious choice for a primary key can be a poor choice instead. Here are a few examples:

  • ZIP codes do not make good primary keys for a table of towns. If you’re making a simple lookup table of cities, ZIP code seems to be a logical primary key. However, upon further investigation, you may realize that more than one town shares a ZIP code. For example, the New Jersey cities of Neptune, Neptune City, Tinton Falls, and Wall Township all share the 07753 ZIP code.
  • Social Security numbers do not make good primary keys for many reasons. Most people consider their SSN to be private and do not want it clearly visible to database users. Some people don’t have SSNs—foreigners or immigrants who have never received a Social Security card. SSNs can also be reused after an individual’s death. An individual may have more than one SSN over a lifetime when the Social Security Administration issues a new number in cases of fraud or identity theft.
  • Email addresses are also a poor choice for a primary key. Although they are unique, they can change over time. Furthermore, not everyone has an email address.

    Choosing an Effective Primary Key

    So, what makes a good primary key? In most cases, turn to your database system for support.

    A best practice in database design is to use an internally generated primary key. Your database management system can normally generate a unique identifier that has no meaning outside of the database system. For example, you might use the Microsoft Access AutoNumber data type to create a field called RecordID. The AutoNumber data type automatically increments the field each time you create a record. While the number itself is meaningless, it provides a reliable way to reference an individual record in queries.

    A good primary key is usually short, uses numbers, and avoids special characters or a mix of uppercase and lowercase characters to facilitate rapid database lookups and comparisons.