The Fundamentals of SQL

Relational databases use DDL, DML, DCL, and joins to create and report data

The Structured Query Language is one of the fundamental building blocks of modern database architecture. SQL defines the methods used to create and manipulate relational databases on all major platforms. At first glance, the language may seem intimidating and complex, but it's not all that difficult. 

About SQL

The correct pronunciation of SQL is a contentious issue within the database community. In its SQL standard, the American National Standards Institute declared that the official pronunciation is "es queue el." However, many database professionals have taken to the slang pronunciation "sequel." Much as with the pronunciation of GIF, there's no right answer.

SQL comes in many flavors. Oracle databases use its proprietary PL/SQL. Microsoft SQL Server makes use of Transact-SQL. All of the variations are based upon the industry standard ANSI SQL.

This introduction uses ANSI-compliant SQL commands that work on any modern relational database system.

DDL and DML

SQL commands can be divided into two main sub-languages. The Data Definition Language contains the commands used to create and destroy databases and database objects. After the database structure is defined with DDL, database administrators and users can use the Data Manipulation Language to insert, retrieve and modify the data contained within it.

SQL supports a third type of syntax called Data Control Language. DCL governs security access to objects within the database. For example, a DCL script grants or revokes specific user accounts the right to read or write to tables within one or more defined areas of the database. In most managed multi-user environments, database administrators usually execute DCL scripts.

Data Definition Language Commands 

The Data Definition Language is used to create and destroy databases and database objects. These commands are primarily used by database administrators during the setup and removal phases of a database project. DDL revolves around four primary commands—create, use, alter, and drop.

Create

The create command establishes databases, tables, or queries on your platform. For example, the command:

CREATE DATABASE employees;

creates an empty database named employees on your DBMS. After creating the database, the next step is to create tables that contain data. Another variant of the create command accomplishes this purpose. The command:

CREATE TABLE personal_info (first_name char(20) not null, last_name char(20) not null, employee_id int not null);

establishes a table titled personal_info in the current database. In the example, the table contains three attributes: first_namelast_name, and employee_id along with some additional information.

Use

The use command specifies the active database. For example, if you're currently working in the sales database and want to issue some commands that will affect the employee database, preface them with the following SQL command:

USE employees;

Double-check the database you're working in before issuing SQL commands that manipulate data.

Alter

After you've created a table within a database, modify its definition through the alter command, which changes to the structure of a table without deleting and recreating it. Take a look at the following command:

ALTER TABLE personal_info ADD salary money null;

This example adds a new attribute to the personal_info table—an employee's salary. The money argument specifies that an employee's salary stores using a dollars and cents format. Finally, the null keyword tells the database that it's OK for this field to contain no value for any given employee.

Drop

The final command of the Data Definition Language, drop, removes entire database objects from our DBMS. For example, to permanently remove the personal_info table that we created, use the following command:

DROP TABLE personal_info;

Similarly, the command below would be used to remove the entire employee database:

DROP DATABASE employees;

Use this command with care. The drop command removes entire data structures from your database. If you want to remove individual records, use the delete command of the Data Manipulation Language.

Data Manipulation Language Commands

The Data Manipulation Language is used to retrieve, insert and modify database information. These DML commands offer the typical framework for interacting wihtin the database on a routine basis.

Insert

The insert command adds records to an existing table. Returning to the personal_info example from the previous section, imagine that our HR department needs to add a new employee to its database. Use a command similar to this one:

INSERT INTO personal_info
values('bart','simpson',12345,$45000);

Note that there are four values specified for the record. These correspond to the table attributes in the order they were defined: first_name, last_name, employee_id and salary.

Select

The select command is the most commonly used command in SQL. It retrieves specific information from an operational database. Take a look at a few examples, again using the personal_info table from the employee database.

The command shown below retrieves all the information contained within the personal_info table. The asterisk is a wildcard character in SQL.

SELECT *
FROM personal_info;

Alternatively, limit the attributes that are retrieved from the database by specifying what gets selected. For example, the Human Resources department may require a list of the last names of all employees in the company. The following SQL command would retrieve only that information:

SELECT last_name
FROM personal_info;

The where clause limits the records that are retrieved to those that meet specified criteria. The CEO might be interested in reviewing the personnel records of all highly paid employees. The following command retrieves all of the data contained within personal_info for records that have a salary value greater than $50,000:

SELECT *
FROM personal_info
WHERE salary > $50000;

Update

The update command modifies the information contained within a table, either in bulk or individually. Assume the company gives all employees a 3 percent cost-of-living increase in their salary annually. The following SQL command applies this bump to all the employees stored in the database:

UPDATE personal_info
SET salary = salary * 1.03;

When the new employee Bart Simpson demonstrates performance above and beyond the call of duty, management wishes to recognize his stellar accomplishments with a $5,000 raise. The WHERE clause singles out Bart for this raise:

UPDATE personal_info
SET salary = salary + 5000
WHERE employee_id = 12345;

Delete

Finally, let's take a look at the delete command. You'll find that the syntax of this command is similar to that of the other DML commands. The DELETE command, with a where clause, remove a record from a table:

DELETE FROM personal_info
WHERE employee_id = 12345;

DML supports aggregate fields, too. In a select statement, mathematical operators like sum and count summarize data within a query. For example, the query:

select count(*) from personal_info;

counts the number of records in the table.

Database Joins

A join statement combines data in several tables to efficiently process large quantities of data. These statements are where the true power of a database resides.

To explore the use of a basic join operation to combine data from two tables, continue with the example using the personal_info table and add an additional table to the mix. Assume you have a table called disciplinary_action that was created with the following statement:

CREATE TABLE disciplinary_action (action_id int not null, employee_id int not null, comments char(500));

This table contains the results of disciplinary actions for company employees. It doesn't contain any information about the employee other than the employee number. 

Assume you've been tasked with creating a report that lists the disciplinary actions taken against all employees with a salary greater than $40,000. The use of a JOIN operation, in this case, is straightforward. Retrieve this information using the following command:

SELECT personal_info.first_name, personal_info.last_name, disciplinary_action.comments
FROM personal_info INNER JOIN disciplinary_action ON personal_info.employee_id = disciplinary_action.employee_id
WHERE personal_info.salary > 40000;

Types of Joins

Join Types in SQL

Joins come in several flavors. In the SQL statement, the first table (usually called Table A or the Left Table) joins to the second table (usually called Table B or the Right Table) in a position-aware manner. Thus, if you change the order of the tables in the join statement, the results of the operation will differ. The major join types include:

  • Inner Join: Only matches records where the on condition matches the same records in both tables.
  • Outer Join: Only matches records from both tables that exclude the results identified in the on condition.
  • Right Join: Matches all records from Table B plus the records from Table A that match the on condition.
  • Left Join: Matches all records from Table A plus the records from Table B that match the on condition.
  • Cross Join: Matches all records as if the tables were identical. This process generates something called cartesian product. Often, cross-joins are unwelcome, because they match every row of Table A, individually, with every row of Table B. Thus, if Table A offered five records, and Table B offered 9 records, a cross-join query offers 45 resulting rows.