Create Table
CREATE TABLE customers (
first_name VARCHAR(64),
last_name VARCHAR(64),
email VARCHAR(64) UNIQUE,
dob DATE,
since DATE,
customerid VARCHAR(16) PRIMARY KEY, -- applies to the row
country VARCHAR(16)
);
CREATE TABLE games (
name VARCHAR(32),
version CHAR(3),
price NUMERIC NOT NULL,
PRIMARY KEY (name, version)
);
CREATE TABLE downloads (
customerid VARCHAR(16) REFERENCES customers (customerid),
name VARCHAR(32),
version CHAR(3),
FOREIGN KEY (name, version) REFERENCES games (name, version)
);
Identify Primary Key
Column Constraint
CREATE TABLE customers (
first_name VARCHAR(64),
last_name VARCHAR(64),
email VARCHAR(64),
dob DATE,
since DATE,
customerid VARCHAR(16) PRIMARY KEY, -- applies to the row
country VARCHAR(16)
);
Table Constraint
CREATE TABLE customers (
first_name VARCHAR(64),
last_name VARCHAR(64),
email VARCHAR(64),
dob DATE,
since DATE,
customerid VARCHAR(16),
country VARCHAR(16),
PRIMARY KEY (customerid) -- after row
);
NOT NULL
CREATE TABLE games (
name VARCHAR(32),
version CHAR(3),
price NUMERIC NOT NULL,
PRIMARY KEY (name, version)
);
⚠️ Note: A
NOT NULL
constraint guarantees that no value of the corresponding field in any record of the table can be set toNULL
. ANOT NULL
constraint is always declared as a column constraint.
Implicit NULL
INSERT INTO games (name, version) VALUES ('Game 1', '1.0');
Explicit NULL
INSERT INTO games VALUES ('Game 1', '1.0', NULL);
Error
Both statements will be rejected. With a
NOT NULL
constraint, you can’t insert a row without providing a value for the field.
Note
If there is a
default
value, you can insert a row without providing a value for theNOT NULL
field. But still, you can’t insert a row with explicitNULL
.
Foreign Key
-
NULL
Customers (but Not in Games)INSERT INTO downloads VALUES (NULL, 'Aerified', '1.0'); -- Allow (NULL) for (customerid)
-
NULL
Games (but Not in Customers)INSERT INTO downloads VALUES ('1234567890', NULL, '1.0'); -- Allow (NULL, '1.0') for (name, version)
Note
We omit the check if any value in the set to be checked is
NULL
. We check all constraints independently and must satisfy all constraints.
CHECK
Constraint
CREATE TABLE games (
name VARCHAR(32),
version CHAR(3),
price NUMERIC NOT NULL,
CHECK (price > 0),
PRIMARY KEY (name, version)
);
Note
A
CHECK
constraint is declared as a column or a table constraint with theCHECK
keyword followed by the SQL condition in parenthesis. If the condition involves more than one column, it should be a table constraint.
Violate CHECK
Constraint
UPDATE games SET price = price - 5.5;
Error
Discounting all the prices by $5.5 creates negative prices. This operation will be aborted and rolled back.
Cascade UPDATE
and DELETE
CREATE TABLE downloads (
customerid VARCHAR(16)
REFERENCES customers (customerid)
ON UPDATE CASCADE
ON DELETE CASCADE,
name VARCHAR(32),
version CHAR(3),
FOREIGN KEY (name, version)
REFERENCES games (name, version)
ON UPDATE CASCADE
ON DELETE CASCADE
);
ON UPDATE CASCADE
andON DELETE CASCADE
propagate the update or deletion.
CASE
keyword
We can do this but not preferrable.
SELECT name || ' ' || version,
CASE
WHEN price * 0.09 >= 0.3 THEN ROUND(price * 1.09, 2)
ELSE price
END AS price
FROM games;
COALESCE
keyword
COALESCE(x1, x2, x3, ...)
returns the first non-NULL
(from left-to-right) of its argument.
COALESCE(NULL, 1, NULL, 2)
is1
.COALESCE(NULL, NULL, NULL)
isNULL
.
Counting with NULL
COUNT(*)
countsNULL
values.COUNT(att)
,AVG(att)
,MAX(att)
,MIN(attt)
eliminateNULL
values.
JOIN
keyword
CROSS JOIN vs INNER JOIN
While inner join is a popular construct, there is no added expressiveness or performance in INNER JOIN. The two queries below are equivalent.
-- Inner Join
SELECT *
FROM customers c
INNER JOIN downloads d
ON d.customerid = c.customerid
INNER JOIN games g
ON d.name = g.name AND d.version = g.version;
-- Cross Join
SELECT *
FROM customers c, downloads d, games g
WHERE d.customerid = c.customerid
AND d.name = g.name
AND d.version = g.version;
Natural Join
If we managed to give the same name to columns with the same meaning across the tables, we can use natural join. NATURAL JOIN joins rows that have the same values for columns with the same name. It also prints only one of the two columns.
Outer Join
The outer join keeps the columns of the rows in the left (left outer join), the right (right outer join), or in
both (full outer join) tables that do not match anything in the other table according to the join condition.
The remaining values are padded with NULL
values.
It is better to avoid outer joins
They introduce
NULL
values. They can sometimes be justified for efficiency reasons.
Left Outer Join Example
SELECT c.customerid, c.email, d.customerid, d.name, d.version
FROM customers c LEFT OUTER JOIN downloads d ON c.customerid = d.customerid;
This query will show all customers (from the left table), including those who never downloaded a game. For customers with no downloads, the columns from the downloads table (right table) will be filled with NULL
.
Example output:
c.customerid c.email d.customerid d.name d.version
"Willie90" "wlongjj@moonfruit.com" "Willie90" "Ronstring" "1.1"
"Willie90" "wlongjj@moonfruit.com" "Willie90" "Veribet" "2.1"
"Al8" "ahansenp3@webnode.com" null null null
"Johnny1997" "jstevensb0@un.org" null null null
Right Outer Join Example
SELECT *
FROM downloads d RIGHT OUTER JOIN games g
ON g.name = d.name AND g.version = d.version;
This query will show all games (from the right table), including those that have never been downloaded. For games with no downloads, the columns from the downloads table will be filled with NULL
.
Full Outer Join
A full outer join combines both left and right outer joins, padding missing values with NULL
for both tables.
Anti Join Example
To find customers who never downloaded a game:
SELECT c.customerid
FROM customers c
LEFT OUTER JOIN downloads d
ON c.customerid = d.customerid
WHERE d.customerid IS NULL;
Warning
- Outer joins are not easy to write as conditions in the
ON
clause are not equivalent to conditions in theWHERE
clause (unlike withINNER JOIN
).- Conditions in the
ON
clause determine which rows are dangling.- It’s better to avoid outer joins when possible as they introduce
NULL
values, though they can be justified for efficiency reasons.
Note
LEFT JOIN
is a synonym forLEFT OUTER JOIN
RIGHT JOIN
is a synonym forRIGHT OUTER JOIN
FULL JOIN
is a synonym forFULL OUTER JOIN
There are also the natural variant of outer joins.
For instance,
NATURAL LEFT OUTER JOIN
is a natural version of left join. The meaning is the combination of natural join (i.e., automatic equality) and left join (i.e., keeps unmatched column, padded withNULL
)
Set Operations
Basic Set Operators
SQL provides three main set operators:
UNION
: Returns the union of two query resultsINTERSECT
: Returns the intersection of two query resultsEXCEPT
: Returns the non-symmetric difference of two query results
Note
These operators eliminate duplicates by default unless annotated with the keyword
ALL
.
Union Compatibility
Two queries must be union-compatible to use UNION
, INTERSECT
, or EXCEPT
:
- Must return the same number of columns
- Corresponding columns must have the same domain
- Columns must be in the same order
Example of compatible columns:
student.name (VARCHAR(32))
↔department.department (VARCHAR(32))
Example of incompatible columns:
student.year (DATE)
↔department.department (VARCHAR(32))
Examples
Union Example
Find customers who downloaded version 1.0 or 2.0 of Aerified:
SELECT d.customerid
FROM downloads d
WHERE d.name = 'Aerified' AND d.version = '1.0'
UNION
SELECT d.customerid
FROM downloads d
WHERE d.name = 'Aerified' AND d.version = '2.0';
Intersection Example
Find customers who downloaded both version 1.0 and 2.0 of Aerified:
SELECT d.customerid
FROM downloads d
WHERE d.name = 'Aerified' AND d.version = '1.0'
INTERSECT
SELECT d.customerid
FROM downloads d
WHERE d.name = 'Aerified' AND d.version = '2.0';
Difference Example
Find customers who downloaded version 1.0 but not version 2.0 of Aerified:
SELECT d.customerid
FROM downloads d
WHERE d.name = 'Aerified' AND d.version = '1.0'
EXCEPT
SELECT d.customerid
FROM downloads d
WHERE d.name = 'Aerified' AND d.version = '2.0';
Good Practices
Avoid NULL
It is generally a good idea to constrain all attributes to be not NULL unless there is a good design or tuning reason for not doing so.
Justify Cascade
Think carefully about which foreign keys should be subject to cascade.
Defer Constraints
It is generally a good idea to defer all the constraints that can be deferred. A deferred constraint is checked at the end of a transaction and not immediately after each operation.
Play the Devil’s Advocate
Given the complete schema for the three tables, try and find out the scenario in which an operation (insertion, deletion, update) on a table or a transaction containing a set of operation on one or more tables violate which constraint on which table.