Using SQLite

SELECT Examples

The SELECT command is very complex, and it can be difficult to see how these different clauses can be fit together into something useful. Some of this will become more obvious in the next chapter, when we look at standard database design practices, but to get you started, we’re going to look at several examples.

All of these examples will use this data:

CREATE TABLE x ( a, b );
INSERT INTO x VALUES ( 1, 'Alice' );
INSERT INTO x VALUES ( 2, 'Bob' );
INSERT INTO x VALUES ( 3, 'Charlie' );

CREATE TABLE y ( c, d );
INSERT INTO y VALUES ( 1, 3.14159 );
INSERT INTO y VALUES ( 1, 2.71828 );
INSERT INTO y VALUES ( 2, 1.61803 );

CREATE TABLE z ( a, e );
INSERT INTO z VALUES ( 1, 100 );
INSERT INTO z VALUES ( 1, 150 );
INSERT INTO z VALUES ( 3, 300 );
INSERT INTO z VALUES ( 9, 900 );

These examples show the sqlite3 command-line tool. The following dot-commands were issued to make the output easier to understand. The last command will cause sqlite3 to print the string [NULL] whenever a NULL is encountered. Normally, a NULL will produce a blank output that is indistinguishable from an empty string:

.headers on
.mode column
.nullvalue [NULL]

This dataset is available on the book’s download page on the O’Reilly website, as both an SQL file and an SQLite database. I suggest you sit down with a copy of sqlite3 and try these commands out. Try experimenting with different variations.

If one of these examples doesn’t quite make sense, just break the SELECT statement down into its individual parts and step through them bit by bit.

Simple SELECTs

Let’s start with a simple select that returns all of the columns and rows in table x. The SELECT * syntax returns all columns by default:

sqlite> SELECT * FROM x;

a           b         
----------  ----------
1           Alice     
2           Bob       
3           Charlie

We can also return expressions, rather than just columns:

sqlite> SELECT d, d*d AS dSquared FROM y;

d           dSquared          
----------  ------------
3.14159     9.8695877281
2.71828     7.3890461584
1.61803     2.6180210809

Simple JOINs

Now some joins. By default, the bare keyword JOIN indicates an INNER JOIN. However, when no additional condition is put on the JOIN, it reverts to a CROSS JOIN. As a result, all three of these queries produce the same results. The last line uses the implicit join notation.

sqlite> SELECT * FROM x JOIN y;
sqlite> SELECT * FROM x CROSS JOIN y;
sqlite> SELECT * FROM x, y;

a           b           c           d         
----------  ----------  ----------  ----------
1           Alice       1           3.14159   
1           Alice       1           2.71828   
1           Alice       2           1.61803   
2           Bob         1           3.14159   
2           Bob         1           2.71828   
2           Bob         2           1.61803   
3           Charlie     1           3.14159   
3           Charlie     1           2.71828   
3           Charlie     2           1.61803

In the case of a cross join, every row in table a is matched to every row in table y. Since both tables had three rows and two columns, the result set has nine rows (3·3) and four columns (2+2).

JOIN...ON

Next, a fairly simple inner join using a basic ON join condition:

sqlite> SELECT * FROM x JOIN y ON a = c;

a           b           c           d         
----------  ----------  ----------  ----------
1           Alice       1           3.14159   
1           Alice       1           2.71828   
2           Bob         2           1.61803

This query still generates four columns, but only those rows that fulfill the join condition are included in the result set.

The following statement requires the columns to be qualified, since both table x and table z have an a column. Notice that two different a columns are returned, one from each source table:

sqlite> SELECT * FROM x JOIN z ON x.a = z.a;

a           b           a           e         
----------  ----------  ----------  ----------
1           Alice       1           100       
1           Alice       1           150       
3           Charlie     3           300

JOIN...USING, NATURAL JOIN

If we use a NATURAL JOIN or the USING syntax, the duplicate column will be eliminated. Since both table x and table z have only column a in common, both of these statements produce the same output:

sqlite> SELECT * FROM x JOIN z USING ( a );
sqlite> SELECT * FROM x NATURAL JOIN z;

a           b           e         
----------  ----------  ----------
1           Alice       100       
1           Alice       150       
3           Charlie     300

OUTER JOIN

A LEFT OUTER JOIN will return the same results as an INNER JOIN, but will also include rows from table x (the left/first table) that were not matched:

sqlite> SELECT * FROM x LEFT OUTER JOIN z USING ( a );

a           b           e         
----------  ----------  ----------
1           Alice       100       
1           Alice       150       
2           Bob         [NULL]    
3           Charlie     300

In this case, the Bob row from table x has no matching row in table z. Those column values normally provided by table z are padded out with NULL, and the row is then included in the result set.

Compound JOIN

It is also possible to JOIN multiple tables together. In this case we join table x to table y, and then join the result to table z:

sqlite> SELECT * FROM x JOIN y ON x.a = y.c LEFT OUTER JOIN z ON y.c = z.a;

a           b           c           d           a           e         
----------  ----------  ----------  ----------  ----------  ----------
1           Alice       1           3.14159     1           100       
1           Alice       1           3.14159     1           150       
1           Alice       1           2.71828     1           100       
1           Alice       1           2.71828     1           150       
2           Bob         2           1.61803     [NULL]      [NULL]

If you don’t see what is going on here, work through the joins one at a time. First look at what FROM x JOIN y ON x.a = y.c will produce (shown in one of the previous examples). Then look at how this result set would combine with table z using a LEFT OUTER JOIN.

Self JOIN

Our last join example shows a self-join, where a table is joined against itself. This creates two unique instances of the same table and necessitates the use of table aliases:

sqlite> SELECT * FROM x AS x1 JOIN x AS x2 ON x1.a + 1 = x2.a;

a           b           a           b
----------  ----------  ----------  ----------
1           Alice       2           Bob
2           Bob         3           Charlie

Also, notice that the join condition is a more arbitrary expression, rather than being a simple test of column references.

WHERE Examples

Moving on, the WHERE clause is used to filter rows. We can pick out a specific row:

sqlite> SELECT * FROM x WHERE b = 'Alice';

a           b         
----------  ----------
1           Alice

Or a range of values:

sqlite> SELECT * FROM y WHERE d BETWEEN 1.0 AND 3.0;

c           d         
----------  ----------
1           2.71828   
2           1.61803

In this case, the WHERE expression references the output column by its assigned name:

sqlite> SELECT c, d, c+d AS sum FROM y WHERE sum < 4.0;

c           d           sum       
----------  ----------  ----------
1           2.71828     3.71828   
2           1.61803     3.61803

GROUP BY Examples

Now let’s look at a few GROUP BY statements. Here we group table z by the a column. Since there are three unique values in z.a, the output has three rows. Only the grouping a=1 has more than one row, however. We can see this in the count() values returned by the second column:

sqlite> SELECT a, count(a) AS count FROM z GROUP BY a;

a           count     
----------  ----------
1           2         
3           1         
9           1

This is a similar query, only now the second output column represents the sum of all the z.e values in each group:

sqlite> SELECT a, sum(e) AS total FROM z GROUP BY a;

a           total     
----------  ----------
1           250       
3           300       
9           900

We can even compute our own average and compare that to the avg() aggregate:

sqlite> SELECT a, sum(e), count(e),
   ...>    sum(e)/count(e) AS expr, avg(e) AS agg
   ...>   FROM z GROUP BY a;

a           sum(e)      count(e)    expr        agg       
----------  ----------  ----------  ----------  ----------
1           250         2           125         125.0     
3           300         1           300         300.0     
9           900         1           900         900.0

A HAVING clause can be used to filter rows based off the results of the sum() aggregation:

sqlite> SELECT a, sum(e) AS total FROM z GROUP BY a HAVING total > 500;

a           total     
----------  ----------
9           900

ORDER BY Examples

The output can also be sorted. Most of these examples already look sorted, but that’s mostly by chance. The ORDER BY clause enforced a specific order:

sqlite> SELECT * FROM y ORDER BY d;

c           d         
----------  ----------
2           1.61803   
1           2.71828   
1           3.14159

Limits and offsets can also be applied to pick out specific rows from an ordered result. Conceptually, these are fairly simple, however.

These tables and queries are available as part of the book download. Feel free to load the data into sqlite3 and try out different queries. Don’t worry about creating SELECT statements that use every available clause. Start with simple queries where you can understand all the steps, then start to combine clauses to build larger and more complex queries.

Table of Contents for Using SQLite