JOIN
SELECT...|UPDATE...|DELETE...table[INNER|CROSS] JOINtable[ONcondition|USING (column[,...])] |tableSTRAIGHT_JOINtableONcondition|tableLEFT [OUTER] JOINtable{ONcondition|USING (column[,...])} |tableNATURAL [LEFT [OUTER]] JOINtable| [OJtableLEFT OUTER JOINtableONcondition] |tableRIGHT [OUTER] JOINtable{ONcondition|USING (column[,...])} |tableNATURAL [RIGHT [OUTER]] JOINtable
The JOIN clause is common to several SQL statements (SELECT,
UPDATE, DELETE) and is complex;
therefore, it is listed here as its own entry in the chapter. Use
JOIN to link tables together based on columns with
common data for purposes of selecting, updating, or deleting data. The
JOIN clause is entered at the place in the relevant
statement that specifies the tables to be referenced. This precludes
the need to join the tables based on key columns in the
WHERE clause.
The ON keyword is used to indicate the pair of columns by which the tables
are to be joined (indicated with the equals sign operator). As an
alternative method, the USING keyword may be given
along with a comma-separated list of columns both tables have in
common, contained within parentheses. The columns must exist in each
table that is joined. To improve performance, you can also provide
index hints to MySQL (see the last subsection of this clause
definition, Index hints”).
Here is an example of a JOIN:
SELECT CONCAT(name_first, SPACE(1), name_last) AS Name FROM employees JOIN branches ON employees.branch_id = branches.branch_id WHERE location = 'New Orleans';
This statement displays a list of employees from the
employees table who are located in the New Orleans
branch office. The problem solved by the JOIN is
that the employees table doesn’t indicate New
Orleans by name as the branch; that table just has a numeric
identifier. The branches table is used to retrieve
the branch name for the WHERE clause. The
location column is a column in the
branches table. Nothing is actually displayed from
the branches table here. Since the record
identification column for branches is
branch_id in both tables, the
USING keyword can be used instead of
ON to create the same join:
SELECT CONCAT(name_first, SPACE(1), name_last) AS Name FROM employees JOIN branches USING (branch_id) WHERE location = 'New Orleans';
This will join the two tables on the
branch_id column in each table. Since these tables
have only one column in common, it’s not necessary to specify that
row; instead, you can use the NATURAL keyword. Here
is the same statement with this change:
SELECT CONCAT(name_first, SPACE(1), name_last) AS Name FROM employees NATURAL JOIN branches WHERE location = 'New Orleans';
Notice that the USING keyword and the column
for linking are omitted. MySQL will assume that
branch_id in both columns are the same and will
naturally join the tables on them. The results of this SQL statement
will be the same as those of the previous two.
When joining two tables in a simple join, as shown in the
previous examples, if no rows in the second table match rows from the
first table, no row will be displayed for the unmatched data. For
example, if the branches table lists a branch
office for which there are no employees listed in the
employees table belonging to that branch, the
results set would not show a row for that supposedly empty branch
office. Sometimes, though, it can be useful to display a record
regardless. In our example, this would tell us that something’s wrong
with the data: either one or more employees are marked with the wrong
branch_id, or some employee records are missing
from the employees table. Conversely, if an
employee has a branch_id value that does not exist
in the branches table, we would want to see it in
the results so that we can correct the data.
To list a row for each employee including stray ones, the LEFT keyword may be given in
front of the JOIN keyword to indicate that records
from the first table listed on the left are to be displayed regardless
of whether there is a matching row in the table on the right:
SELECT CONCAT(name_first, SPACE(1), name_last) AS Name, location AS Branch FROM employees LEFT JOIN branches USING (branch_id) ORDER BY location;
This SQL statement lists a row for each employee along with the
employee’s location. If a row for an employee has either a NULL value
for the branch_id, or a branch number that is not
in the branches table, the employee name will still
be displayed but with the branch name reading as NULL. Again, this can
be useful for spotting errors or inconsistencies in the data between
related tables.
In contrast to LEFT JOIN, the
RIGHT JOIN clause includes all matching entries
from the table on the right even if there are no matches from the
table on the left. Here is an example using a RIGHT
JOIN:
SELECT CONCAT(name_first, SPACE(1), name_last) AS Name, location AS Branch FROM employees RIGHT JOIN branches USING (branch_id) ORDER BY location;
This example displays branches for which there are no matching
employee records. For both the LEFT and
RIGHT JOIN methods, the OUTER
keyword is optional and has no effect on the results. It’s just
a matter of preference and compatibility with other database
engines.
The JOIN clause has a few other options.
The STRAIGHT_JOIN keyword explicitly
instructs MySQL to read the tables as listed, from left to right. The
keywords INNER and
CROSS have no effect on the results, as of recent
versions of MySQL. They cannot be used in conjunction with the
keywords LEFT, RIGHT, or
NATURAL. The syntax starting with the
OJ keyword is provided for compatibility with Open
Database Connectivity (ODBC).
You can use the AS keyword to introduce
aliases for tables. Several examples of aliasing are provided earlier
in the explanation of this clause.
SELECT...|UPDATE...|DELETE...table...JOINtableUSE {INDEX|KEY} [{FOR {JOIN|ORDER BY|GROUP BY}] ([index[,...]]) | FORCE {INDEX|KEY} [{FOR {JOIN|ORDER BY|GROUP BY}] (index[,...]) | IGNORE {INDEX|KEY} [{FOR {JOIN|ORDER BY|GROUP BY}] (index[,...])
When MySQL joins and searches tables, indexes can be used to increase the
speed of the SQL statements. Use the EXPLAIN statement to analyze
a joined SQL statement to see which indexes are being used and in
which order, as well as whether there are other indexes available that
aren’t being used in the join. MySQL may not always choose the best
index available. To hint to MySQL which index it should check first,
and perhaps which index to ignore, or even to force it to use a
particular index, you can employ index
hints.
To tell MySQL to use a particular index, add the USE INDEX clause to the
JOIN along with the names of the indexes in a
comma-separated list, within parentheses. To present an example of
this method, let’s start with a JOIN statement that
may execute in a suboptimal
manner:
SELECT client_name, COUNT(*) AS tickets FROM work_req JOIN clients USING(client_id) WHERE client_type = 1 AND DATEDIFF(NOW(), request_date) < 91 GROUP BY client_id
This statement retrieves a list of support clients and a count
of the number of support tickets that they have created in the last 90
days. It gets the count of tickets from work_req
and the client name from the clients table. To
tweak the performance of the statement, let’s examine the indexes for
the work_req table:
SHOW INDEXES FROM work_req \G
*************************** 1. row ***************************
Table: work_req
Non_unique: 0
Key_name: PRIMARY
Seq_in_index: 1
Column_name: wr_id
Collation: A
Cardinality: 115
Sub_part: NULL
Packed: NULL
Null:
Index_type: BTREE
Comment:
*************************** 2. row ***************************
Table: work_req
Non_unique: 1
Key_name: workreq_date_key
Seq_in_index: 1
Column_name: wr_id
Collation: A
Cardinality: 217337
Sub_part: NULL
Packed: NULL
Null: YES
Index_type: BTREE
Comment:
*************************** 3. row ***************************
Table: work_req
Non_unique: 1
Key_name: workreq_date_key
Seq_in_index: 2
Column_name: request_date
Collation: A
Cardinality: 217337
Sub_part: NULL
Packed: NULL
Null:
Index_type: BTREE
Comment:The results show us that the table work_req
has two indexes: a primary one based on the wr_id
(see row 1) and a second one called
workreq_date_key (see the
Key_name field in rows 2 and 3) based on
wr_id and request_date together.
To suggest to MySQL in our JOIN statement that this
second index should be used, enter the statement like so:
SELECT client_name, COUNT(*) AS tickets FROM work_req JOIN clients USE INDEX FOR JOIN (workreq_date_key) USING(client_id) WHERE client_type = 1 AND DATEDIFF(NOW(), request_date) < 91 GROUP BY client_id;
The FORCE INDEX option instructs MySQL to attempt to limit its search to the
specified index; others, however, will be used if the requested
columns make it necessary:
SELECT client_name, COUNT(*) AS tickets FROM work_req JOIN clients FORCE INDEX FOR JOIN (workreq_date_key) USING(client_id) WHERE client_type = 1 AND DATEDIFF(NOW(), request_date) < 91 GROUP BY client_id;
To instruct MySQL not to use certain indexes, list them with
the IGNORE INDEX option in the same
manner:
SELECT client_name, COUNT(*) AS tickets FROM work_req JOIN clients IGNORE INDEX FOR JOIN (workreq_date_key) USING(client_id) WHERE client_type = 1 AND DATEDIFF(NOW(), request_date) < 91 GROUP BY client_id;
It’s also permitted to use combinations of these three index hint clauses, separated only by a space.