SELECT
SELECT [flags] {*|column|expression}[, ...] FROMtable[, ...] [WHEREcondition] [GROUP BY {column|expression|position}[ASC|DESC], ... [WITH ROLLUP]] [HAVINGcondition] [ORDER BY {column|expression|position}[ASC|DESC] , ...] [LIMIT {[offset,]count|countOFFSEToffset}] [PROCEDUREprocedure(arguments)]options
Use this statement to retrieve and display data from tables
within a database. It has many clauses and options, but for simple
data retrieval many of them can be omitted. The basic syntax for the
statement is shown. After the SELECT keyword, some
keywords to control the whole operation may be given. Next comes an
asterisk to retrieve all columns, a list of columns to retrieve, or
expressions returning values to display, separated by commas.
Data can be retrieved from one or more tables, given in a
comma-separated list. If multiple tables are specified, other clauses
must define how the tables are joined. The remaining clauses may be
called on to refine the data to be retrieved, to order it, and so
forth. These various keywords, options, and clauses are detailed in
subsections of this statement explanation. To start, here is a simple
example of how you can use the SELECT
statement:
SELECT name_first, name_last, telephone_home, DATEDIFF(now( ), last_review) AS 'Days Since Last Review' FROM employees;
In this example, three columns and the results of an expression
based on a fourth column are to be displayed. The first and last name
of each employee, each employee’s home telephone number, and the
difference between the date of the employee’s last employment review
and the date now are listed. This last field has the addition
of the AS keyword to set the column
heading of the results set, and to name an alias for the field. An
alias may be referenced in subsequent clauses of the same statement
(e.g., the ORDER BY clause). To select all columns
in the table, the wildcard * can be given instead
of the column names.
SELECT
[ALL|DISTINCT|DISTINCTROW]
[HIGH_PRIORITY] [STRAIGHT_JOIN]
[SQL_SMALL_RESULT] [SQL_BIG_RESULT] [SQL_BUFFER_RESULT]
[SQL_CACHE|SQL_NO_CACHE] [SQL_CALC_FOUND_ROWS]
{*|column|expression}[, ...]
FROM table[, ...]
[WHERE condition] [other clauses] [options]Between the initial SELECT keyword and list
of columns and expressions, several keywords may be given. They are
shown in the preceding syntax, with the other components of the
statement abbreviated.
When a WHERE clause is used with the SELECT statement,
rows in the results may contain duplicate data. If you want all rows
that meet the selection conditions to be displayed, you may include
the ALL keyword. This is the default, so it’s not
necessary to give this keyword. If you want to display only the first
occurrence of a row, include the DISTINCT keyword or its synonym
DISTINCTROW. Here is an example:
SELECT DISTINCT dept FROM employees;
This statement will list the names of all departments for which
we have employees listed in the employees table.
Even though there are several employees in the same department, it
will list only one row for each department.
By default, any UPDATE statements that are issued have priority over
SELECT statements submitted by other client
sessions at the same time; the updates are run first. To give a
particular SELECT statement higher priority than
any UPDATE statements, use the HIGH_PRIORITY keyword.
Multiple tables may be selected with the
SELECT statement. The column on which they should
be joined is given with the WHERE clause or the
JOIN clause. The JOIN
clause is described earlier in this chapter. For the purposes of this
section, you just need to know that in order to optimize retrieval,
MySQL might not join tables in the order that they are listed in the
SQL statement. To insist on joining in the order given, you must use
the STRAIGHT_JOIN keyword.
When you know that the results of a SELECT
statement using the DISTINCT keyword or the
GROUP BY clause (discussed later) will be small,
you can use the SQL_SMALL_RESULT keyword.
This will cause MySQL to use temporary tables, with a key based on the
GROUP BY clause elements, to sort the results and
possibly make for faster data retrieval. If you expect the results to
be large, you can use the SQL_BIG_RESULT keyword.
This will cause MySQL to use temporary tables on the filesystem.
Regardless of whether you use DISTINCT or GROUP
BY, the SQL_BUFFER_RESULT keyword
may be given for any SELECT statement
to have MySQL use a temporary table to buffer the results. You can use
only one of the SQL_*_RESULT keywords in each
statement.
If the MySQL server is not using the query cache by default, you can force its use by including the
SQL_CACHE keyword. If the server does use the query cache by default, you
can use the SQL_NO_CACHE to instruct MySQL
not to use the cache for this particular
SELECT statement. To determine whether the server
uses query cache by default, enter SHOW VARIABLES LIKE
'query_cache_type';. A value of ON
indicates that it is in use.
The last keyword available is SQL_CALC_FOUND_ROWS,
which counts the number of rows that meet the conditions of the
statement. This is not affected by a LIMIT clause.
The results of this count must be retrieved in a separate
SELECT statement with the FOUND_ROWS() function. See the end of
this chapter for information on this function:
SELECT SQL_CALC_FOUND_ROWS name_first, name_last, telephone_home, DATEDIFF(now( ), last_review) AS 'Days Since Last Review' FROM employees WHERE dept = 'sales' ORDER BY last_review DESC LIMIT 10; SELECT FOUND_ROWS();
The first statement retrieves a list of sales people to review, limited to the 10 who have gone the longest without a performance review. The second gets a count of how many employees there are to review in the sales department.
SELECT [flags] {*|columns|expression}[, ...] [INTO OUTFILE '/path/filename' [FIELDS TERMINATED BY 'character'] [FIELDS ENCLOSED BY 'character'] [ESCAPED BY 'character' ] [LINES [STARTING BY 'character'] [TERMINATED BY 'character']] |INTO DUMPFILE '/path/filename' |INTO 'variable'[, ...] [FOR UPDATE|LOCK IN SHARE MODE]] FROMtable[, ...] [WHEREcondition] [other clauses] [options]
The INTO clause is used to export data from a SELECT
statement to an external text file or a variable. Only the results
will be exported, not the column names or other information.
Various clauses set delimiter and control characters in the output:
ESCAPED BYCharacter used to escape special characters in the output. The default is a backslash.
FIELDS ENCLOSED BYCharacter to use before and after each field. By default, no character is used.
FIELDS TERMINATED BYCharacter with which to separate fields. The default is a tab.
LINES STARTING BYCharacter used to start each line. By default, no character is used.
LINES TERMINATED BYCharacter used to end each line. The default is a newline character.
FILE privilege is necessary to use the
INTO clause of the SELECT
statement. This statement and clause combination is essentially the
counterpart of the LOAD DATA INFILE statement. See the
explanation of that statement earlier in this chapter for more details
on the options for this clause. Here is an example of this clause and
these options:
SELECT * FROM employees INTO OUTFILE '/tmp/employees.txt' FIELDS TERMINATED BY '|' LINES TERMINATED BY '\n' ESCAPED BY '\\';
The text file created by this SQL statement will contain a
separate line for each row selected. Each field will end with a
vertical bar. Any special characters (e.g., an apostrophe) will be
preceded by a backslash. Because a backslash is an escape character
within an SQL statement, two backslashes are needed in the
ESCAPE BY clause because the first escapes the
second. To import the resulting data text file, use the FOUND_ROWS() statement.
The second syntax uses the clause INTO DUMPFILE and
exports only one row into an external text file. It does not allow any
field or line terminators like the INTO OUTFILE
clause. Here is an example of its use:
SELECT photograph INTO DUMPFILE '/tmp/bobs_picture.jpeg' FROM employees WHERE emp_id = '1827';
This statement exports the contents of the photograph column for
an employee’s record. It’s a BLOB type column and
contains an image file. The result of the exported file is a complete
and usable image file.
You can also use the INTO clause to store a
value in a user variable or a system variable for reuse. Here’s an
example:
SET @sales = 0; SELECT SUM(total_order) AS Sales INTO @sales FROM orders WHERE YEAR(order_date) = YEAR(CURDATE());
This example creates the user variable
@sales. Then we calculate the total sales for the
current year and store it into that variable for reuse in subsequent
statements in the session.
SELECT [flags] {*|column|expression}[, ...] FROMtable[, ...] [WHEREcondition] [GROUP BY {column|expression|position} [ASC|DESC], ... [WITH ROLLUP]] [other clauses] [options]
A SELECT statement sometimes produces more meaningful results if you group
together rows containing the same value for a particular column. The
GROUP BY clause specifies one or more columns by
which MySQL is to group the data retrieved. This is used with
aggregate functions so that the values of numeric columns for the rows
grouped will be aggregated.
For instance, suppose that a SELECT statement
is to list the sales representatives for a business and their orders
for the month. Without a GROUP BY clause, one line
would be displayed for each sales representative for each order.
Here’s an example of how this might be resolved:
SELECT CONCAT(name_first, ' ', name_last) AS 'Sales Rep.', SUM(total_order) AS 'Sales for Month' FROM orders, employees WHERE employees.emp_id = sales_rep AND MONTH(order_date) = MONTH(CURDATE( )) GROUP BY sales_rep;
This statement concatenates the first and last name of each
sales representative who placed an order for a customer during the
current month. The GROUP BY clause groups together
the rows found for each sales representative. The
SUM() function adds the values of the
total_order column for each row within each group.
See Chapter 10 for more information on the
SUM() function and other aggregate
functions.
You can specify multiple columns in the GROUP
BY clause. Instead of stating a column’s name, you can state
its position in the table, where a value of 1 represents the first
column in the table. Expressions may be given as well.
The GROUP BY clause does its own sorting and cannot be used with the ORDER
BY clause. To set the sorting to ascending order explicitly
for a column, enter the ASC keyword after the
column in the clause that is to be set. This is not necessary, though,
since it is the default setting. To sort in descending order, add
DESC after each column that is to be sorted in
reverse.
When grouping rows by one column, it may be desirable not
only to have a total of the values for certain columns, but also to
display a total for all of the grouped rows at the end of the results
set. To do this, use the WITH ROLLUP keyword. Here
is an example:
SELECT location AS Branch, CONCAT(name_first, ' ', name_last) AS 'Sales Rep.', SUM(total_order) AS 'Sales for Month' FROM orders, employees, branches WHERE sales_rep = employees.emp_id AND MONTH(order_date) = MONTH(CURDATE( )) AND employees.branch_id = branches.branch_id GROUP BY Branch, sales_rep WITH ROLLUP; +---------------+-----------------+-----------------+ | Branch | Sales Rep. | Sales for Month | +---------------+-----------------+-----------------+ | Boston | Ricky Adams | 2472 | | Boston | Morgan Miller | 1600 | | Boston | Morgan Miller | 4072 | | New Orleans | Marie Dyer | 1750 | | New Orleans | Tom Smith | 6407 | | New Orleans | Simone Caporale | 5722 | | New Orleans | Simone Caporale | 13879 | | San Francisco | Geoffrey Dyer | 500 | | San Francisco | Kenneth Dyer | 500 | | San Francisco | Kenneth Dyer | 1000 | | NULL | Kenneth Dyer | 18951 | +---------------+-----------------+-----------------+
This statement groups and adds up the total for each sales representative. When there aren’t any more sales representatives for a branch, a row in the display for the subtotal is generated. It displays the branch name and the name of the last representative. When there are no more branches, a row for the grand total of sales is generated. The branch shows NULL. For clarity, I’ve boldfaced the subtotals and the grand total in the results set.
SELECT [flags] {*|column|expression}[, ...] FROMtable[, ...][WHEREcondition] [GROUP BYcondition] [HAVINGcondition] [other clauses] [options]
The HAVING clause is similar to the WHERE clause, but
it is used for conditions returned by aggregate functions (e.g.,
AVG(), MIN(), and
MAX()). For older versions of MySQL, you must
use aliases for aggregate functions in the main clause of the
SELECT statement. Here is an example of how you can
use this clause:
SELECT CONCAT(name_first, ' ', name_last) AS 'Name', total_order FROM orders JOIN employees ON sales_rep = emp_id JOIN branches USING (branch_id) WHERE location = 'New Orleans' GROUP BY sales_rep HAVING MAX(total_order);
This SQL statement retrieves from the
employees table a list of employee names for all
employees located in the New Orleans branch office. From this list,
the statement refines the results by grouping the data for each
representative together and determines the sum of each one’s
total_order column. Because of the
MAX() function, it displays data only for the
row with the maximum number. The JOIN clause
is described in its own section earlier in this chapter.
SELECT [flags] {*|column|expression}[, ...] FROMtable[, ...] [WHEREcondition] [ORDER BY {column|expression|position} [ASC|DESC], ...] [otherclauses] [options]
The results of a SELECT statement, by
default, are displayed in the order in which the rows of data are
found in the table, which may be the order in which they were entered
into the table. To change the order of a results set, use the
ORDER BY clause. As a basis for ordering the
results, list one or more columns separated by commas. The order in
which columns are listed is the order in which sorts will be
conducted. You can also use aliases for columns, column combinations,
or expressions that were established earlier in the same
SELECT statement. Instead of stating a column’s
name, you can also state its position, where a value of 1 represents
the first column in the table. Here is an example of a
SELECT statement using the ORDER
BY clause:
SELECT CONCAT(name_first, ' ', name_last) AS Name, MONTH(birth_date) AS 'Birth Month', email_address FROM employees ORDER BY 'Birth Month' ASC, Name ASC;
Here a list of employees, the months in which they were born,
and their email addresses are extracted. For the name, the
CONCAT() function is used to put the first
and last name together, separated by a space. The
AS clause establishes an alias of
Name. The MONTH() function is used to
extract the month from the birth_date column, and
the AS clause sets up the alias Birth Month. In
the ORDER BY clause, the alias for the birth date
is used for the initial sort and the name is used for the secondary
sort. The result will be that all of the employees who have a birth
date in the same month will be listed together and in alphabetical
order by name. Both aliases are followed by the ASC
keyword to indicate that the results should be sorted in ascending
order. This is unnecessary, as ascending order is the default.
However, to change an ordering method to descending, use the
DESC keyword.
You can also order the results using expressions, which may be based on columns or aliases.
Here is an example of a SELECT statement using an
expression for ordering:
SELECT CONCAT(name_first, ' ', name_last) AS name, pay_rate, hours FROM employees ORDER BY pay_rate * hours DESC;
In this example, the first and last names are selected and
concatenated together under the name column heading
in the results set. The pay_rate column lists the
hourly dollar rate an employee is paid, and the
hours column contains the typical number of hours a
week that an employee works. In the ORDER BY
clause, the product of the hourly pay rate and the number of hours is
determined for the ordering of the results set. The rows are to be
listed in descending order per the DESC keyword
based on the expression.
SELECT [flags] {*|column|expression}[, ...] FROMtable[, ...] [WHEREcondition] [other clauses] [LIMIT {[offset,]count|countOFFSEToffset}] [PROCEDUREprocedure(arguments)] [FOR UPDATE|LOCK IN SHARE MODE]] [otherclauses] [options]
The LIMIT clause is used to limit the number of rows displayed by the
SELECT statement. The most straightforward method
of limiting the number of rows is to specify the maximum row count to be displayed, like
this:
SELECT * FROM employees LIMIT 5;
To begin listing rows after a specific number of records, an
offset may be given. The offset for the first row is 0. Two formats
accomplish this. One gives the amount of the offset, followed by a
comma and then the maximum count of rows to display. The other syntax
structure specifies the count, followed by the
OFFSET keyword, followed by the amount of the
offset. Here is an example of the first structure, which is
preferred:
SELECT * FROM employees LIMIT 10, 5;
In this example, after the 10th record is reached, the next 5
records will be displayed—in other words, results 11 through 15 are
returned. The offset and count for the LIMIT clause
are based on the rows in the results set, not necessarily on the rows
in the tables. So the amount of the offset is related to the order of
the rows retrieved from the tables based on clauses, such as the
WHERE clause and the ORDER BY
clause. See the description of the LIMIT clause
earlier in this chapter for more details.
SELECT [flags] {*|column|expression}[, ...] FROMtable[, ...] [WHEREcondition] [otherclauses] [PROCEDUREprocedure(arguments)] [LOCK IN SHARE MODE|FOR UPDATE]
To send the results of a SELECT statement as
standard input to a procedure, use the PROCEDURE clause. The
PROCEDURE keyword is followed by the name of the
procedure, which can be followed by parentheses containing parameters
to be passed to the procedure. Here is an example:
SELECT * FROM employees PROCEDURE ANALYSE(10, 225);
In this statement, the results of the SELECT
statement are sent to the built-in function ANALYSE() along with
two numeric parameters. See ANALYSE() near the end
of this chapter for more information on this function.
To lock the rows that are being selected from a table, LOCK IN SHARE MODE may be
given at the end of the SELECT statement. This
prevents other clients from changing the data while the
SELECT statement is running. The FOR
UPDATE option instructs MySQL to invoke a temporary write
lock on the rows being selected. Both of these locks will be
terminated when the statement is finished running.