< select statement: single row > ::= SELECT [ < set quantifier > ] < select list > INTO < select target list > < table expression > < select target list > ::= < target specification > [ { < comma > < target specification > }... ] < query expression > ::= < non-join query expression > | < joined table > < non-join query expression > ::= < non-join query term > | < query expression > UNION [ ALL ] [ < corresponding spec > ] < query term > | < query expression > EXCEPT [ ALL ] [ < corresponding spec > ] < query term > < non-join query term > ::= < non-join query primary > | < query term > INTERSECT [ ALL ] [ < corresponding spec > ] < query primary > < non-join query primary > ::= < simple table > | < left parent > < non-join query expression > < right parent > < simple table > ::= < query specification > | < table value constructor > | < explicit table > < query specification > ::= SELECT [ < set quantifier > ] < select list > < table expression > < select list > ::= < asterisk > | < select sublist > [ { < comma > < select sublist > }... ] < select sublist > ::= < derived column > | < qualifier > < period > < asterisk > < derived column > ::= < value expression > [ < as clause > ] < as clause > ::= [ AS ] < column name > < table expression > ::= < from clause > [ < where clause > ] [ < group by clause > ] [ < having clause > ] < from clause > ::= FROM < table reference > [ { < comma > < table reference > }... ] < table reference > ::= < table name > [ [ AS ] < correlation name > [ < left parent > < derived column list > < right parent > ] ] | < derived table > [ AS ] < correlation name > [ < left parent > < derived column list > < right parent > ] | < joined table > < derived column list > ::= < column name list > < derived table > ::= < table subquery > < table subquery > ::= < subquery > < joined table > ::= < cross join > | < qualified join > | < left parent > < joined table > < right parent > < cross join > ::= < table reference > CROSS JOIN < table reference > < qualified join > ::= < table reference > [ NATURAL ] [ < join type > ] JOIN < table reference > [ < join specification > ] < join type > ::= INNER | < outer join type > [ OUTER ] | UNION < outer join type > ::= LEFT | RIGHT | FULL < join specification > ::= < join condition > | < named columns join > < join condition > ::= ON < search condition > < named columns join > ::= USING < left parent > < join column list > < right parent > < join column list > ::= < column name list > < where clause > ::= WHERE < search condition > < group by clause > ::= GROUP BY < grouping column reference list > < grouping column reference list > ::= < grouping column reference > [ { < comma > < grouping column reference > }... ] < grouping column reference > ::= < column reference > [ < collate clause > ]
The SELECT statement is the principal means of information retrieval in SQL. A SELECT can retrieve information from one or more tables with arbitrary search criteria. SELECT's can also be nested to have the output of one serve as an input or search condition for another. Several SELECT's can be combined into one query with the UNION, INTERSECTION and EXCEPT operators.
The SELECT syntax consists of the following parts:
SELECT [DISTINCT] scalar_exp {, scalar_exp} FROM table {, table} WHERE < search condition > GROUP BY < column list > HAVING < search condition > ORDER BY < ordering spec list > FOR UPDATE
All parts are optional. If one or more of the clauses appear they must appear in the above order. All parts do not need to be specified, e.g. SELECT A FROM T FOR UPDATE is valid but SELECT A FROM T ORDER BY a WHERE < < 10 is not.
A select without a FROM clause is allowed.
This is useful for returning values of expressions to the client. Such a select always returns one row, with the values listed as columns. Typically only useful from interactive SQL.
Example:
select 1 + 2 as three;
A table reference in the FROM clause can either be a simple table name, another SELECT expression of the form described above or a join expression. A SELECT inside a FROM is called a derived table. This means that the rows selected by the derived table expression are treated as if they constituted a table. This is similar to a VIEW reference and a derived table can be thought of as an unnamed in-line VIEW declaration.
A join expression combines table references, which are either simple, derived or joined tables themselves into different joins.
A join is an operation that retrieves for each row of one table zero or more rows from another table. The join condition specifies how the rows are matched. The result of a join is a set of rows containing selected columns from both joined tables. Joins are by default so called INNER joins, which means that for a row to be in the result there must be a row matching the left table in the right table as specified by the join condition. An OUTER join is a join that will produce a result row even if there is no row in the right table for the row in the left table. The columns that would have come from the right table are then just set to NULL's.
table_ref ::= < table name > [< correlation name >] | / query expression ) < correlation name > | < table ref > < [NATURAL] join > < table ref > < join condition > join ::= < empty > | CROSS | INNER | LEFT [OUTER] | RIGHT [OUTER] join condition ::= < empty > | ON < search condition > | USING '(' < column > {, column} ')'
The < correlation name > is an identifier that is used to identify the table in a column reference if the same table appears many times in the query expression, e.g. is joined with itself.
The CROSS join has no join condition. This means that for each row in the left table all rows in the right table are included in the result.
Virtuoso supports the AS operator in the selection list of a SELECT statement. This notation allows declaring a name and optionally a type and precision for a statement's output column.
The syntax is:
as_exp: Scalar_exp AS NAME opt_data_type
For instance, one can write:
SELECT COUNT (*) AS NUMBEROFROWS FROM XX; SELECT COUNT (*) AS NUMBEROFROWS INTEGER (2) FROM XX; SELECT CONCATENATE (COL1, COL2) AS RESULTSTRING VARCHAR (50) FROM XX;
The AS expression tells the client to return specified values in the SQLDescribeCol, SQLColAttribute or equivalent calls. The semantics of statements is not otherwise affected. An AS expression can appear anywhere a scalar expression can but the only place where it has an effect is the selection list of a SELECT statement.
If a data type is given and contains a precision, that precision is returned to the client as the precision of the column in question.
The following three statements produce an identical result.
select Orders.OrderID, ProductID from Orders natural join Order_Details using (OrderID) select Orders.OrderID, ProductID from Orders join Order_Details on Orders.OrderID = Order_Details.OrderID select Orders.OrderID, ProductID from Orders, Order_Details where Orders.OrderID = Order_Details.OrderID
In all these cases if there exists no Order_Details row matching the Orders row there will no no result row corresponding to the Orders row. An outer join can can be used to also retrieve left table records for which there is no matching right table record.
select Orders.OrderID, ProductID from Orders natural left outer join Order_Details using (OrderID)
will produce a result identical to the above sample if for each Orders row there is at least one Order_Details row. If there is none however, the OrderID column from Orders will appear together with a NULL ProductID from the non-existent Order_Details.
A right outer join is like a left outer join with the left and right tables reversed.
The result rows of a query can be ordered based on their column values. The ORDER BY phrase allows specifying an ascending or descending sort order for a any column. The SQL interpreter will use an index if there is an index whose order reflects the order in the ORDER BY clause. If there is no appropriate index or if ascending and descending order is combined for columns of the same table the SQL interpreter will first evaluate the query and then sort the results before returning them.
Optimizations below for more information.
select * from Employees order by BirthDate;
will list all employees, oldest first, in ascending order of birth date.
The GROUP BY clause allows computing functions over repeating groups. Without the GROUP by clause set functions (AVG, MIN, MAX, SUM, COUNT) may not be mixed with normal columns in a selection list. If set functions and columns are mixed, all the columns must appear in the GROUP BY section. Such a query will produce as many rows as there are distinct value combinations of the grouping columns. The set functions will be computed for each distinct column combination.
select OrderID, sum (UnitPrice * Quantity) from Order_Details group by OrderID having sum (UnitPrice * Quantity) > 5000 order by 2 desc;
Produces the OrderID and total value of the order in decreasing order of order value. The HAVING clause specifies that only orders with a value > 5000 will be counted. Note that the sum expression in having must be written identically to the same expression in the SELECT left.
The 2 in the order by refers to the second column of the select, which has no name, it being a function reference.
Virtuoso database offers the tool which increase efficiency of SQL summary queries and simplify such operations. The options ROLLUP and CUBE in the GROUP BY allow creating more comprehensive summary operations. The result of CUBE and ROLLUP are result sets which could be produced in other way only with additional coding and queries.
The ROLLUP and CUBE extends the result set of GROUP BY. The ROLLUP builds consequence of subtotal aggregates on every queried level including the grand total. The CUBE is an extension of ROLLUP. It builds all possible subtotal aggregates combination for given GROUP BY.
Here are the examples of ROLLUP and CUBE usage:
select j, grouping (j), k, grouping (k), t, grouping (t), sum (i) from TABLE1 group by rollup (j,k,t); select j, grouping (j), k, grouping (k), t, grouping (t), sum (i) from ROLLUP1 group by cube (j,k,t);
the grouping(c1) procedure returns "1" if the column "c1" is not in the dynamic GROUP BY set, and returns "0" otherwise.
The result set of:
select j, k, t, sum (i) from TABLE1 group by rollup (j,k,t) ;
is equivalent of cumulate result sets of:
select j, k, t, sum (i) from TABLE1 group by j,k,t; select NULL, k, t, sum (i) from TABLE1 group by k,t; select NULL, NULL, t, sum (i) from TABLE1 group t; select NULL, NULL, NULL, sum (i) from TABLE1;
The result set of:
select t,s, sum (i) from TABLE1 group by cube (t,s);
is equivalent of cumulate result set of:
select t,s, sum (i) from TABLE1 group by t,s; select t,NULL, sum (i) from TABLE1 group by t; select s,NULL, sum (i) from TABLE1 group by s; select NULL,NULL, sum (i) from TABLE1;
A SELECT expression may be used in the place of a table in a FROM clause. This provides control over where DISTINCT and ORDER BY operations are evaluated.
select ProductName, UnitsInStock from (select distinct ProductID from Order_Details) O, Products where Products.ProductID = O.ProductID;
This retrieves the name and quantity of products that have been ordered.
An equivalent phrasing would be
select distinct ProductName, UnitsInStock from Order_Details O, Products where Products.ProductID = O.ProductID;
The difference is that the latter retrieves a Products row for each order line whereas as the first retrieves a products row for each distinct product in the order lines. The first is therefore faster to evaluate. Also note that the rows in the DISTINCT buffer in the first example only consist of the product id whereas they are much longer in the second example.
Note that a correlation name is required for derived tables since the derived table is as such anonymous.
< non-join query expression > ::= < non-join query term > | < query expression > UNION [ ALL ] [ < corresponding spec > ] < query term > | < query expression > EXCEPT [ ALL ] [ < corresponding spec > ] < query term > < corresponding spec > ::= CORRESPONDING [ BY < left parent > < corresponding column list > < right parent > ]
Queries can be combined by set operators UNION, INTERSECTION and EXCEPT (set difference). The ALL keyword will allow duplicate rows in the result set. The CORRESPONDING BY clause allows specifying which columns will be used to determine the equality of rows from the left and right operands.
select OrderID from Orders except corresponding by (OrderID) select OrderID from Order_Details
will produce the OrderID's of orders that have no Order_Details. This is equivalent to: select OrderID from Orders a where not exists (select 1 from Order_Details b where a.OrderID = b.OrderID)
Note that the queries, although to a similar effect are executed quite differently. There may be significant differences in performance.
The LIKE predicate expects a pattern to be applied to a varchar or nvarchar column to qualify the results to be returned from a query.
If the pattern does not begin with an at-sign (@) or with two asterisks (**), then we test the equality of the string and pattern with ordinary wildcard matching, which behaves approximately like the filename pattern matching in the Unix shell. (But not like the regular expression matching in utilities like grep and sed).
The following characters have special significance in the pattern:
[ ] (Called a group-expression here) Matches any one of the enclosed characters, unless the first character following the opening [ is ^, then matches only if the character (in the datum string) is not any one of those specified after the ^. (i.e. the ^ negates the meaning of this expression.)
You can use character ranges like 0-9 (shorthand for 0123456789) inside the brackets, in which case the character in the datum string must be lexically within the inclusive range of that pair (of course the character at the left side of hyphen must be lexically (that is, its ASCII value) less than the character at the right side).
The hyphen can be included in the character set by putting it as the first or last character. The right bracket (]) can be included by putting it as the first character in the expression, i.e. immediately after the opening bracket ([) or the caret (^) following it.
[abc] Matches any of the letters a, b and c. [^0123456789] Matches anything, except digits. (same as [^0-9]) [[] Matches [ []] Matches ] [][] Matches ] and [ [^]] Matches anything except ] [A-Za-z0-9] Matches all the alphanumeric characters. [-*+/] Matches the four basic arithmetic operators. [-] Matches to single hyphen. []-] Matches to ] or - [-[] or [[-] Matches to - or [
That is, the hyphen indicates a range between characters, unless it is the first or the last character in the group expression, in which case it matches just to itself.
@ Matches the character last matched to ? or group-expression. For example ?*@ matches to all strings which begin with the same character they end. However, if there is neither ? nor [] expression at the left side of @ in the pattern, then @ matches just to itself. (e.g. *@* should match to all e-mail addresses).
Any other characters match ONLY to themselves, that is, not even to the upper- or lowercase variants of the same letter. Use expression like [Wo][Oo][Rr][Dd] if you want to find any mixed-case variant of the word "word", or use the substring search explained below.
However, if the pattern begins with an at-sign (@) then we compare the rest of pattern to string with the fuzzy matching, allowing differences of few characters in quality and quantity (length). If there is more than one @ in the beginning of pattern they are all skipped, and so many additional liberties are given for the match function. The more @-signs there are in the beginning, the more fuzzy (liberal) is the search. For example: pattern "@Johnson" will match to string "Jonsson" and pattern "@@Johnson" will match also to "Jansson".
If the pattern begins with two asterisks, then we do diacritic- and case insensitive substring search, trying to find the string given in the rest of pattern from the datum string.
"**escort" will match to "Ford Escort vm. 1975".
If there are any ISO8859.1 diacritic letters (e.g. vowels with accents or umlaut-signs, or letters like the Spanish n with ~ (tilde)) present in the datum string, then the plain unaccented (7-bit ASCII) variant of the same letter in the pattern string will match to it. But if there are any diacritic letter specified in the pattern string, then it will match only to the upper- or lowercase variant of exactly the same diacritic letter.
The rationale behind this is that the people entering the information to database can use the exact spelling for the word, for example writing the word "Citroen" with the umlaut-e (e with two dots above it), as it is actually written in French, and the people who search for the Citroens can still find it without need to remember the exact orthography of the French, by just giving a word "citroen". And this allows also the people who have just plain 7-bit ASCII keyboards to search for the words like Ra"a"kkyla" (place in Finland, a" means umlaut-a, i.e. a with two dots above it), just by entering the word raakkyla.
So the following holds with the substring searches:
because the internal matching functions use macros which consider also the characters like: @, [, \, ], and ^ to be letters, they will match against characters `, {, |, }, and ~ respectively, which is just all right, because in some older implementations of European character sets those characters mark the uppercase and lowercase variants of certain diacritic letters.
It is generally better to match too liberally and so maybe sometimes give something entirely off the wall to the user, than to miss something important because of too strict criteria.
Of course, when searching from the data which contains text in some wide-character format (like certain coding systems for Japanese and Chinese where one character is coded with two bytes) neither fuzzy matching function nor nc_strstr function presented here should be used, as they would often match on entirely spurious cases.
query_term : SELECT opt_top selection .... opt_top : opt_all_distinct [ TOP INTNUM ] | opt_all_distinct [ TOP SKIPINTNUM, INTNUM ] | opt_all_distinct [ TOP (num_scalar_exp) ] | opt_all_distinct [ TOP (skip_num_scalar_exp, num_scalar_exp) ] opt_all_distinct : [ ALL | DISTINCT ]
The TOP n phrase can follow an optional ALL or DISTINCT keyword in a SELECT, either at top level or inside a query term of an expression or subquery. The query with the TOP option will generate at most the indicated number of result rows.
The scope of TOP is the query in question. For example
select top 3 row_no from t1 best union select top 3 row_no + 1 from t1;
Will always return 4 rows assuming there are at least 3 rows in T1.
The optional SKIPINTNUM parameter lets you offset the selection by SKIPINTNUM number of rows. If you have a ten-row table and select top 2 from this_table you get the first two rows, select top 2, 2 from this_table will return the third and fourth rows only, instead.
There are many situations where you might find it useful to alter the the data returned by a SQL query based on a few rules. For example, you may want to display Customers gender as 'Male' or 'Female' based on whether their title is 'Mr' or one of 'Miss', 'Mrs' or 'Ms'. The CASE expression can easily accommodate this.
The Syntax of CASE is:
CASE WHEN <search-condition> THEN <output> WHEN <search-condition> THEN <output> ... ELSE <output> END
When a <search-condition> is met the corresponding <output> is returned. If no conditions are met then the <output> after is ELSE is returned as a default value.
SELECT Title, CustomerName, CASE WHEN Title = 'Mr' THEN 'Male' WHEN Title = 'Mrs' THEN 'Female' WHEN Title = 'Miss' THEN Female' WHEN Title = 'Ms' THEN 'Female' ELSE 'Unknown' END as Gender, Company FROM Customers
May return values such as:
Title Gender CustomerName Company VARCHAR VARCHAR VARCHAR VARCHAR ___________________________________________________________ Mr Male Thomas Hardy Around the Horn Miss Female Christina Berglund Berglunds shop Mrs Female Hanna Moos Blauer See Delikatessen Mr Male Laurence Lebihan Bon app
There is also a short hand notation for the CASE expression as follows:
CASE <search-parameter> WHEN <search-value> THEN <output> WHEN <search-value> THEN <output> ... ELSE <output> END
This short hand is best demonstrated by the rewrite of the above example as follows:
SELECT Title, CustomerName, CASE Title WHEN 'Mr' THEN 'Male' WHEN 'Mrs' THEN 'Female' WHEN 'Miss' THEN Female' WHEN 'Ms' THEN 'Female' ELSE 'Unknown' END as Gender, Company FROM Customers
In both cases the ELSE keyword is optional. If ELSE is unspecified then ELSE NULL is implicit.
The NULLIF expression is a short hand implementation of a special case of the CASE expression for a popular demand. Consider the following CASE expression:
CASE col1 WHEN 'something' THEN NULL ELSE col1 END
This is replaced by the NULLIF expression which achieves the same result using the following, much shorter expression:
NULLIF (col1, 'something')
This is often useful in situations where you have a code to denote a value as unspecified for whatever reason, but in many applications you would rather this was NULL.
The COALESCE expression is another application of the CASE expression to suit another frequent requirement. The syntax of COALESCE is as follows:
COALESCE (value-1, value-2, ..., value-n)
COALESCE returns the first non-NULL parameter. This is equivalent to
CASE WHEN value-1 IS NOT NULL THEN value-1 WHEN value-2 IS NOT NULL THEN value-2 ... ELSE value-n END
SQL has always been considered a strongly typed language, meaning that you cannot have expressions that contain arbitrary data types. Casting is invaluable for comparing values that are obviously compatible but their data types are not, such as 1 = '1'. This attempts compares an integer with a char which would not work unless one of the values was cast as follows:
cast('1' as integer) = 1
The CASTING section for more information.
Virtuoso extends the select statement with a breakup option. This option allows a single row selected by a derived table to be seen as multiple rows by the enclosing query.
This is specially useful when translating relation tables to RDF. Using breakup, one can do a single pass over a table and generate multiple triples, each presented as a separate result row.
The syntax is:
SELECT BREAKUP breakup_term [, ...] FROM .... breakup_term ::= scalar_exp [, scalar_exp...] [WHERE search_condition]
Each breakup term is a list of comma separated expressions with an optional search condition at the end. Each list is treated as a select list in a union, i.e. they must be of equal length and the leftmost list must provide a name for each column. This means that an AS declaration is needed if the expression is not a column.
If a breakup tern has the optional WHERE clause, the condition is evaluated in the scope of the select, i.e. all that is defined by the FROM. If the condition is true, the row represented by the breakup term is added to the result set of the breakup select, otherwise it is ignored.
A breakup select is only allowed in a derived table or a union or other set operator term inside a derived table. A top level breakup select is not allowed. To have a breakup select as the topmost select, simply write:
select * from (select breakup .... from ...) f;
Breakup cannot be mixed with distinct, top, group by or order by. Again, to combine these, use nested derived tables.
Breakup operates equally well on local and remote tables. Breakup is never passed on to a remote but the FROM of a breakup select can consist of tables from any source.
select * from (select breakup (a.row_no, b.fi2) (b.row_no, a.fi3 where a.fi3 is not null) from r1..t1 a, r1..t1 b where a.row_no < 10 and b.row_no = a.row_no)f;
This produces 2 rows for each result of the join, except if fi3 is null, in which case only the first term of the breakup is returned in the result set.
Previous
UPDATE Statement |
Chapter Contents |
Next
COMMIT WORK, ROLLBACK WORK Statement |