M ATH IN SQL A GGREGATION O PERATORS Operators on sets of tuples. Significant extension of relational algebra. SUM ( [DISTINCT] A): the sum of all (unique) values in attribute A. AVG ( [DISTINCT] A): the average of all (unique) values in attribute A. SELECT AVG (S.age) FROM Sailors S; SELECT AVG ( DISTINCT S.age) FROM Sailors S WHERE S.rating=10; 2 A GGREGATION O PERATORS Operators on sets of tuples. Significant extension of relational algebra. MAX (A): the maximum value in attribute A. MIN (A): the minimum value in attribute A. SELECT MAX (rating) FROM Sailors; SELECT S.sname FROM Sailors S WHERE S.rating= ( SELECT MAX (S2.rating) FROM Sailors S2); 3 3 3
A GGREGATION O PERATORS Operators on sets of tuples. Significant extension of relational algebra. COUNT (*): the number of tuples. SELECT COUNT (*) FROM Sailors S 4 A GGREGATION O PERATORS Operators on sets of tuples. Significant extension of relational algebra. COUNT ( [DISTINCT] A): the number of (unique) values in attribute A. SELECT COUNT ( DISTINCT S.rating) FROM Sailors S WHERE S.sname=‘Bob’; 5 A GGREGATION O PERATORS Find name and age of the oldest sailor(s). SELECT S.sname, MAX (S.age) The first query looks FROM Sailors S; correct, but is illegal. Thoughts as to why? The second query is a SELECT S.sname, S.age correct and legal FROM Sailors S solution. WHERE S.age = ( SELECT MAX (S2.age) FROM Sailors S2); 6 6 6
GROUP BY AND HAVING So far, we’ve applied aggregation operatorsto all (qualifying) tuples. Sometimes, we want to apply them to each of several groups of tuples. Find the age of the youngest sailor for each rating value. Suppose we know that rating values go from 1 to 10; we can write ten (!) queries that look like this: For i = 1, 2, ... , 10: SELECT MIN (S.age) FROM Sailors S WHERE S.rating= i ; But in general, we don’t know how many rating values exist, and what these rating values are. Plus, it’s a waste of time to write so many queries 7 GROUP BY AND HAVING [DISTINCT] target-list SELECT relation-list FROM Notice the qualification WHERE notation GROUP BY grouping-list group-qualification HAVING A group is a set of tuples that have the same value for all attributes grouping-list . The target-list contains attribute names terms with aggregation operations. Attribute list must be a subset of grouping-list. Each answer tuple correspondsto a group, and output attributes must have a single value per group. 8 C ONCEPTUAL E VALUATION Given: SELECT S.rating, MIN(S.age) as minage FROM Sailors S WHERE S.age >= 18 GROUP BY S.rating HAVING COUNT(*) > 1 Step 1 The cross-product of relation-list is computed In this instance, it’s only Sailors 9 9 9
C ONCEPTUAL E VALUATION Given: SELECT S.rating, MIN(S.age) as minage FROM Sailors S WHERE S.age >= 18 GROUP BY S.rating HAVING COUNT(*) > 1 Step 2 Tuples that fail qualification are discarded ‘unnecessary’ attributes are deleted 10 10 10 C ONCEPTUAL E VALUATION Given: SELECT S.rating, MIN(S.age) as minage FROM Sailors S WHERE S.age >= 18 GROUP BY S.rating HAVING COUNT(*) > 1 Step 3 Remaining tuples are partitioned into groups by the value of attributes in grouping-list 11 11 11 C ONCEPTUAL E VALUATION Given: SELECT S.rating, MIN(S.age) as minage FROM Sailors S WHERE S.age >= 18 GROUP BY S.rating HAVING COUNT(*) > 1 Step 4 The group-qualification is then applied to eliminate groups that do not satisfy this condition. 12 12 12
C ONCEPTUAL E VALUATION Given: SELECT S.rating, MIN(S.age) as minage FROM Sailors S WHERE S.age >= 18 GROUP BY S.rating HAVING COUNT(*) > 1 Step 5 One answer tuple is generated per qualifying group by applying the aggregation operator. 13 13 13 GROUP BY AND HAVING Find the age of the youngest sailor with age 18, for each sid sname rating age rating with at least 2 such sailors. 22 dustin 7 45.0 SELECT S.rating, MIN 31 lubber 8 55.5 (S.age) 71 zorba 10 16.0 FROM Sailors S WHERE S.age >= 18 64 horatio 7 35.0 GROUP BY S.rating 29 brutus 1 33.0 HAVING COUNT (*) > 1; 58 rusty 10 35.0 Only S.rating and S.age are mentioned in the SELECT, rating age GROUP BY or HAVING clauses; 1 33.0 other attributes ` unnecessary ’. Answer relation 7 45.0 2nd column of result is rating 7 35.0 unnamed 7 35.0 8 55.5 What to do? 10 35.0 14 14 14 GROUP BY AND HAVING For each red boat, find the number of reservations for this boat. SELECT B.bid, COUNT (*) AS scount FROM Sailors S, Boats B, Reserves R WHERE S.sid=R.sid AND R.bid=B.bid AND B.color=‘red’ GROUP BY B.bid; Grouping over a join of three relations. What do we get if we remove B.color=‘red’ from the WHERE clause and add a HAVING clause with this condition? What if we drop Sailors and the condition involving S.sid? 15 15 15
GROUP BY AND HAVING Find the age of the youngest sailor with age > 18, for each rating with at least 2 sailors (of any age). SELECT S.rating, MIN (S.age) FROM Sailors S WHERE S.age > 18 GROUP BY S.rating HAVING 1 < (SELECT COUNT (*) FROM Sailors S2 WHERE S.rating=S2.rating); Shows HAVINGclause can also contain a subquery. What if HAVING clause is replaced by: HAVINGCOUNT(*) >1 16 16 16 GROUP BY AND HAVING Find those ratings for which the average age is the minimum over all ratings. Aggregation operations cannot be nested! WRONG: SELECT S.rating FROM Sailors S WHERE S.age = (SELECT MIN (AVG (S2.age)) FROM Sailors S2); Correctsolution: SELECT Temp.rating, Temp.avgage FROM (SELECT S.rating, AVG (S.age) AS avgage FROM Sailors S GROUP BY S.rating) AS Temp WHERE Temp.avgage = (SELECT MIN (Temp.avgage) FROM Temp); 17 17 17 ORDERING & TOP/BOTTOM
ORDER BY The ORDER BY keyword is used to sort the result-set by a specified column. The ORDER BY keyword sort the recordsin ascending order by default. If you want to sort the records in a descending order, you can use the DESC keyword. 19 19 19 TOP/BOTTOM The TOP clause is used to specify the number of records to return. The TOP clause can be very useful on large tables with thousands of records Returning a large number of records can impact on performance Can ‘sample’ the table using TOP Not all database systems support the TOP clause or implement it in different fashion 20 20 20 TOP/BOTTOM SQL Server SELECT TOP number|percent column_name(s) FROM table_name Ex: SELECT TOP 5 * FROM Persons MySQL SELECT column_name(s) FROM table_name LIMIT number Ex: SELECT * FROM Persons LIMIT 5 21 21 21
TOP/BOTTOM Oracle SELECT column_name(s) FROM table_name WHERE ROWNUM <= number Ex: SELECT * FROM Persons WHERE ROWNUM <=5 DB2 SELECT column_name(s) FROM table_name FETCH FIRST number ROWS ONL Y Ex: SELECT * FROM Persons FETCH FIRST 5 ROWS ONL Y 22 22 22 TOP/BOTTOM Can specify Fixed number SELECT TOP 10 * … A percent SELECT TOP 10 PERCENT * … 23 23 23 TOP/BOTTOM How to return the oldest 5 rentals? How to return the newest 5 rentals? 24 24 24
TOP/BOTTOM How to return the 3 rd newest rental? 25 25 25 S UMMARY SQL was an important factor in the early acceptance of the relational model; more natural than earlier, procedural query languages. All queries that can be expressed in relational algebra can also be formulated in SQL. In addition, SQL has significantly more expressive power than relational algebra, in particular aggregation operationsand grouping. Many alternative ways to write a query; query optimizer looks for most efficient evaluation plan. In practice, users need to be aware of how queries are optimized and evaluated for most efficient results. 26 26 26
Recommend
More recommend