SQL query parse parse tree CS 525: Advanced Database convert answer Organisation logical query plan execute apply laws 08: Query Processing statistics Pi “ improved ” l.q.p Parsing and Analysis pick best estimate result sizes {(P1,C1),(P2,C2)...} Boris Glavic l.q.p. +sizes estimate costs consider physical plans Slides: adapted from a course taught by Hector Garcia-Molina, Stanford InfoLab {P1,P2,…..} CS 525 Notes 8 - Parsing and Analysis 1 CS 525 Notes 8 - Parsing and Analysis 2 Parsing, Analysis, Conversion Analysis and Conversion 1. Parsing – Transform SQL text into syntax tree • Usually intertwined 2. Analysis • The internal representation is used to – Check for semantic correctness store analysis information – Use database catalog • Create an initial representation and – E.g., unfold views, lookup functions and attributes, check scopes complete during analysis 3. Conversion – Transform into internal representation – Relational algebra or QBM CS 525 Notes 8 - Parsing and Analysis CS 525 Notes 8 - Parsing and Analysis 3 4 Parsing, Analysis, Conversion Parsing 1. Parsing 2. Analysis • SQL -> Parse Tree 3. Conversion • Covered in compiler courses and books • Here only short overview CS 525 Notes 8 - Parsing and Analysis 5 CS 525 Notes 8 - Parsing and Analysis 6 1
Example: SQL query SQL Standard SELECT title � FROM StarsIn � • Standardized language WHERE starName IN ( � – 86, 89, 92, 99, 03, 06, 08, 11 � � SELECT name � • DBMS vendors developed their own � � FROM MovieStar � � � WHERE birthdate LIKE ‘ %1960 ’ � dialects ); � (Find the movies with stars born in 1960) CS 525 Notes 8 - Parsing and Analysis 7 CS 525 Notes 8 - Parsing and Analysis 8 Example: Parse Tree SQL Query Structure <Query> <Query Block> • Organized in Query blocks SELECT <SelList> FROM <FromList> WHERE <Condition> SELECT <select_list> � <Attribute> <RelName> <Tuple> IN <Query> FROM <from_list> � title StarsIn <Attribute> ( <Query> ) WHERE <where_condition> � starName <Query Block> GROUP BY <group_by_expressions> � SELECT <SelList> FROM <FromList> WHERE <Condition> HAVING <having_condition> � <Attribute> <RelName> <Attribute> LIKE <Pattern> ORDER BY <order_by_expressions> � name MovieStar birthDate ‘ %1960 ’ CS 525 Notes 8 - Parsing and Analysis CS 525 Notes 8 - Parsing and Analysis 9 10 Query Blocks SELECT clause • Only SELECT clause is mandatory • List of expressions and optional name assignment + optional DISTINCT – Some DBMS require FROM – Attribute references: R.a, b – Constants: 1, ‘hello’, ‘2008-01-20’ SELECT (1 + 2) AS result – Operators: (R.a + 3) * 2 result – Functions (maybe UDF): substr(R.a, 1,3) 3 • Single result or set functions – Renaming: (R.a + 2) AS x CS 525 Notes 8 - Parsing and Analysis 11 CS 525 Notes 8 - Parsing and Analysis 12 2
� SELECT clause - example SELECT clause – set functions • Function extrChar(string) � SELECT substring(p.name,1,1) AS initial � p.name � result FROM person p � n SELECT extrChar(p.name) AS n � J FROM person p � o � � e person result person J initial name name gender name gender i J Joe Joe male Joe male m Jim male J Jim Jim male CS 525 Notes 8 - Parsing and Analysis 13 CS 525 Notes 8 - Parsing and Analysis 14 SELECT clause – DISTINCT FROM clause • List of table expressions SELECT DISTINCT gender � – Access to relations FROM person p � – Subqueries (need alias) – Join expressions – Table functions result person – Renaming of relations and columns gender name gender Joe male male Jim male CS 525 Notes 8 - Parsing and Analysis CS 525 Notes 8 - Parsing and Analysis 15 16 FROM clause examples FROM clause examples FROM R � FROM R x(c,d) � -access table R FROM R AS x(c,d) � -using aliases x for R and c,d for its attribues FROM R, S � -access tables R and S FROM (R JOIN S t ON (R.a = t.b)), T � FROM R JOIN S ON (R.a = S.b) � -join R and S, and access T -join tables R and S on condition (R.a = S.b) FROM (R JOIN S ON (R.a = S.b)) JOIN T � FROM R x � -join tables R and S and result with T FROM R AS x � FROM create_sequence(1,100) AS seq(a) � -Access table R and assign alias ‘x’ -call table function CS 525 Notes 8 - Parsing and Analysis 17 CS 525 Notes 8 - Parsing and Analysis 18 3
FROM clause examples FROM clause examples FROM � SELECT * � � (SELECT count(*) FROM employee) FROM create_sequence(1,3) AS seq(a) � � AS empcnt(cnt) � result a -count number of employee in subquery 1 2 3 CS 525 Notes 8 - Parsing and Analysis 19 CS 525 Notes 8 - Parsing and Analysis 20 FROM clause examples FROM clause - correlation SELECT dep, headcnt � • Correlation FROM (SELECT count(*) AS headcnt, dep � � FROM employee � – Reference attributes from other FROM � GROUP BY dep) � clause item WHERE headcnt > 100 � – Attributes of i th entry only available in j > I result employee – Semantics: dep headcnt name dep IT 103 • For each row in result of i th entry: Joe IT Support 2506 • Substitute correlated attributes with value from Jim Marketing … … current row and evaluate query … … CS 525 Notes 8 - Parsing and Analysis CS 525 Notes 8 - Parsing and Analysis 21 22 Correlation - Example Correlation - Example SELECT name, chr � SELECT name � FROM employee AS e, � FROM (SELECT max(salary) maxsal � � extrChar(e.name) AS c(chr) � � FROM employee) AS m, � result � (SELECT name � � FROM employee x � name chr Joe J � WHERE x.salary = m.salary) AS e � employee Joe o name dep employee Joe e result Joe IT Jim J name salary Jim Marketing name Jim i Joe 20,000 Jim … … … … Jim 30,000 … … CS 525 Notes 8 - Parsing and Analysis 23 CS 525 Notes 8 - Parsing and Analysis 24 4
� WHERE clause WHERE clause examples • A condition WHERE R.a = 3 � -comparison between attribute and constant – Attribute references WHERE (R.a > 5) AND (R.a < 10) � – Constants -range query using boolean AND – Operators (boolean) WHERE R.a = S.b � – Functions -comparison between two attributes – Nested subquery expressions WHERE (R.a * 2) > (S.b – 3) � -using operators • Result has to be boolean CS 525 Notes 8 - Parsing and Analysis 25 CS 525 Notes 8 - Parsing and Analysis 26 Nested Subqueries Nested Subqueries Semantics • Nesting a query within an expression • For each tuple produced by the FROM clause execute the subquery • Correlation allowed – If correlated attributes replace them with – Access FROM clause attributes tuple values • Different types of nesting – Scalar subquery – Existential quantification – Universal quantification CS 525 Notes 8 - Parsing and Analysis CS 525 Notes 8 - Parsing and Analysis 27 28 Scalar subquery Existential Quantification • Subquery that returns one result tuple • <expr> IN <subquery> � – How to check? – Evaluates to true if <expr> equals at least one of the results of the subquery – -> Runtime error SELECT * � SELECT * � FROM R � FROM users � WHERE R.a = (SELECT count(*) FROM S) � WHERE name IN (SELECT name FROM � blacklist) � CS 525 Notes 8 - Parsing and Analysis 29 CS 525 Notes 8 - Parsing and Analysis 30 5
Existential Quantification Existential Quantification • EXISTS <subquery> � • <expr> <op> ANY <subquery> � – Evaluates to true if <subquery> returns at – Evaluates to true if <expr> <op> <tuple> least one tuple evaluates to true for at least one result tuple – Op is any comparison operator: =, <, >, … SELECT * � FROM users u � SELECT * � WHERE EXISTS (SELECT * FROM � FROM users � WHERE name = ANY (SELECT name FROM � blacklist � WHERE b.name = u.name) � blacklist) � CS 525 Notes 8 - Parsing and Analysis 31 CS 525 Notes 8 - Parsing and Analysis 32 Universal Quantification Nested Subqueries Example • <expr> <op> ALL <subquery> � SELECT dep,name � FROM employee e � – Evaluates to true if <expr> <op> <tuple> WHERE salary >= ALL (SELECT salary � evaluates to true for all result tuples FROM employee d � employee – Op is any comparison operator: =, <, >, … � WHERE e.dep = d.dep) � name dep salary SELECT * � Joe IT 2000 result FROM nation � Jim IT 300 dep Name WHERE nname = ALL (SELECT nation FROM � Bob HR 100 IT Joe blacklist) � Alice HR 10000 HR Alice Patrice HR 10000 HR Patrice CS 525 Notes 8 - Parsing and Analysis CS 525 Notes 8 - Parsing and Analysis 33 34 GROUP BY clause GROUP BY restrictions • A list of expressions • If group-by is used then – Same as WHERE – SELECT clause can only use group by – No restriction to boolean expressions or aggregation functions – DBMS has to know how to compare = for data type • Results are grouped by values of the expressions • -> usually used for aggregation CS 525 Notes 8 - Parsing and Analysis 35 CS 525 Notes 8 - Parsing and Analysis 36 6
Recommend
More recommend