review case where index is useful
play

Review: Case where index is useful CS5208: Query Optimization 2 1 - PDF document

Query Optimization in Relational Database Systems It is safer to accept any chance that offers itself, and extemporize a procedure to fit it, than to get a g good plan matured, and wait p , for a chance of using it. Thomas Hardy (1874) in


  1. Query Optimization in Relational Database Systems It is safer to accept any chance that offers itself, and extemporize a procedure to fit it, than to get a g good plan matured, and wait p , for a chance of using it. Thomas Hardy (1874) in Far from the Madding Crowd CS5208: Query Optimization 1 Review: Case where index is useful CS5208: Query Optimization 2 1

  2. Query Optimization • Since each relational op returns a relation, ops can be composed ! p • Queries that require multiple ops to be composed may be composed in different ways - thus optimization is necessary for good performance, e.g. A B C D can be evaluated as follows: • (((A B) C) D) • ((A B) (C D)) • ((B A) (D C)) • … CS5208: Query Optimization 3 Query Optimization • Each strategy can be represented as a query evaluation plan (QEP) - Tree of R.A. ops, with choice of algo for each op. of algo for each op SM HJ NL D NL HJ INL C A B A B C D • Goal of optimization: To find the “best” plan that compute the same answer (to avoid “bad” plans) CS5208: Query Optimization 4 2

  3. More on Motivating Examples Sailors ( sid : integer sname : string rating : integer age : real) Sailors ( sid : integer, sname : string, rating : integer, age : real) Reserves ( sid : integer, bid : integer, day : dates, rname : string) • Reserves: • Each tuple is 40 bytes long, 100 tuples per page, 1000 pages. • Sailors: • Each tuple is 50 bytes long, 80 tuples per page, 500 pages. CS5208: Query Optimization 5 SELECT S.sname FROM Reserves R, Sailors S Example WHERE R.sid=S.sid AND R.bid=100 AND S.rating>5 sname sname sname rating > 5 rating > 5 bid=100 sid=sid sid=sid rating > 5 bid=100 bid=100 Sailors sid=sid Reserves Sailors Reserves Reserves Sailors CS5208: Query Optimization 3

  4. SELECT S.sname FROM Reserves R, Sailors S Example WHERE R.sid=S.sid AND R.bid=100 AND S.rating>5 sname lubber sname sid sname rating age bid day rname rating > 5 bid=100 31 lubber 8 55.5 100 10/11/96 lubber sid sname rating age bid day rname 31 31 l bb lubber 8 8 55 5 55.5 100 100 10/11/96 10/11/96 lubber l bb sid=sid 58 rusty 10 35.0 103 11/12/96 dustin sid sname rating age sid bid day rname Reserves Sailors 22 dustin 7 45.0 31 100 10/11/96 lubber 28 yuppy 9 35.0 58 103 11/12/96 dustin 31 lubber 8 55.5 44 guppy 5 35.0 CS5208: Query Optimization 58 rusty 10 35.0 SELECT S.sname FROM Reserves R, Sailors S Example WHERE R.sid=S.sid AND R.bid=100 AND S.rating>5 sname lubber sname sid sname rating age bid day rname 31 lubber 8 55.5 100 10/11/96 lubber rating > 5 sid sname rating age bid day rname sid=sid 31 lubber 8 55.5 100 10/11/96 lubber bid=100 Sailors sid bid day rname sid sname rating age 22 dustin 7 45.0 31 100 10/11/96 lubber 28 yuppy 9 35.0 Reserves 31 lubber 8 55.5 sid bid day rname 44 guppy 5 35.0 31 100 10/11/96 lubber 58 rusty 10 35.0 58 103 11/12/96 dustin CS5208: Query Optimization 4

  5. SELECT S.sname FROM Reserves R, Sailors S Example WHERE R.sid=S.sid AND R.bid=100 AND S.rating>5 sname sname lubber sid sname rating age bid day rname sname 31 lubber 8 55.5 100 10/11/96 lubber sid bid day rname sid sname rating age 22 dustin 7 45.0 31 100 10/11/96 lubber sid=sid 28 yuppy 9 35.0 31 31 lubber lubber 8 8 55.5 55 5 58 rusty 10 35.0 rating > 5 bid=100 sid sname rating age Reserves Sailors 22 dustin 7 45.0 sid bid day rname 28 yuppy 9 35.0 31 lubber 8 55.5 31 100 10/11/96 lubber 44 guppy 5 35.0 58 103 11/12/96 dustin 58 rusty 10 35.0 CS5208: Query Optimization SELECT S.sname FROM Reserves R, Sailors S Example (Cont) WHERE R.sid=S.sid AND R.bid=100 AND S.rating>5 Query Evaluation Plan: • Cost? C t? (On-the-fly) sname (On-the-fly) rating > 5 bid=100 (Page Nested Loops) sid=sid Reserves Sailors CS5208: Query Optimization 10 5

  6. SELECT S.sname FROM Reserves R, Sailors S Example (Cont) WHERE R.sid=S.sid AND R.bid=100 AND S.rating>5 Query Evaluation Plan: • Cost: 500+500*1000 I/Os C t 500 500*1000 I/O (On-the-fly) sname (On-the-fly) rating > 5 bid=100 (Page Nested Loops) sid=sid Reserves Sailors CS5208: Query Optimization 11 SELECT S.sname FROM Reserves R, Sailors S Example (Cont) WHERE R.sid=S.sid AND R.bid=100 AND S.rating>5 Query Evaluation Plan: • Cost: 500+500*1000 I/Os C t 500 500*1000 I/O (On-the-fly) sname • Memory? (On-the-fly) rating > 5 bid=100 (Page Nested Loops) sid=sid Reserves Sailors CS5208: Query Optimization 12 6

  7. SELECT S.sname FROM Reserves R, Sailors S Example (Cont) WHERE R.sid=S.sid AND R.bid=100 AND S.rating>5 Query Evaluation Plan: • Cost: 500+500*1000 I/Os C t 500 500*1000 I/O (On-the-fly) sname • Memory: 3 (On-the-fly) rating > 5 bid=100 (Page Nested Loops) sid=sid Reserves Sailors CS5208: Query Optimization 13 Alternative Plans 1 (No Indexes) • Main difference: push selections down • Assume 5 buffers T1 = 10 pages (100 boats • Assume 5 buffers, T1 = 10 pages (100 boats, uniform distribution), T2 = 250 pages (10 ratings, sname (On-the-fly) uniform distribution) (Sort-Merge) sid=sid (T1) (T2) rating > 5 bid=100 Reserves Sailors CS5208: Query Optimization 14 7

  8. Alternative Plans 1 (No Indexes) • Main difference: push selections down • With 5 buffers cost of plan: • With 5 buffers, cost of plan: • Scan Reserves (1000) + write temp T1 (10 pages, sname (On-the-fly) if we have 100 boats, uniform distribution). • Scan Sailors (500) + write temp T2 (250 pages, if we have 10 ratings). (Sort-Merge) sid=sid • Sort T1 (2*2*10), sort T2 (2*4*250), merge (10+250) (10+250) (T1) (T2) rating > 5 bid=100 • Total: 4060 page I/Os. Reserves Sailors CS5208: Query Optimization 15 Alternative Plans 2 (With Indexes) • Clustered index on bid of Reserves • 100,000/100 = 1000 tuples on 1000/100 = 10 pages , / p / p g • Hash index on sid. Join column sid is a key for Sailors. sname(On-the-fly) • INL with pipelining (outer is not materialized) • Project out unnecessary fields from outer doesn’t help. (On-the-fly) rating > 5 • At most one matching tuple, unclustered index on sid OK. (INL sid=sid with pipelining ) • Did not push “rating> 5” before the join. Why? p g j y Sailors bid=100 (Use hash index; do not write result to Reserves temp) CS5208: Query Optimization 16 8

  9. Alternative Plans 2 (With Indexes) • Clustered index on bid of Reserves • 100,000/100 = 1000 tuples on 1000/100 = 10 pages , / p / p g • Hash index on sid. Join column sid is a key for Sailors. sname(On-the-fly) • INL with pipelining (outer is not materialized) • Project out unnecessary fields from outer doesn’t help. (On-the-fly) rating > 5 • At most one matching tuple, unclustered index on sid OK. (INL sid=sid with pipelining ) • Decision not to push rating> 5 before the join is p g j based on availability of sid index on Sailors. Sailors • Cost? bid=100 (Use hash index; do not write Reserves result to temp) CS5208: Query Optimization 17 Alternative Plans 2 (With Indexes) • Clustered index on bid of Reserves • 100,000/100 = 1000 tuples on 1000/100 = 10 pages , / p / p g • Hash index on sid. Join column sid is a key for Sailors. sname(On-the-fly) • INL with pipelining (outer is not materialized) • Project out unnecessary fields from outer doesn’t help. (On-the-fly) rating > 5 • At most one matching tuple, unclustered index on sid OK. (INL sid=sid with pipelining ) • Decision not to push rating> 5 before the join is p g j based on availability of sid index on Sailors. Sailors • Cost: Selection of Reserves tuples (10 I/Os); for bid=100 (Use hash each, must get matching Sailors tuple (1000* 2.2); index; do not write total 2210 I/Os. result to Reserves temp) CS5208: Query Optimization 18 9

  10. Plan Execution under the Iterator Model consumer Open() O () C A B CS5208: Query Optimization 19 Plan Execution under the Iterator Model consumer Open() O () Open() Open() C Open() Open() A B CS5208: Query Optimization 20 10

  11. Plan Execution under the Iterator Model consumer GetNext() G N () GetNext() C A B CS5208: Query Optimization 21 Plan Execution under the Iterator Model consumer GetNext() G N () GetNext() C GetNext() A B CS5208: Query Optimization 22 11

  12. Plan Execution under the Iterator Model consumer GetNext() G N () GetNext() C GetNext() t A B CS5208: Query Optimization 23 Plan Execution under the Iterator Model consumer GetNext() G N () GetNext() C GetNext() GetNext() t A B CS5208: Query Optimization 24 12

  13. Plan Execution under the Iterator Model consumer GetNext() G N () GetNext() C GetNext() GetNext() t A B CS5208: Query Optimization 25 Plan Execution under the Iterator Model consumer G N GetNext() () GetNext() C GetNext() GetNext() t A B CS5208: Query Optimization 26 13

  14. Plan Execution under the Iterator Model consumer GetNext() G N () GetNext() C GetNext() GetNext() t A B CS5208: Query Optimization 27 Plan Execution under the Iterator Model consumer G N GetNext() () GetNext() C GetNext() GetNext() t A B CS5208: Query Optimization 28 14

Recommend


More recommend