review past year question review past year question
play

Review (Past year question) Review (Past year question) Consider - PDF document

Review (Past year question) Review (Past year question) Consider the relations R1(A,B,C), R2(C,D,E) and R3(E,F), with Consider the relations R1(A,B,C), R2(C,D,E) and R3(E,F), primary keys A, C, E respectively. Assume that R1 has 10000


  1. Review (Past year question) Review (Past year question) • Consider the relations R1(A,B,C), R2(C,D,E) and R3(E,F), with • Consider the relations R1(A,B,C), R2(C,D,E) and R3(E,F), primary keys A, C, E respectively. Assume that R1 has 10000 tuples, with primary keys A, C, E respectively. Assume that R1 has R2 has 15000 tuples and R3 has 7500 tuples. For simplicity, assume 10000 tuples, R2 has 15000 tuples and R3 has 7500 tuples. For that all tuples (including the query result) have the same size, and that simplicity, assume that all tuples (including the query result) each page can contain 10 tuples of R. Consider the query: R1 JOIN R2 have the same size, and that each page can contain 10 tuples of JOIN R3. Assume that all attributes are of the same size, and any join R. Consider the query: R1 JOIN R2 JOIN R3. Assume that all output will include all attributes of all relations. Further, assume attrib tes are of the same si e and an join o tp t attributes are of the same size, and any join output will include ill incl de records do not span pages. Assuming the data are uniformly d d i h d if l all attributes of all relations. Further, assume records do not distributed, estimate the result size of the query. span pages. Assuming the data are uniformly distributed, • List all possible plans assuming only left-deep search space is estimate the result size of the query. considered (assuming only one join method). You may assume that cross product are to be avoided. • Number of tuples = 10000 • Compute the cost for each of the above plans you listed to determine • Number of pages = 10000/10 WRONG!! the optimal plan. For simplicity, you may assume that only the nested- • Number of attributes per page = 30; block join is supported, the buffer size is 100 pages, and all • Number of result tuples per page = 30/8 = 3 intermediate results are to be stored in secondary storage. • Number of resultant pages = 10000/3 Review (Past year question) Review (Past year question) • for each plan, compute the cost of each join. there are two • Assume Left Deep Tree plans and one join method. points to note: (a) remember to include the cost to write In total, there are 6 possible plans, but since cross out intermediate results, (b) the number of tuples per page products are not permitted, we end up with 4 plans may be different for each intermediate results. • (R1 JOIN R2) JOIN R3 ( ) • (R2 JOIN R1) JOIN R3 Cost of Plan P1 = Cost (R1 JOIN R2) + I2 Plan P1 Cost (I1 JOIN R3) • (R2 JOIN R3) JOIN R1 • (R3 JOIN R2) JOIN R1 Size (I1) = 10000 tuples; 10000/5 pages I1 R3 Cost (R1 JOIN R2) = 10000/10 + 1000/98*(15000/15)+ 10000/5 R1 R2 Cost(I1 JOIN R3) = join cost + cost to output I2 Structure of a DBMS Query Optimization and Execution Relational Operators Transaction Management Overview g These layers must consider These layers must consider Fil Files and Access Methods d A M th d concurrency Buffer Management control and recovery There are three side effects of acid. (Transaction, Lock, Disk Space Management Enhanced long term memory, Recovery Managers) decreased short term memory, and I forget the third. DB - Timothy Leary CS5208 – Concurrency Control 5 CS5208 – Concurrency Control 6 1

  2. Transactions ACID properties of Transaction Executions • Transaction (“xact”)- DBMS’s abstract view of a user program (or activity): • A sequence of reads and writes of database objects, e.g., a transaction that • A tomicity: All actions in the Xact happen, or none transfers $100 from account A to account B can be expressed as: happen. • Read Account A; • C onsistency: If each Xact is consistent, and the DB • Write Updated Account A ($100 less); • • Read Account B; Read Account B; starts consistent, it ends up consistent. • Write Updated Account B ($100 more); • I solation: Execution of one Xact is isolated from that • Unit of work that must commit or abort as an atomic unit • Transaction Manager controls the execution of transactions. of other Xacts. • User’s program logic is invisible to DBMS! • D urability: If a Xact commits, its effects persist. • Arbitrary computation possible on data fetched from the DB • The DBMS only sees data read/written from/to the DB. CS5208 – Concurrency Control 7 CS5208 – Concurrency Control 8 Atomicity and Durability Transaction Consistency A.C.I.D. A.C.I.D. • A transaction ends in one of two ways: • Transactions preserve DB consistency • commit after completing all its actions • Given a consistent DB state, produce another consistent DB • “commit” is a contract with the caller of the DB state • abort (or be aborted by the DBMS) after executing some actions. • DB Consistency expressed as a set of declarative • Or system crash while the xact is in progress; treat as abort. Integrity Constraints • Two important properties for a transaction: Two important properties for a transaction: • CREATE TABLE/ASSERTION statements • CREATE TABLE/ASSERTION statements • Atomicity : Either execute all its actions, or none of them • E.g. Each CS186 student can only register in one project group. Each group must have 2 students. • Durability : The effects of a committed xact must survive failures. • Application-level • DBMS ensures the above by logging all actions • E.g. Bank account total of each customer must stay the same during a (Recovery): “transfer” from savings to checking account • Undo the actions of aborted/failed transactions. • Transactions that violate ICs are aborted • Redo actions of committed transactions not yet propagated to disk • That’s all the DBMS can automatically check! when system crashes. CS5208 – Concurrency Control 9 CS5208 – Concurrency Control 10 Concurrency Control & Recovery Isolation (Concurrency) A.C.I.D. • Concurrency Control • DBMS interleaves actions of many xacts concurrently • Provide correct and highly available data access in the presence of • Actions = reads/writes of DB objects concurrent access by many users • DBMS ensures xacts do not “step onto” one another. • Recovery • Each xact executes as if it were running by itself • Each xact executes as if it were running by itself. • Ensures database is fault tolerant, and not corrupted by software, • Concurrent accesses have no effect on a Transaction’s behavior system or media failure • Net effect must be identical to executing all transactions for some • 24x7 access to mission critical data serial order. • A boon to application authors! • Users & programmers think about transactions in isolation • Existence of CC&R allows applications to be written without • Without considering effects of other concurrent transactions! explicit concern for concurrency and fault tolerance CS5208 – Concurrency Control 11 CS5208 – Concurrency Control 12 2

  3. Concurrency Control T1 T2 … Tn Concurrency Control Improves latency and latency and throughput DB (consistency Smile, it is the key that fits the constraints) lock of everybody's heart. Anthony J. D'Angelo, The College Blue Book CS5208 – Concurrency Control 13 CS5208 – Concurrency Control 14 Schedule A: Serial Schedule Example: A B T1 T2 25 25 Read(A); A  A+100 T1: Read(A) T2: Read(A) A  A+100 A  A  2 Write(A); 125 Read(B); B  B+100; Write(A) Write(A) Write(B); ( ); 125 125 Read(B) Read(B) Read(A);A  A  2; B  B+100 B  B  2 Write(A); 250 Read(B);B  B  2; Write(B) Write(B) Write(B); 250 Constraint: A=B 250 250 CS5208 – Concurrency Control 15 CS5208 – Concurrency Control 16 Schedule B Schedule C A B A B T1 T2 25 25 T1 T2 25 25 Read(A); A  A+100 Read(A); A  A+100 Write(A); 125 Write(A); 125 Read(A);A  A  2; Read(A);A  A  2; Write(A); i (A) 250 2 0 Write(A); 250 Read(B); B  B+100; Read(B);B  B  2; Write(B); 125 Write(B); 50 Read(B);B  B  2; Read(B); B  B+100; Write(B); 250 Write(B); 150 250 250 250 150 CS5208 – Concurrency Control 17 CS5208 – Concurrency Control 18 3

  4. Same as Schedule C Schedule D What are good schedules? but with new T2’ A B T1 T2’ 25 25 • Want schedules that are “good”, regardless of Read(A); A  A+100 Write(A); 125 Read(A);A  A  1; Write(A); 125 • Only look at order of read and writes Read(B);B  B  1; Write(B); 25 Read(B); B  B+100; Example: Write(B); 125 Sb=r 1 (A)w 1 (A)r 2 (A)w 2 (A)r 1 (B)w 1 (B)r 2 (B)w 2 (B) 125 125 CS5208 – Concurrency Control 19 CS5208 – Concurrency Control 20 What are good schedules? Example: Sb= r1(A)w1(A)r2(A)w2(A)r1(B)w1(B)r2(B)w2(B) • Want schedules that are “good”, regardless of • initial state and T 1 T 2 • transaction semantics T 1 T 2 • Only look at order of read and writes Sb’=r 1 (A)w 1 (A) r 1 (B)w 1 (B)r 2 (A)w 2 (A)r 2 (B)w 2 (B) Example: T 1 T 2 Sb=r 1 (A)w 1 (A)r 2 (A)w 2 (A)r 1 (B)w 1 (B)r 2 (B)w 2 (B) no cycles  Sb is “equivalent” to a serial schedule (in this case T 1 ,T 2 ) CS5208 – Concurrency Control 21 CS5208 – Concurrency Control 22 Example (Cont) Example (Cont) Sd= r 1 (A)w 1 (A)r 2 (A)w 2 (A) r 2 (B)w 2 (B)r 1 (B)w 1 (B) Sd= r 1 (A)w 1 (A)r 2 (A)w 2 (A) r 2 (B)w 2 (B)r 1 (B)w 1 (B) CS5208 – Concurrency Control 23 CS5208 – Concurrency Control 24 4

Recommend


More recommend