relational algebra
play

Relational Algebra Murali Mani What is Relational Algebra? Defines - PDF document

Relational Algebra Murali Mani What is Relational Algebra? Defines operations (data retrieval) for relational model SQLs DML (Data Manipulation Language) has data retrieval facilities, which are equivalent to that of relational


  1. Relational Algebra Murali Mani What is Relational Algebra? � Defines operations (data retrieval) for relational model � SQL’s DML (Data Manipulation Language) has data retrieval facilities, which are equivalent to that of relational algebra. � SQL and relational algebra are not for complex operations; they support efficient, easy access of large data sets. Murali Mani 1

  2. Basics � Relational Algebra is defined on bags , rather than relations (sets). � Bag or multiset allows duplicate values; but order is not significant. � We can write an expression using relational algebra operators with parentheses � Need closure – an operator on bag returns a bag. � Relational algebra includes set operators, and other operators specific to relational model. Murali Mani Set Operators � Union, Intersection, Difference, cross product � Union, Intersection and Difference are defined only for union compatible relations. � Two relations are union compatible if they have the same set of attributes and the types (domains) of the attributes are the same. � Eg of two relations that are not union compatible: � Student (sNumber, sName) � Course (cNumber, cName) Murali Mani 2

  3. Union: ∪ � Consider two bags R 1 and R 2 that are union- compatible. Suppose a tuple t appears in R 1 m times, and in R 2 n times. Then in the union, t appears m + n times. R 1 ∪ R 2 R 1 R 2 A B 1 2 A B A B 1 2 1 2 1 2 1 2 3 4 3 4 3 4 1 2 5 6 3 4 5 6 Murali Mani Intersection: ∩ � Consider two bags R 1 and R 2 that are union- compatible. Suppose a tuple t appears in R 1 m times, and in R 2 n times. Then in the intersection, t appears min ( m, n) times. R 1 R 2 R 1 ∩ R 2 A B A B A B 1 2 1 2 1 2 3 4 3 4 3 4 1 2 5 6 Murali Mani 3

  4. Difference: - � Consider two bags R 1 and R 2 that are union- compatible. Suppose a tuple t appears in R 1 m times, and in R 2 n times. Then in R 1 – R 2 , t appears max (0, m - n) times. R 1 R 2 R 1 – R 2 A B A B A B 1 2 1 2 1 2 3 4 3 4 1 2 5 6 Murali Mani Bag semantics vs Set semantics � Union is idempotent for sets: (R1 ∪ R2) ∪ R2 = R1 ∪ R2 � Union is not idempotent for bags. � Intersection is idempotent for sets and bags. � Difference is idempotent for sets, but not for bags. � For sets and bags, R 1 ∩ R 2 = R 1 – (R 1 – R 2 ) . Murali Mani 4

  5. Cross Product (Cartesian Product): X � Consider two bags R 1 and R 2 . Suppose a tuple t 1 appears in R 1 m times, and a tuple t 2 appears in R 2 n times. Then in R 1 X R 2 , t 1 t 2 appears mn times. R 1 X R 2 R 1 R 2 A R 1 .B R 2 .B C 1 2 2 3 A B B C 1 2 2 3 1 2 2 3 1 2 4 5 1 2 4 5 1 2 4 5 4 5 1 2 4 5 1 2 4 5 Murali Mani Basic Relational Operations � Select, Project, Join � Select: denoted σ C (R) : selects the subset of tuples of R that satisfies selection condition C. C can be any boolean expression, its clauses can be combined with AND, OR, NOT. σ (C ≥ 6) (R) R A B C A B C 1 2 5 3 4 6 3 4 6 1 2 7 1 2 7 1 2 7 1 2 7 Murali Mani 5

  6. Select � Select is commutative: σ C2 ( σ C1 (R)) = σ C1 ( σ C2 (R)) � Select is idempotent: σ C ( σ C (R)) = σ C (R) � We can combine multiple select conditions into one condition. σ C1 ( σ C2 (… σ Cn (R)…)) = σ C1 AND C2 AND … Cn (R) Murali Mani Project: π A1, A2, …, An (R) � Consider relation (bag) R with set of attributes A R . π A1, A2, …, An (R), where A1, A2, …, An ∈ A R returns the tuples in R, but only with columns A1, A2, …, An. π A, B (R) R A B C A B 1 2 5 1 2 3 4 6 3 4 1 2 7 1 2 1 2 8 1 2 Murali Mani 6

  7. Project: Bag Semantics vs Set Semantics � For bags, the cardinality of R = cardinality of π A1, A2, …, An (R). � For sets, cardinality of R ≥ cardinality of π A1,A2, …, An (R). � For sets and bags � project is not commutative � project is idempotent Murali Mani Natural Join: R ⋈ S � Consider relations (bags) R with attributes A R , and S with attributes A S . Let A = A R ∩ A S . R ⋈ S can be defined as π A R – A, A, A S - A ( σ R.A1 = S.A1 AND R.A2 =S.A2 AND … R.An=S.An (R X S)) where A = {A1, A2, …, An} The above expression says: select those tuples in R X S that agree in values for each of the A attributes, and project the resulting tuples such that we have only one value for each A attribute. Murali Mani 7

  8. Natural Join example R 1 R 2 R 1 ⋈ R 2 A B B C A B C 1 2 2 3 1 2 3 1 2 4 5 1 2 3 4 5 Murali Mani Theta Join: R ⋈ C S � Theta Join is similar to cartesian product, except that we can specify any condition C. It is defined as ⋈ R C S = ( σ C (R X S)) ⋈ R 1 R1.B<R2.B R 2 R 1 R 2 A R 1 .B R 2 .B C 1 2 4 5 A B B C 1 2 4 5 1 2 2 3 1 2 4 5 1 2 4 5 1 2 4 5 4 5 Murali Mani 8

  9. Outer Join: R ⋈ o S � Similar to natural join, however, if there is a tuple in R, that has no “matching” tuple in S, or a tuple in S that has no matching tuple in R, then that tuple also appears, with null values for attributes in S (or R). R 1 ⋈ o R 2 R 1 R 2 A B C D 1 2 3 10 A B C B C D 1 2 3 11 1 2 3 2 3 10 4 5 6 null 4 5 6 2 3 11 7 8 9 null 7 8 9 6 7 12 null 6 7 12 Murali Mani Left Outer Join: R ⋈ o L S � Similar to natural join, however, if there is a tuple in R, that has no “matching” tuple in S, then that tuple also appears, with null values for attributes in S (note: a tuple in S that has no matching tuple in R does not appear). R 1 ⋈ o L R 2 R 1 R 2 A B C D A B C B C D 1 2 3 10 1 2 3 2 3 10 1 2 3 11 4 5 6 2 3 11 4 5 6 null 7 8 9 6 7 12 7 8 9 null Murali Mani 9

  10. Right Outer Join: R ⋈ o R S � Similar to natural join, however, if there is a tuple in S, that has no “matching” tuple in R, then that tuple also appears, with null values for attributes in R (note: a tuple in R that has no matching tuple in S does not appear). R 1 ⋈ o R R 2 R 1 R 2 A B C D A B C B C D 1 2 3 10 1 2 3 2 3 10 1 2 3 11 4 5 6 2 3 11 null 6 7 12 7 8 9 6 7 12 Murali Mani Renaming: ρ S(A1, A2, …, An) (R) � Rename relation R to S, attributes of R are renamed to A1, A2, …, An � ρ S (R) renames relation R to S, keeping the attributes same. ρ S(X, C, D) (R 2 ) ρ S (R 2 ) R 2 S S B C D X C D B C D 2 3 10 2 3 10 2 3 10 2 3 11 2 3 11 2 3 11 6 7 12 6 7 12 6 7 12 Murali Mani 10

  11. Example: Introducing new relations Find the semijoin of 2 relations R, S. Semijoin denoted R ⋉ S is defined as the tuples in R, such that for a tuple t1 in R, if there exists a tuple t2 in S, and t1 and t2 agree in all attributes common to R and S, then t1 appears in the result. R1 = R ⋈ S R2 = π A R (R1) R ⋉ S = R2 ⋂ R Murali Mani Duplicate Elimination: δ (R) � Convert a bag to a set. δ (R) R A B A B 1 2 1 2 3 4 3 4 1 2 1 2 Murali Mani 11

  12. Extended Projection: π L (R) � Here L can be � An attribute (just like simple projection) � An expression x → y, where x and y are names of attributes, this renames attribute x to y. � An expression E → z, where E is any expression involving attributes, constants, and arithmetic and string operators. This has an attribute called z whose values are given by E. π B → A, C+D → X, C, D (R) R A X C D B C D 2 13 3 10 2 3 10 2 14 3 11 2 3 11 6 19 7 12 6 7 12 Murali Mani Aggregation operators � MIN, MAX, COUNT, SUM, AVG � AGG B (R) considers only non-null values of R. SUM B (R) MIN B (R) COUNT B (R) R SUM B (R) MIN B (R) COUNT B (R) A B 9 2 3 1 2 3 4 AVG B (R) COUNT * (R) 1 null MAX B (R) 1 3 AVG B (R) COUNT * (R) MAX B (R) 3 4 4 Murali Mani 12

  13. Aggregation Operators � MIN, MAX, SUM, AVG must be on any 1 attribute. COUNT can be on any 1 attribute or COUNT * (R) � An aggregation operator returns a bag, not a single value ! But SQL allows treatment as a single value. σ B=MAXB (R) (R) A B 3 4 Murali Mani Grouping Operator: γ GL, AL (R) � γ GL, AL (R) groups all attributes in GL, and performs the aggregation specified in AL. γ starName, MIN (year) → year, COUNT(title) → num (StarsIn) starName year num HF 77 3 StarsIn KR 94 2 title year starName SW1 77 HF Matrix 99 KR 6D&7N 93 HF SW2 79 HF Speed 94 KR Murali Mani 13

  14. Sorting Operator: τ L (R) � It sorts the tuples in R. If L is list A1, A2, …, An, it first sorts by A1, then by A2, and so on. � Sort is used as a last operator in an expression. τ A,B (R) R A B C A B C 1 2 5 1 2 5 3 1 6 1 2 7 1 2 7 1 3 8 1 3 8 3 1 6 Murali Mani Relational Algebra Operators � Set Operators: Union, Intersection, Difference, Cartesian Product � Select, Project � Join: Natural Join, Theta Join, (Left/Right) Outer Join � Renaming, Duplicate Elimination � Aggregation: MIN, MAX, COUNT, SUM, AVG � Grouping, Sorting Murali Mani 14

Recommend


More recommend