relational model gaps between data and db design
play

Relational Model Gaps between Data and DB Design Conceptually, - PDF document

Relational Model Gaps between Data and DB Design Conceptually, what is a relational database? How to rewrite a query so that it can be evaluated correctly and efficiently? How can an ER design be reduced to a relational


  1. Relational Model

  2. Gaps between Data and DB Design • Conceptually, what is a relational database? – How to rewrite a query so that it can be evaluated correctly and efficiently? • How can an ER design be reduced to a relational implementation? – In ER design, we have entity sets and relationship sets – In a relational database, we have only tables – How to fill the gap? CMPT 354: Database I -- Relational Model 2

  3. Example of a Relation Attribute Tuple/record CMPT 354: Database I -- Relational Model 3

  4. Relation as a Math Structure • Formally, given sets D 1 , D 2 , …. D n , a relation r is a subset of D 1 x D 2 x … x D n – A relation is a set of n-tuples {(a 1 , a 2 , …, a n )} where each a i ∈ D i customer_name = {Jones, Smith, Curry, Lindsay} customer_street = {Main, North, Park} customer_city = {Harrison, Rye, Pittsfield} r = { (Jones, Main, Harrison), (Smith, North, Rye), (Curry, North, Rye), (Lindsay, Park, Pittsfield) } is a relation over customer_name x customer_street x customer_city CMPT 354: Database I -- Relational Model 4

  5. Attribute Types • Each attribute of a relation has a name • The set of allowed values for each attribute is called the domain of the attribute • Attribute values are (normally) required to be atomic; that is, indivisible – Multivalued attribute values and composite attribute values are not atomic • The special value null is a member of every domain – We will ignore the effect of null values in our main presentation and consider their effect later CMPT 354: Database I -- Relational Model 5

  6. Relation Schema • A 1 , A 2 , …, A n are attributes, R = (A 1 , A 2 , …, A n ) is a relation schema – Customer_schema = (customer_name, customer_street, customer_city) • r(R) is a relation on the relation schema R – customer (Customer_schema) CMPT 354: Database I -- Relational Model 6

  7. Relation Instance • The current values (relation instance) of a relation are specified by a table • An element t of r is a tuple, represented by a row in a table attributes (or columns) customer_name customer_street customer_city Jones Main Harrison tuples Smith North Rye (or rows) Curry North Rye Lindsay Park Pittsfield customer CMPT 354: Database I -- Relational Model 7

  8. Relations are Unordered • Order of tuples is irrelevant (tuples may be stored in an arbitrary order) – account relation with unordered tuples CMPT 354: Database I -- Relational Model 8

  9. Relational Databases • A database consists of multiple relations • Information about an enterprise is broken down into parts, with each relation storing one piece of the information – account: stores information about accounts – depositor: stores information about which customer owns which account – customer: stores information about customers CMPT 354: Database I -- Relational Model 9

  10. Universal Table • Storing all information as a single relation such as bank(account_number, balance, customer_name, …) • Advantages: an easy starting point • Disadvantages: – Repetition/redundancy of information (e.g., if two customers share an account, the balance is repeated in the two tuples) – The need for null values (e.g., to represent a customer without an account) CMPT 354: Database I -- Relational Model 10

  11. Relation Examples CMPT 354: Database I -- Relational Model 11

  12. Keys • Let K ⊆ R. K is a superkey of R if values for K are sufficient to identify a unique tuple of each possible relation r(R) – {customer_name, customer_street} and {customer_name} are both superkeys of Customer, if no two customers can possibly have the same name • K is a candidate key if K is minimal – {customer_name} is a candidate key for Customer • Primary Key: a candidate key chosen by the database designer as the principal means of identifying tuples within a relation CMPT 354: Database I -- Relational Model 12

  13. Schema Diagram • Foreigh key: reference to the primary key of another table CMPT 354: Database I -- Relational Model 13

  14. Query Languages • Languages in which users request information from the database • Categories of languages – Procedural: specifying how to find the answers – Non-procedural, or declarative: only specifying what should be the answers, but not how to find them • “Pure” languages: form underlying basis of query languages that commercial systems use – Relational algebra (procedural) – Tuple relational calculus (non-procedural) – Domain relational calculus (non-procedural) CMPT 354: Database I -- Relational Model 14

  15. Relational Algebra • A procedural language • Six basic operators – select: σ – project: ∏ – union: ∪ – set difference: – – Cartesian product: x – rename: ρ • The operators take one or two relations as inputs and produce a new relation as the result CMPT 354: Database I -- Relational Model 15

  16. Select Operation – Example • σ A=B ^ D > 5 (r) A B C D A B C D α α 1 7 α α α β 1 7 5 7 β β β β 23 10 12 3 β β 23 10 CMPT 354: Database I -- Relational Model 16

  17. Select Operation – Definition • σ p (r), p is called the selection predicate – σ p (r) = {t | t ∈ r and p(t)} – p is a formula in propositional calculus consisting of terms connected by : ∧ (and), ∨ (or), ¬ (not) – Each term is one of: <attribute> op <attribute> or <constant>, where op is one of: =, ≠ , >, ≥ . <. ≤ • Example: σ branch_name=“Perryridge” (account) CMPT 354: Database I -- Relational Model 17

  18. Project Operation – Example • ∏ A,C ( r ) A B C A C A C α α α 1 10 1 1 β α α = 1 20 1 1 β β β 2 30 1 1 β β 40 2 2 CMPT 354: Database I -- Relational Model 18

  19. Project Operation – Definition ∏ • , where A1, A2 are attribute names ( r ) K A , A , , A 1 2 k and r is a relation name – The result is defined as the relation of k columns obtained by erasing the columns that are not listed – Duplicate rows removed from result, since relations are sets • Example: To eliminate the branch_name attribute of account – ∏ account_number, balance (account) CMPT 354: Database I -- Relational Model 19

  20. Union Operation – Example A B A B A B α α α 1 1 2 α α β 2 2 3 β β 1 1 s r β 3 r ∪ s CMPT 354: Database I -- Relational Model 20

  21. Union Operation – Definition • r ∪ s = {t | t ∈ r or t ∈ s} – r and s must have the same arity (same number of attributes) – The attribute domains must be compatible (e.g., the 2nd column of r deals with the same type of values as does the 2nd column of s) • Example: to find all customers with either an account or a loan – ∏ customer_name (depositor) ∪ ∏ customer_name (borrower) CMPT 354: Database I -- Relational Model 21

  22. Set Difference Operation – Example A B A B A B α α α 1 2 1 α β β 2 3 1 r – s β 1 s r CMPT 354: Database I -- Relational Model 22

  23. Set Difference Operation – Definition • r – s = {t | t ∈ r and t ∉ s} • Set differences must be taken between compatible relations – r and s must have the same arity – Attribute domains of r and s must be compatible CMPT 354: Database I -- Relational Model 23

  24. Cartesian-Product Operation – Example A B C D E A B C D E α α α α 1 10 a 10 a 1 α β β 1 10 a 10 a β 2 α β β 1 20 b 20 b r α γ γ 1 10 b 10 b β α 2 10 a s β β 2 10 a β β 2 20 b β γ 2 10 b r x s CMPT 354: Database I -- Relational Model 24

  25. Cartesian-Product Operation – Definition • r x s = {t q | t ∈ r and q ∈ s} – Attributes of r(R) and s(S) are disjoint. (That is, R ∩ S = ∅ ) • If attributes of r(R) and s(S) are not disjoint, then renaming must be used CMPT 354: Database I -- Relational Model 25

  26. Summary • Relational model – Representing data in relations • Relational algebra – mathematical foundation for SQL • Basic operations in relational algebra CMPT 354: Database I -- Relational Model 26

  27. To-Do List • Can you translate the relational algebra examples into SQL? What can you observe? CMPT 354: Database I -- Relational Model 27

Recommend


More recommend