Outline • Functional dependencies (3.4) Lecture 09: • Rules about FDs (3.5) • Design of a Relational schema (3.6) Friday, October 18, 2002 1 2 Functional Dependencies Formal definition of a key • A key is a set of attributes A 1 , ..., A n s.t. for Definition: A 1 , ..., A m � B 1 , ..., B n holds in R if: any other attribute B, A 1 , ..., A n � B ∀ t, t’ ∈ R, (t.A 1 =t’.A 1 ∧ ... ∧ t.A m =t’.A m � t.B 1 =t’.B 1 ∧ ... ∧ t.B m =t’.B m ) R • A minimal key is a set of attributes which A1 ... Am B1 ... Bm is a key and for which no subset is a key t t’ • Note: book calls them superkey and key 3 4 if t, t’ agree here then t, t’ agree here Examples of Keys Finding the Keys of a Relation • Product(name, price, category, color) Given a relation constructed from an E/R diagram, what is its key? name, category � price Rules: category � color 1. If the relation comes from an entity set, the key of the relation is the set of attributes which is the Keys are: {name, category} and all supersets key of the entity set. • Enrollment(student, address, course, room, time) student � address Person( address, name, ssn ) room, time � course Person student, course � room, time name ssn address Keys are: [in class] 5 6 1
Finding the Keys Finding the Keys Rules: Except: if there is an arrow from the relationship to E, then 2. If the relation comes from a many-many relationship, we don’t need the key of E as part of the relation key. the key of the relation is the set of all attribute keys in the relations corresponding to the entity sets sname Product name name Purchase Store Person buys Product price name ssn card-no CreditCard Person ssn date Purchase (name , sname, ssn, card-no) buys (name, ssn, date) 7 8 Expressing Dependencies Finding the Keys Say: “the CreditCard determines the Person” More rules in the book – please read ! sname Product name Purchase Store Incomplete card-no (what does CreditCard Person ssn it say ?) Purchase (name , sname, ssn, card-no) card-no � name 9 10 Inference Rules for FD’s Inference Rules for FD’s (continued) Trivial Rule A , A , … A A A , A , … A B , B , … B Splitting rule 1 2 n 1 2 m i 1 2 n and where i = 1, 2, ..., n Is equivalent to Combing rule B A , A , … A 1 1 2 n B A1 ... Am B1 ... Bm A , A , … A 2 1 2 n A1 ... Am … Why ? B A , A , … A m 1 2 n 11 12 2
Inference Rules for FD’s (continued) Transitive Closure Rule A1 ... Am B1 ... Bm C 1 ... C p If A , A , … A B , B …, B 1 2 n 1 2 m and B , B , … B C , C …, C 1 2 m 1 2 p A , A , … A C , C …, C then 1 2 n 1 2 p Why ? 13 14 Closure of a set of Attributes Given a set of attributes { A1, …, An} and a set of dependencies S. • Enrollment(student, major, course, room, time) Problem : find all attributes B such that: student � major any relation which satisfies S also satisfies: major, course � room A1, …, An B course � time The closure of {A1, …, An}, denoted {A1, …, An} , + is the set of all such attributes B What else can we infer ? [in class] 15 16 Closure Algorithm Example R(A,B,C,D,E,F) A B C Start with X={A1, …, An}. A D E B D Repeat until X doesn’t change do : A F B if is in S, and C B , B , … B 1 2 n Closure of {A,B}: X = {A, B, } are all in X, and B , B , … B n 1 2 C is not in X Closure of {A, F}: X = {A, F, } then add C to X. 17 18 3
Relational Schema Design Why Is the Algorithm Correct ? (or Logical Design) • Show the following by induction: Main idea: – For every B in X : • Start with some relational schema • A1, …, An B • Initially X = {A1, …, An} -- holds • Find out its FD’s • Induction step: B1, …, Bm in X • Use them to design a better relational – Implies A1, …, An B1, …, Bm schema – We also have B1, …, Bm C – By transitivity we have A1, …, An C • This shows that the algorithm is sound ; need to show it is complete 19 20 Relational Schema Design Relational Schema Design (or Logical Design) Recall set attributes (persons with several phones): When a database is poorly designed we get Name SSN PhoneNumber City anomalies: Fred 123-45-6789 206-555-1234 Seattle Fred 123-45-6789 206-555-6543 Seattle • Redundancy: data is repeated Joe 987-65-4321 908-555-2121 Westfield Joe 987-65-4321 908-555-1234 Westfield • Updated anomalies: need to change in SSN � Name, City, but not SSN � PhoneNumber several places Anomalies: • Delete anomalies: may lose data when we • Redundancy = repeat data don’t want • Update anomalies = Fred moves to “Bellvue” • Deletion anomalies = Fred drops all phone numbers: what is his city ? 21 22 Relation Decomposition Relational Schema Design Break the relation into two: name Conceptual Model: Person Product buys Name SSN City price name ssn Fred 123-45-6789 Seattle Joe 987-65-4321 Westfield Relational Model: SSN PhoneNumber plus FD’s 123-45-6789 206-555-1234 123-45-6789 206-555-6543 987-65-4321 908-555-2121 987-65-4321 908-555-1234 Normalization: Eliminates anomalies 23 24 4
Decompositions in General Incorrect Decomposition R(A 1 , ..., A n ) • Sometimes it is incorrect: Create two relations R1(B1, ..., Bm) and R2(C1, ..., Cp) Name Price Category Gizmo 19.99 Gadget such that: B1, ..., Bm ∪ C1, ..., Cp = A1, ..., An OneClick 24.99 Camera DoubleClick 29.99 Camera and: R 1 = projection of R on B 1 , ..., B m R 2 = projection of R on C 1 , ..., C p Decompose on : Name, Category and Price, Category 25 26 Incorrect Decomposition Normal Forms First Normal Form = all attributes are atomic Name Category Price Category Second Normal Form (2NF) = old and obsolete Gizmo Gadget 19.99 Gadget OneClick Camera 24.99 Camera Third Normal Form (3NF) = this lecture DoubleClick Camera 29.99 Camera Name Price Category Boyce Codd Normal Form (BCNF) = this lecture Gizmo 19.99 Gadget When we put it back: OneClick 24.99 Camera OneClick 29.99 Camera Others... Cannot recover information DoubleClick 24.99 Camera DoubleClick 29.99 Camera 27 28 Boyce-Codd Normal Form Example A simple condition for removing anomalies from relations: Name SSN PhoneNumber City Fred 123-45-6789 206-555-1234 Seattle A relation R is in BCNF if: A relation R is in BCNF if: Fred 123-45-6789 206-555-6543 Seattle Joe 987-65-4321 908-555-2121 Westfield Joe 987-65-4321 908-555-1234 Westfield Whenever there is a nontrivial dependency A 1 , ..., A n � B Whenever there is a nontrivial dependency A 1 , ..., A n � B in R , {A 1 , ..., A n } is a key for R in R , {A 1 , ..., A n } is a key for R What are the dependencies? SSN � Name, City In English (though a bit vague): What are the keys? {Name, SSN, PhoneNumber} Whenever a set of attributes of R is determining another attribute, Is it in BCNF? should determine all the attributes of R . 29 30 5
Summary of BCNF Decompose it into BCNF Decomposition Find a dependency that violates the BCNF condition: Name SSN City SSN � Name, City A , A , … A B , B , … B Fred 123-45-6789 Seattle 1 2 m 1 2 n Joe 987-65-4321 Westfield Heuristics: choose B , B , … B “as large as possible” 1 2 m SSN PhoneNumber Decompose: 123-45-6789 206-555-1234 123-45-6789 206-555-6543 Continue until 987-65-4321 908-555-2121 Others A’s B’s there are no Is there a 987-65-4321 908-555-1234 BCNF violations 2-attribute left. relation that is R1 R2 31 not in BCNF ? 32 Example Decomposition Other Example Person(name, SSN, age, hairColor, phoneNumber) SSN � name, age • R(A,B,C,D) A B, B C age � hairColor Decompose in BCNF (in class): • Key: A, D • Violations of BCNF: A B, A C, A BC Step 1: find all keys • Pick A BC: split into R1(A,BC) R2(A,D) • What happens if we pick A B first ? Step 2: now decompose 33 34 Correct Decompositions Correct Decompositions A decomposition is lossless if we can recover: • Given R(A,B,C) s.t. A � B, the R(A,B,C) decomposition into R1(A,B), R2(A,C) is Decompose lossless R1(A,B) R2(A,C) Recover R’(A,B,C) should be the same as R(A,B,C) R’ is in general larger than R. Must ensure R’ = R 35 36 6
Recommend
More recommend