Quiz time! What’s wrong with this schema? Database design III Courses(code, period, name, teacher) code � name code, period � teacher Functional dependencies cont. {(’TDA356’, 2, ’Databases’, ’Niklas Broberg’), BCNF and 3NF (’TDA356’, 4, ’Databases’, ’Rogardt Heldal’)} MVDs and 4NF Redundancy! Using FDs to detect anomalies Decomposition Courses(code, period, name, teacher) • Whenever X � A holds for a relation R, code � name code, period � teacher but X is not a key for R, then values of A • Fix the problem by decomposing Courses: will be redundantly repeated! – Create one relation with the attributes from the offending FD, in this case code and name . Courses(code, period, name, teacher) – Keep the original relation, but remove all attributes from the RHS of the FD. Insert a reference from the LHS in this relation, to the {(’TDA356’, 2, ’Databases’, ’Niklas Broberg’), key in the first. (’TDA356’, 4, ’Databases’, ’Rogardt Heldal’)} Courses(code, name) code � name code, period � teacher CoursePeriods(code, period, teacher) What? code -> Courses.code Quiz: What kind of anomaly could this relational schema lead to? Boyce-Codd Normal Form BCNF violations • We say that an FD X � A violates BCNF with • A relation R is in Boyce-Codd Normal respect to relation R if X � A holds on R, but X Form (BCNF) if, whenever a nontrivial FD is not a superkey of R. X � A holds on R, X is a superkey of R. – Remember: nontrivial means A is not part of X Example: – Remember: a superkey is any superset of a key (including the keys themselves). Courses(code, period, name, teacher) code � name violates BCNF Courses(code, name) CoursePeriods(code, period, teacher) code, period � teacher does not. 1
BCNF normalization Quiz! • Algorithm: Given a relation R and FDs F. Decompose Courses into BCNF. 1. Identify new FDs using the transitive rule, and add these to F. Courses(code, period, name, teacher) 2. Look among the FDs in F for a violation code � name X � A of BCNF w.r.t. R. code, period � teacher 3. Decompose R into two relations {code} + = {code, name} One relation RX containing all the attributes in X + . – The original relation R, except the values in X + that are not – Courses(code, name) also in X (i.e. R – X + + X), and with a reference from X to X CoursePeriods(course, period, teacher) in RX. course -> Courses.code 4. Repeat from 2 for the two new relations until there are no more violations. No BCNF violations left, so we’re done! Quiz again! Recovery • We must be able to recover the original data after Why not use BCNF decomposition for designing decomposition. database schemas? Why go via E-R code per name teacher diagrams? TDA357 2 Databases Niklas Broberg – Decomposition doesn’t handle all situations TDA357 4 Databases Rogardt Heldal gracefully. E.g. • Self-relationships code per teacher code name • Many-to-one vs. many-to-”exactly one” + TDA357 2 Niklas Broberg TDA357 Databases • Subclasses TDA357 4 Rogardt Heldal • Single-attribute entities – E-R diagrams are graphical, hence easier to sell code per name teacher than some mathematical formulae. TDA357 2 Databases Niklas Broberg TDA357 4 Databases Rogardt Heldal ”Lossy join” Lossless join Let’s try to split on non-existent code � teacher • Only if we decompose on proper dependencies can we guarantee that no code per name teacher TDA357 2 Databases Niklas Broberg facts are lost. TDA357 4 Databases Rogardt Heldal – Schemas from proper translation of E-R diagrams get this ”for free”. code teacher code per name + TDA357 Niklas Broberg TDA357 2 Databases – The BCNF decomposition algorithm TDA357 Rogardt Heldal TDA357 4 Databases guarantees lossless join. code per name teacher • A decompositon that does not give TDA357 2 Databases Niklas Broberg lossless join is bad. What? TDA357 4 Databases Niklas Broberg TDA357 2 Databases Rogardt Heldal TDA357 4 Databases Rogardt Heldal 2
Teaches(teacher, course) Example of BCNF decomposition: course -> Courses.code CoursePeriods(period, teacher) CoursePeriods(course, period, teacher) teacher -> Teaches.teacher course -> Courses.code Two keys: course, period � teacher {course, period} teacher � course {teacher, period} Violation! teacher course per teacher Decompose: Niklas Broberg TDA357 2 Niklas Broberg Rogardt Heldal TDA357 2 Rogardt Heldal Teaches(teacher, course) course -> Courses.code course per teacher CoursePeriods(period, teacher) TDA357 2 Niklas Broberg teacher -> Teaches.teacher TDA357 2 Rogardt Heldal Quiz: What just went wrong? course, period � teacher ?? Problem with BCNF Third Normal Form (3NF) • Some structures cause problems for • 3NF is a weakening of BCNF that handles decomposition. this situation. – AB � C, C � B – An attribute is prime in relation R if it is a – Decomposing w.r.t. C � B gives two relations, member of any key of R. containing {C,B} and {A,C} respectively. This means – Non-trivial X � A violates BCNF for R if X is we can no longer enforce AB � C! not a superkey of R. – Intuitively, the cause of the problem is that we must split the LHS of AB � C over two different relations. – Non-trivial X � A violates 3NF for R if X is not • Not quite the full truth, but good enough. a superkey or R, and A is not prime in R. Third Normal Form (3NF) Different algorithm for 3NF • Given a relation R and a set of FDs F: – Compute the minimal basis of F. ” A nonkey field must provide a fact about the • Minimal basis means F, except remove A � C if you have A � B and B � C. key, the whole key and nothing but the – Group together FDs with the same LHS. key, so help me Codd ” – For each group, create a relation with the LHS as the key. Edgar F. (Ted) Codd was the inventor of the – If no relation contains a key of R, add one relational data model. relation containing only a key of R. 3
Example: Courses(code, period, name, teacher) Earlier example revisited: code � name Two keys: CoursePeriods(course, period, teacher) code, period � teacher {code, period} course -> Courses.code teacher � code {teacher, period} Two keys: course, period � teacher {course, period} teacher � course {teacher, period} Decompose: Courses(code, name) Since all attributes are members of some key, i.e. CoursePeriods(course, period, teacher) all attributes are prime, there are no 3NF course -> Courses.code violations. Hence CoursePeriods is in 3NF. teacher -> Teaches.name Teaches(name, course) course -> Courses.code Quiz: What’s the problem now then? CoursePeriods contains a key for the original Courses relation, so we have finished. Almost? 3NF vs BCNF Example: • Three important properties of Courses(code, name, room, teacher) decomposition: code � name code room teacher 1. Recovery (loss-less join) TDA357 VR Niklas Broberg 2. No redundancy TDA357 VR Rogardt Heldal code name 3. Dependency preservation TDA357 HC1 Niklas Broberg TDA357 Databases • 3NF guarantees 1 and 3, but not 2. TDA357 HC1 Rogardt Heldal • BCNF guarantees 1 and (almost) 2, but These two relations are in BCNF, but there’s lots of not 3. redundancy! Quiz: Why? Let’s start from the bottom… Compare with E/R name code room code teacher LecturesIn(code, teacher, room) TDA357 HC1 TDA357 Niklas Broberg code -> Courses.code Teacher room -> Rooms.name TDA357 VR TDA357 Rogardt Heldal teacher -> Teachers.name code name Course LecturesIn Room code room teacher TDA357 VR Niklas Broberg name TDA357 VR Rogardt Heldal LecturesIn(code, room) TDA357 HC1 Niklas Broberg code -> Courses.code Gives Teacher room -> Rooms.name TDA357 HC1 Rogardt Heldal name Gives(code, teacher) code -> Courses.code • No redundancy before join the two independent relations Course Room teacher -> Teachers.name LecturesIn • The two starting relations are what we really want to have code 4
Recommend
More recommend