0872c6500a7c5e0cc3b42c485fc6ff30.ppt
- Количество слайдов: 40
CS 319: Theory of Databases Dr. A. I. Cristea http: //www. dcs. warwick. ac. uk/~acristea/
… previous Armstrong axioms 2
Content 1. 2. 3. 4. 5. 6. 7. 8. 9. 10. Generalities DB Integrity constraints (FD revisited) LLJ, DP and applications Relational Algebra (revisited) Query optimisation Temporal Data The Askew Wall Tuple calculus Domain calculus Query equivalence 3
Lossless Join Decomposition • Lossless Join Definition: – Let { R 1 , R 2 } be a decomposition of R (meaning that R 1 R 2 = R); the decomposition is lossless if for every legal instance r of R: r = R 1(r) R 2(r) • What is wrong with the following decomposition? – R = {A, B, C} and F = { A B, C B} and we replace R by { R 1 , R 2 } where R 1 = {A, B} and R 2 = {C, B}. 4
Sufficient Condition for Lossless Join • Lossless Join means: – Let { R 1 , R 2 } be a decomposition of R (meaning that R 1 R 2 = R); • Prove that for all legal instances r: r R 1(r) R 2(r) • Prove that this decomposition is lossless if R 1 R 2 R 1 or R 1 R 2 – Can you give an example of a lossless join decomposition (instance) when neither R 1 R 2 R 1 nor R 1 R 2 hold? 5
Boyce-Codd Normal Form (BCNF) • A relation scheme R is in BCNF if (and only if) for every non-trivial fd X Y F+, X is a superkey (for R). • A database scheme D = {R 1, . . . , Rn} is in BCNF if (and only if) i {1, . . . , n}: Ri is in BCNF. • Let R = {A, B, C} and F = { A B, C B} and let us decompose R into by { R 1 , R 2 } where R 1= {A, B} and R 2 = {C, B}. Is this decomposition in BCNF? Is this the “best” decomposition in BCNF? (Can you find a better one? ) 6
BCNF Decomposition Algorithm result : = {R}; done : = false; compute F+; while (not done) do if (there is a schema Ri in result that is not in BCNF) then begin let α β be a nontrivial functional dependency that holds on Ri such that α Ri is not in F+, and α β= ; result : = (result – Ri) (Ri – β) (α, β); end else done: = true; 7
Dependencies in a decomposition • Which dependencies hold in R 1 and R 2? – R = {A, B, C} and F = { A B, B C} and we replace R by { R 1 , R 2 } where R 1 = {A, B} and R 2 = {B, C}. – R = {A, B, C} and F = { A B, C B} and we replace R by { R 1 , R 2 } where R 1 = {A, B} and R 2 = {A, C}. – R = {A, B, C} and F = { A B, B C} and we replace R by { R 1 , R 2 } where R 1 = {A, B} and R 2 = {A, C} 8
Third Normal Form (3 NF) • Third Normal Form – Informal Presentation – Example and Discussion – Formal Definition • 3 NF Decomposition Algorithm – Principle and Properties • Lossless-join, dependency-preserving decomposition into 3 NF – Proof of Correctness – Example of 3 NF Decomposition • Third Normal Form and Boyce-Codd Normal Form 9
Informal Presentation • Motivation – There are some situations where • BCNF decomposition is not dependency preserving, and • Efficient checking for FD violation on updates is important – Solution • Define a weaker normal form, called Third Normal Form – FDs can be checked on individual relations without computing a join – There is always a lossless-join, dependency-preserving decomposition into 3 NF 10
Informal Presentation • Motivation – Sometimes a relational schema and its FDs are not in BCNF but one does not want to decompose it further Bookings(title, theater, city) – Example: theater city • Relation Bookings with attributes: – title, the name of the performance title city theater – theater, the name of theater where the performance is being shown – city, where theater is located • FDs are: theater city, title city theater • Is there a BCNF violation? Yes: (theater city) because theater is not a superkey Note: keys here are: (title, city) and (theater, title) BCNF Decomposition: Bookings 1(theater, city) Bookings 2(title, theater) 11
Informal Presentation • Motivation Bookings(title, theater, city) theater city title city theater – Decomposition to get to BCNF may not always be desirable • BCNF decomposition is not dependency preserving, and • Efficient checking for FD violation on updates is important – 3 NF relaxes BCNF to allow relations that cannot be decomposed into BCNF relations without losing ability to check each FD • Informal Definition of 3 NF – A relation R is in third normal form if: As for BCNF Whenever A B is a nontrivial FD: either A is a superkey or B is a member of some candidate key 12
Informal Presentation • Informal Definition of 3 NF – A relation R is in third normal form if: Whenever A B is a nontrivial FD: either A is a superkey or B is a member of some candidate key • The difference between BCNF and 3 NF: – “B is a member of some candidate key” – Previous example schema is in 3 NF • Candidate keys here are: (title, city) and (theater, title) • Theater is not a superkey but city is a member of a candidate key Bookings(title, theater, city) • What is the problem with this schema? theater city title city theater 13
Informal Presentation • Informal Definition of 3 NF – Previous example schema is in 3 NF • What is the problem with this schema? – The schema contains redundant information Title Theater City Cats Broadway New York Phantom of the Opera Imperial London Cats New Theater London Beethoven’s 5 th Symphony Imperial London Bookings(title, theater, city) theater city title city theater 14
Formal Definition 3 NF • Definition – A relation schema R is in third normal form (3 NF) if BCNF Conditions • for all functional dependencies in F+ of the form , where R and R, at least one of the following holds: is a trivial functional dependency ( ) contains a key for R – every B is part of some candidate key of R • BCNF and 3 NF – A BCNF relation is in 3 NF – A 3 NF relation is not necessary in BCNF 15
Formal Presentation • Example – Consider the two relational schemas • R 1 = (cust-num, name, house-num, street, city, state) cust-num name, house-num, street, city, state • R 2 = (house-num, street, city, state, zip) house-num, street, city, state zip state – Are these relations in 3 NF? 16
Formal Presentation • Example in 3 NF? – For R 1= (cust-num, name, house-num, street, city, state) cust-num name, house-num, street, city, state • The only nontrivial functional dependencies in F+ are those with cust-num as a member of the left-side of the FD • As cust-num is a superkey of R 1, these functional dependencies satisfy the secondition for 3 NF Three conditions for 3 NF: • is a trivial functional dependency ( ) • contains a key for R • Every B is part of some candidate key of R 17
Formal Presentation • Example in 3 NF? – For R 2 Three conditions for 3 NF: • is a trivial functional dependency • contains a key for R • Every B is part of some candidate key of R R 2 = (house-num, street, city, state, zip) house-num, street, city, state zip state • There are two kinds of nontrivial functional dependencies in F+: – Those with (house-num, street, city, state) as a subset of the left hand side of the FD: As (house-num, street, city, state) is a superkey for R 2, these functional dependencies satisfy the secondition for 3 NF – Those of the form {zip} {state} where For any such functional dependency: ( {state}) – ( {zip}) = {state} (or = ) Because state is part of a candidate key of R 2, such functional dependencies satisfy the third condition for 3 NF 18
Decomposition into 3 NF • Principles – Input/Output • Input – A set of functional dependencies F – A relation schema R • Output – A lossless-join, dependency-preserving decomposition in 3 NF – Canonical Cover • The set of dependencies Fc in the algorithm is a canonical cover of the functional dependencies 19
Fc definition a canonical cover Fc for F is a set of dependencies Fc for which: • Fc <=> F • no fd in Fc is superfluous • no fd in Fc contains extraneous attrs • each left side of fd in Fc is unique 20
Extraneous attribute A in α→β in R • A α; F => F – {α→β} {(α-A)→β} • A β; F – {α→β} {α→(β -A)} => F • Computed via attribute closures 21
Fc computation algorithm Fc = F Repeat apply union rule (right side of fd) find fd with extraneous attrs (left/right side) & delete these Until Fc doesn’t change 22
Decomposition into 3 NF • Principles – The algorithm takes a set of dependencies and adds one schema at a time, instead of decomposing the initial schema repeatedly – The result is not uniquely defined since • A set of functional dependencies can have more than one canonical cover • In some cases, the result of the algorithm depends on the order which it considers the dependencies in Fc (minor bug in the algorithm, see later) 23
Decomposition into 3 NF • Decomposition – Given: relation R, set F of functional dependencies – Find: decomposition of R into a set of 3 NF relation Ri – Algorithm (sketch, real algorithm on next slides): (1) Eliminate redundant fd, resulting in a canonical cover Fc of F (2) Create a relation Ri = XY for each FD X Y in Fc (3) If the key K of R does not occur in any relation Ri, create one more relation Ri=K – Decomposition produces a lossless join and preserves dependencies – Prove ! 24
Decomposition Algorithm into 3 NF Let Fc be the canonical cover of F; j = 0; for each dependency α β in Fc if none of schemes in Ri (i=1, 2, …, j) contains αβ then j = j+1; Rj = αβ; end-if if any of the schemes in Ri (i=1, 2, …, j-1) is contained in Rj remove Ri end-if end-for if none of the schemes Ri (i=1, 2, …, j) contains a candidate key for R then j = j + 1; Rj = any candidate key for R; end-if return (R 1, R 2, …, Rj) 25
Decomposition into 3 NF • Example – Semester database of a university R=(L, I, T, R, S, G) – Relational schema R=(L, I, T, R, S, G) L I, TR L, TI R, LS G, TS R, TRI LR – Attributes • L: Lecture R: Room • I: Instructor S: Student • G: Grade T: Time – Functional Dependencies • L I, TR L, TI R, LS G, TS R, TRI LR 26
Decomposition into 3 NF • Example – R=(L, I, T, R, S, G) – F: {L I, TR L, TI R, LS G, TS R, TRI LR} – Are all FDs necessary? No ! • TR L, TI R then TRI LR (1) Eliminate redundant FD, resulting in a canonical cover Fc of F – Canonical cover of F • Fc= {L I, TR L, TI R, TS R, LS G} – Key: (ST) – Key attributes: S, T 27
Decomposition into 3 NF • Example – R = (L, I, T, R, S, G) – Fc = {L I, TR L, TI R, TS R, LS G} (2) Create a relation Ri = XA for – Key attributes: S, T each FD X A in Fc – Decomposition in 3 NF • R 1 = (L, I) R 2 = (T, R, L) • R 3 = (T, I, R) R 4 = (L, S, G) • R 5 = (S, T, R) (3) If the key K of R does not occur in any relation Ri, create one more relation Ri=K, but it does. 28
Decomposition into 3 NF • 3 NF Decomposition Algorithm – Proof of Correctness 3 NF decomposition algorithm is lossless join, dependency preserving decomposition into 3 NF 1. Dependency preserving 2. Lossless join 3. 3 NF 29
Proof: Decomposition into 3 NF is dependency preserving • 3 NF Decomposition Algorithm – Decomposition is dependency preserving • 3 NF decomposition algorithm is dependency preserving since there is a relation for every FD in Fc. 30
Proof: Decomposition into 3 NF is a lossless join • 3 NF Decomposition Algorithm – Decomposition is lossless join • Lossless join decomposition – A decomposition {R 1, R 2} is a lossless-join decomposition if R 1 R 2 R 1 or R 1 R 2 • Idea: – A candidate key (K) is in one of the relations Ri in decomposition (last step of algorithm guarantees this) – Closure of candidate key under Fc must contain all attributes in R (definition of candidate key) – Follow the steps of attribute closure algorithm (Fig. 7. 9) to show that the sufficient lossless join condition is satisfied for K+. 31
Proof: Decomposition into 3 NF is actually 3 NF! • 3 NF Decomposition Algorithm – Decomposition into 3 NF • Claim – If a relation Ri is in the decomposition generated by the synthesis algorithm, then Ri is in 3 NF • Idea – To test for 3 NF, it is sufficient to consider the functional dependencies whose right-hand side is a single attribute – Therefore to see that Ri is in 3 NF, we must show that any functional dependency B that holds in Ri, satisfies the definition of 3 NF 32
Proof: Decomposition into 3 NF is actually 3 NF! • 3 NF Decomposition Algorithm – Decomposition into 3 NF • Demonstration – Assume is the dependency that generated Ri in the algorithm – B must be in or , since B is in Ri and generated Ri – Let us consider two possible cases » B is in but not 33
Proof: 3 NF Decomposition is 3 NF! Three conditions for 3 NF: • is a trivial functional dependency • contains a key for R • Every B is part of some candidate key of R • 3 NF Decomposition Algorithm – Decomposition into 3 NF • Demonstration – B is in but not in : must be superkey (why? ) » The secondition of 3 NF is satisfied – B is in but not in is a candidate key » The third alternative in the definition of 3 NF is satisfied » Note: we cannot show that is a superkey. This shows exactly why the third alternative is present in the definition of 3 NF 34
Decomposition into 3 NF B: FD in R : FD that was used to generated Ri • B is in Assume is not a superkey must contain some attribute not in 1. Since B is in F+ it must be derivable from Fc, by using attribute closure on 2. Attribute closure cannot have used if it had been used, must be contained in the attribute closure of , which is not possible, since we assumed is not a superkey 3. Now, using ( - {B}) and B, we can derive B (since , and B since B is non-trivial) 4. Then, B is extraneous in the right-hand side of ; which is not possible since is in Fc (contradiction!) 5. Thus, if B is in then must be a superkey 35
Comparison of BCNF and 3 NF • BCNF or 3 NF? – Relations in BCNF and 3 NF • Relations in BCNF: no repetition of information • Relations in 3 NF: problem of repetition of information – Decomposition in BCNF and in 3 NF • It is always possible to decompose a relation into relations in 3 NF and – the decomposition is lossless – dependencies are preserved • It is always possible to decompose a relation into relations in BCNF and – the decomposition is lossless – the information is not repeated 36
Compare BCNF and 3 NF • To summarize – Design Goals • Goal for a relational database design is: – BCNF (no redundant information) – Lossless join – Dependency preservation • If we cannot achieve this, we accept: – 3 NF (possible repetition of information) – Lossless join – Dependency preservation 37
Summary • We have learned: – LLJ – DP – BCNF + algorithm – 3 rd NF + algorithm 38
… to follow Relational Algebra, revisited 39
Questions? 40
0872c6500a7c5e0cc3b42c485fc6ff30.ppt