
Integrity checking for combined databases, by Davide Martinenghi



  1. Integrity checking for combined databases
     Davide Martinenghi
     CoLogNET Workshop on Logic-Based Methods for Information Integration - 23 August 2003
     Computer Science, building 42.1, Roskilde University, Universitetsvej 1, P.O. Box 260, DK-4000 Roskilde, Denmark
     Phone: +45 4674 2000, Fax: +45 4674 3072, www.dat.ruc.dk

  2. Content
     • Description of the problem
     • A simplification framework
     • GaV and LaV mappings
     • Application to data integration
     • Examples

  3. Description of the problem
     • Integrity constraints (ICs) are properties of the DB that must always hold
     • Integrity must be checked with respect to the ICs after every update (typically it is tested in an ad hoc way at the application level)
     • The same holds in a data integration system
     • Idea: generate specialized versions of the ICs to be executed automatically
       • for expected kinds of updates
       • assuming integrity before the update
     • Generalize this technique to data integration systems

  4. A simplification framework
     1. Produce a weakest precondition: a condition about the updated state that can be checked in the present state.
        Ex: $\varphi = \leftarrow p(x)$, $U = p(a)$, $\mathrm{After}^U(\varphi) = \leftarrow (p(x) \lor x = a)$
     2. Use the fact that $\varphi$ was known to hold before the update (conditional weakest precondition, CWP).
     3. Take the weakest CWP.
     DEF: $\mathrm{Simp}^U(\varphi) = \mathrm{Weaken}_{\varphi}(\mathrm{After}^U(\varphi))$

  5. A simplification framework - Example
     $\varphi = \leftarrow m(x, y) \land m(x, z) \land y \neq z$, checked by posing it as a query against the DB and expecting an empty answer
     $U = m(\mathit{Bob}, \mathit{Alice})$
     $\mathrm{Simp}^U(\varphi) = \leftarrow m(\mathit{Bob}, y) \land y \neq \mathit{Alice}$
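The example on this slide can be tried out on a toy database. The following is a minimal Python sketch, assuming the relation m is held in memory as a set of pairs; the function names and the sample tuples are hypothetical and not from the talk. It compares the full check, run on the updated state, with the simplified check, run on the present state.

```python
def violates_phi(m):
    """Full check of phi: is there an x with m(x, y), m(x, z) and y != z?"""
    return any(x1 == x2 and y != z for (x1, y) in m for (x2, z) in m)


def violates_simplified(m, new_x, new_y):
    """Simplified check for inserting m(new_x, new_y), evaluated BEFORE the
    update: <- m(new_x, y) and y != new_y."""
    return any(x == new_x and y != new_y for (x, y) in m)


def checks_agree(m, update):
    """Compare both checks; m is assumed to satisfy phi before the update."""
    assert not violates_phi(m), "the simplification assumes a consistent state"
    full = violates_phi(m | {update})               # updated state
    simplified = violates_simplified(m, *update)    # present state
    return full == simplified


if __name__ == "__main__":
    # Hypothetical consistent states (not from the slides).
    for state in [{("Carl", "Dana")}, {("Bob", "Eve")}, {("Bob", "Alice")}]:
        print(sorted(state), checks_agree(state, ("Bob", "Alice")))
```

The simplified check touches only the tuples for Bob, and it may give a different answer from the full check if the present state already violates the constraint, which is why the sketch asserts consistency first.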

  6. Mappings
     • Mapping = a way to associate n local DBs with a global DB
     • GaV mapping = the global DB is expressed as a set of views over the local sources
     • LaV mapping = the local DBs are expressed as a set of views over the global DB
     • We assume:
       • sound mappings (the views produce only correct information, but not necessarily all of it)
       • no existential quantifier in LaV mappings
       • ⇒ LaV mappings can be rewritten as GaV mappings without skolemization

  7. Mappings - example
     LaV mapping $L$:
       $m_1(x, y) \rightarrow m(x, y) \land n(x, \mathit{it})$
       $m_2(x, y) \rightarrow m(x, y) \land n(x, \mathit{dk})$
     GaV mapping $M_L$:
       $m(x, y) \leftarrow m_1(x, y)$
       $m(x, y) \leftarrow m_2(x, y)$
       $n(x, y) \leftarrow m_1(x, z) \land y = \mathit{it}$
       $n(x, y) \leftarrow m_2(x, z) \land y = \mathit{dk}$
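To make the GaV reading of this example concrete, here is a minimal Python sketch that materializes the global relations m and n from two local sources by following the four GaV rules. The function name apply_gav and the sample tuples are hypothetical. Because the mappings are only assumed to be sound, the result should be read as the part of the global DB that the sources guarantee, not necessarily all of it.

```python
def apply_gav(m1, m2):
    """Evaluate the four GaV rules of the example over the local relations."""
    m = set(m1) | set(m2)                   # m(x, y) <- m1(x, y);  m(x, y) <- m2(x, y)
    n = {(x, "it") for (x, _z) in m1}       # n(x, y) <- m1(x, z) and y = it
    n |= {(x, "dk") for (x, _z) in m2}      # n(x, y) <- m2(x, z) and y = dk
    return m, n


if __name__ == "__main__":
    # Hypothetical local sources (not from the slides).
    m1 = {("Anna", "Bruno")}
    m2 = {("Mette", "Lars")}
    m, n = apply_gav(m1, m2)
    print(sorted(m))
    print(sorted(n))
```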

  8. Application to data integration
     • $\mathrm{After}^M(\varphi)$ is a weakest precondition, where $M$ is a GaV mapping: a condition about the global DB that can be checked on the local DBs
     • $\mathrm{Simp}^M_{\Delta}(\varphi) = \mathrm{Weaken}_{\Delta}(\mathrm{After}^M(\varphi))$, where $\varphi$ stands for the conditions to check globally and $\Delta$ for the conditions known to hold locally

  9. Example 1
     $\varphi = \leftarrow m(x, y) \land m(x, z) \land y \neq z$
     $\varphi_1 = \leftarrow m_1(x, y) \land m_1(x, z) \land y \neq z$
     $\varphi_2 = \leftarrow m_2(x, y) \land m_2(x, z) \land y \neq z$
     Check:
       $\leftarrow m_1(x, y) \land m_1(x, z) \land y \neq z$
       $\leftarrow m_1(x, y) \land m_2(x, z) \land y \neq z$
       $\leftarrow m_2(x, y) \land m_1(x, z) \land y \neq z$
       $\leftarrow m_2(x, y) \land m_2(x, z) \land y \neq z$
     …

  10. Example 1 (continued)
     With $\varphi$, $\varphi_1$, $\varphi_2$ and the four-denial check as on the previous slide, the first and last denials of the check coincide with $\varphi_1$ and $\varphi_2$, which are known to hold locally. Removing them leaves
     $\mathrm{Simp}^M_{\varphi_1 \land \varphi_2}(\varphi)$:
       $\leftarrow m_1(x, y) \land m_2(x, z) \land y \neq z$
       $\leftarrow m_2(x, y) \land m_1(x, z) \land y \neq z$
     …

  11. Example 1 (continued)
     If we also knew the cross-condition $\varphi_{1,2} = \leftarrow m_1(x, y) \land m_2(x, z)$, the two remaining cross-source denials would be implied by it as well, so $\mathrm{Simp}^M_{\varphi_1 \land \varphi_2 \land \varphi_{1,2}}(\varphi)$ would leave no denial to check.
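The steps of Example 1 can be sketched in Python under strong simplifying assumptions: denials contain only variables, the GaV rules for m merely rename the predicate (as they do here), and Weaken is approximated by dropping every denial that is subsumed by a locally known one. This is an illustration, not the authors' procedure, and all function names are hypothetical; phi, phi1, phi2 and phi12 stand for the slide's ϕ, ϕ1, ϕ2 and ϕ1,2.

```python
from itertools import product


def denial(atoms, neq=()):
    """Denial <- atoms and inequalities; an inequality is an unordered pair."""
    return (tuple(atoms), frozenset(frozenset(pair) for pair in neq))


def variables(d):
    atoms, _ = d
    return sorted({v for _, args in atoms for v in args})


def subsumes(general, specific):
    """True if some substitution of general's variables by specific's variables
    maps every atom and inequality of general onto one of specific."""
    g_atoms, g_neq = general
    s_atoms, s_neq = specific
    g_vars, s_vars = variables(general), variables(specific)
    for image in product(s_vars, repeat=len(g_vars)):
        theta = dict(zip(g_vars, image))
        atoms_ok = all((p, tuple(theta[a] for a in args)) in s_atoms
                       for p, args in g_atoms)
        neq_ok = all(frozenset(theta[v] for v in pair) in s_neq for pair in g_neq)
        if atoms_ok and neq_ok:
            return True
    return False


def unfold(d, definitions):
    """Replace each global predicate by each of its defining source predicates."""
    atoms, neq = d
    options = [definitions.get(p, [p]) for p, _ in atoms]
    return [(tuple((src, args) for src, (_, args) in zip(combo, atoms)), neq)
            for combo in product(*options)]


def simplify(denials, known):
    """Keep only the denials not subsumed by a condition known to hold."""
    return [d for d in denials if not any(subsumes(k, d) for k in known)]


phi = denial([("m", ("x", "y")), ("m", ("x", "z"))], neq=[("y", "z")])
phi1 = denial([("m1", ("x", "y")), ("m1", ("x", "z"))], neq=[("y", "z")])
phi2 = denial([("m2", ("x", "y")), ("m2", ("x", "z"))], neq=[("y", "z")])
phi12 = denial([("m1", ("x", "y")), ("m2", ("x", "z"))])

check = unfold(phi, {"m": ["m1", "m2"]})               # four denials over the sources
remaining = simplify(check, [phi1, phi2])              # the two cross-source denials
nothing_left = simplify(check, [phi1, phi2, phi12])    # empty with the cross-condition
```

The subsumption test enumerates all substitutions, which is fine for denials of this size but would not scale to large constraints.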

  12. Example 2
     $M = \{\, f(i, t, r) \leftarrow m(i, t, y) \land r(i, r) \,\}$
     $\varphi_1$:
       $\leftarrow m(i, t_1, y_1) \land m(i, t_2, y_2) \land t_1 \neq t_2$
       $\leftarrow m(i, t_1, y_1) \land m(i, t_2, y_2) \land y_1 \neq y_2$
     $\varphi_{1,2} = \leftarrow r(i, r) \land \lnot m(i, t, y)$
     $\varphi$:
       $\leftarrow f(i, t_1, r_1) \land f(i, t_2, r_2) \land t_1 \neq t_2$
       $\leftarrow f(i, t_1, r_1) \land f(i, t_2, r_2) \land r_1 \neq r_2$
     Global check:
       $\leftarrow m(i, t_1, y_1) \land r(i, r_1) \land m(i, t_2, y_2) \land r(i, r_2) \land t_1 \neq t_2$
       $\leftarrow m(i, t_1, y_1) \land r(i, r_1) \land m(i, t_2, y_2) \land r(i, r_2) \land r_1 \neq r_2$
     …

  13. Example 2 (continued)
     With $M$, $\varphi_1$, $\varphi_{1,2}$, $\varphi$ and the global check as on the previous slide, the check is simplified to $\mathrm{Simp}^M_{\varphi_1 \land \varphi_{1,2}}(\varphi)$: the first denial of the global check (on $t_1 \neq t_2$) is already implied by the first denial of $\varphi_1$, so only the condition on $r_1 \neq r_2$ still has to be checked.
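For Example 2, the local knowledge suggests that the global check boils down to a key check on r alone: if i is a key of m (ϕ1) and every i in r also occurs in m (ϕ1,2), then f has at most one t and one r per i exactly when r has at most one r per i. This reduced form is an observation made here, not a result quoted from the slides. The Python sketch below tests it by brute force over all small databases on a two-value domain; all function names are hypothetical.

```python
from itertools import chain, combinations, product


def subsets(items):
    """All subsets of a list, as tuples."""
    return chain.from_iterable(combinations(items, k) for k in range(len(items) + 1))


def holds_phi1(m):
    # phi1: in m, the first attribute i determines both t and y.
    return all(t1 == t2 and y1 == y2
               for (i1, t1, y1) in m for (i2, t2, y2) in m if i1 == i2)


def holds_phi12(m, r):
    # phi12: every i occurring in r also occurs in m.
    return all(any(i == j for (j, _t, _y) in m) for (i, _r) in r)


def global_check_ok(m, r):
    # phi evaluated on f, materialized through f(i, t, r) <- m(i, t, y) and r(i, r).
    f = {(i, t, rv) for (i, t, _y) in m for (j, rv) in r if i == j}
    return all(t1 == t2 and r1 == r2
               for (i1, t1, r1) in f for (i2, t2, r2) in f if i1 == i2)


def reduced_check_ok(r):
    # Candidate reduced check: <- r(i, r1) and r(i, r2) and r1 != r2.
    return all(r1 == r2 for (i1, r1) in r for (i2, r2) in r if i1 == i2)


def reduced_check_agrees():
    """Compare both checks on every pair of sources satisfying phi1 and phi12."""
    m_tuples = list(product([0, 1], repeat=3))   # candidate (i, t, y) tuples
    r_tuples = list(product([0, 1], repeat=2))   # candidate (i, r) tuples
    for m in map(set, subsets(m_tuples)):
        if not holds_phi1(m):
            continue
        for r in map(set, subsets(r_tuples)):
            if holds_phi12(m, r) and global_check_ok(m, r) != reduced_check_ok(r):
                return False
    return True


if __name__ == "__main__":
    print(reduced_check_agrees())
```

Exhaustive testing on a tiny domain is only a sanity check and not a proof. The cross-condition matters: without it, a duplicate key in r for an i that is missing from m would fail the reduced check while f stays consistent.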

  14. Summary
     • Express the data integration in terms of a GaV mapping
     • Reformulate the condition to check in terms of the sources by calculating a weakest precondition with respect to the mapping
     • Remove from it all conditions known to hold locally (plus possible cross-conditions)

  15. References
     1. H. Christiansen, D. Martinenghi. Simplification of integrity constraints for data integration. Submitted to FoIKS '04.
     2. H. Christiansen, D. Martinenghi. Simplification of database integrity constraints revisited: A transformational approach. Accepted for presentation at LOPSTR '03.
     http://www.dat.ruc.dk/~dm/publications
