CoLogNET Workshop on Logic-Based Methods for Information Integration - 23 August 2003
Integrity checking for combined databases
Davide Martinenghi
Computer Science, building 42.1
Roskilde University
Universitetsvej 1, P.O. Box 260
DK-4000 Roskilde, Denmark
Phone: +45 4674 2000
Fax: +45 4674 3072
www.dat.ruc.dk
Content
• Description of the problem
• A simplification framework
• GaV and LaV mappings
• Application to data integration
• Examples
Description of the problem
• Integrity constraints (ICs) are properties of the DB that must always hold
• Integrity must be checked wrt the ICs after every update (typically tested in an ad hoc way at the application level)
• The same holds in a data integration system
• Idea: generate specialized versions of the ICs to be automatically executed
  • for expected kinds of updates
  • assuming integrity before the update
• Generalize this technique to data integration systems
A simplification framework
1. Produce a weakest precondition: a condition about the updated state that can be checked in the present state.
   Ex: ϕ = ← p(x), U = p(a)
   After^U(ϕ) = ← (p(x) ∨ x = a)
2. Use the fact that ϕ was known to hold before the update (conditional weakest precondition, CWP).
3. Take the weakest CWP.
DEF: Simp^U(ϕ) = Weaken_ϕ(After^U(ϕ))
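Step 1 can be sketched in a few lines of Python for the simplest case, the insertion of one ground atom into a denial with only positive database atoms. The tuple encoding of literals and the function name `after_insert` are assumptions made for this illustration and do not come from the talk. The result is a list of denials whose conjunction is equivalent to the single denial with the disjunction shown in the example.

```python
from itertools import product

# Hypothetical encoding: a denial is a tuple of literals, and a literal is a
# tuple such as ("p", "x"), ("=", "x", "a") or ("!=", "y", "z").

def after_insert(denial, pred, consts):
    """Weakest precondition of `denial` wrt inserting the ground atom pred(consts).

    Every positive atom pred(t) becomes "pred(t) or t = consts"; distributing
    the disjunctions yields one denial per combination of choices.
    """
    choices = []
    for lit in denial:
        if lit[0] == pred:
            equalities = tuple(("=", t, c) for t, c in zip(lit[1:], consts))
            choices.append([(lit,), equalities])  # stored atom, or the new tuple
        else:
            choices.append([(lit,)])              # other literals are unchanged
    return [sum(combo, ()) for combo in product(*choices)]

phi = (("p", "x"),)
print(after_insert(phi, "p", ("a",)))
# [(('p', 'x'),), (('=', 'x', 'a'),)]   i.e. { <- p(x),  <- x = a }
```

The sketch does not normalize equalities or drop denials already implied by ϕ; that is the job of the Weaken step.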
A simplification framework - Example
ϕ = ← m(x, y) ∧ m(x, z) ∧ y ≠ z
(checked by posing it as a query against the DB and expecting an empty answer)
U = m(Bob, Alice)
Simp^U(ϕ) = ← m(Bob, y) ∧ y ≠ Alice
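The following Python sketch illustrates what the simplified constraint buys on this example: evaluated in the present state, it gives the same verdict as the full constraint evaluated in the updated state, provided ϕ held before the update. The relation m is modelled as a set of pairs; the function names and the sample tuples are invented for the illustration.

```python
from itertools import product

def violates_full(m):
    """True if  <- m(x, y) ∧ m(x, z) ∧ y ≠ z  has a non-empty answer on m."""
    return any(x1 == x2 and y != z for (x1, y), (x2, z) in product(m, m))

def violates_simplified(m, new_x, new_y):
    """Pre-update test  <- m(new_x, y) ∧ y ≠ new_y,  run on the current state."""
    return any(x == new_x and y != new_y for (x, y) in m)

def verdicts_agree(m, new_x, new_y):
    """Compare the cheap pre-update test with the full post-update check."""
    assert not violates_full(m)  # hypothesis: integrity holds before the update
    updated = m | {(new_x, new_y)}
    return violates_simplified(m, new_x, new_y) == violates_full(updated)

# Invented sample states: the first accepts the update, the second rejects it.
accepting_state = {("Ann", "Carl"), ("Bob", "Alice")}
rejecting_state = {("Ann", "Carl"), ("Bob", "Eve")}

print(verdicts_agree(accepting_state, "Bob", "Alice"))
print(verdicts_agree(rejecting_state, "Bob", "Alice"))
```

The simplified test scans m once for a fixed first argument, while the full check joins m with itself.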
Mappings
• Mapping = a way to associate n local DBs with a global DB
• GaV mapping = the global DB is expressed as a set of views over the local sources
• LaV mapping = the local DBs are expressed as a set of views over the global DB
• We assume:
  • sound mappings (the views produce only correct information, but not necessarily all of it)
  • no existential quantifier in LaV mappings
• ⇒ LaV mappings can be rewritten as GaV mappings without skolemization
Mappings - example
LaV mapping
L = { m1(x, y) → m(x, y) ∧ n(x, it),
      m2(x, y) → m(x, y) ∧ n(x, dk) }
GaV mapping
M_L = { m(x, y) ← m1(x, y),
        m(x, y) ← m2(x, y),
        n(x, y) ← m1(x, z) ∧ y = it,
        n(x, y) ← m2(x, z) ∧ y = dk }
Application to data integration
• After^M(ϕ) is a weakest precondition: a condition about the global DB that can be checked on the local DBs (M is a GaV mapping)
• Simp^M_∆(ϕ) = Weaken_∆(After^M(ϕ))
  • ϕ: conditions to check globally
  • ∆: conditions known to hold locally
Example 1
ϕ = ← m(x, y) ∧ m(x, z) ∧ y ≠ z
ϕ1 = ← m1(x, y) ∧ m1(x, z) ∧ y ≠ z
ϕ2 = ← m2(x, y) ∧ m2(x, z) ∧ y ≠ z
Check: { ← m1(x, y) ∧ m1(x, z) ∧ y ≠ z,
         ← m1(x, y) ∧ m2(x, z) ∧ y ≠ z,
         ← m2(x, y) ∧ m1(x, z) ∧ y ≠ z,
         ← m2(x, y) ∧ m2(x, z) ∧ y ≠ z }
       = Simp^M_{ϕ1 ∧ ϕ2}(ϕ)
…
Example 1
ϕ = ← m(x, y) ∧ m(x, z) ∧ y ≠ z
ϕ1 = ← m1(x, y) ∧ m1(x, z) ∧ y ≠ z
ϕ2 = ← m2(x, y) ∧ m2(x, z) ∧ y ≠ z
Check: { ← m1(x, y) ∧ m1(x, z) ∧ y ≠ z,
         ← m1(x, y) ∧ m2(x, z) ∧ y ≠ z,
         ← m2(x, y) ∧ m1(x, z) ∧ y ≠ z,
         ← m2(x, y) ∧ m2(x, z) ∧ y ≠ z }
       = Simp^M_{ϕ1 ∧ ϕ2 ∧ ϕ1,2}(ϕ)
If we knew ϕ1,2 = ← m1(x, y) ∧ m2(x, z)
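The four denials of the check set can be produced mechanically by unfolding ϕ through the GaV mapping. The Python sketch below covers only the simple case of this example, where each GaV rule renames one source predicate to a global one with the same arguments. The encoding and the name `unfold` are assumptions of the illustration, and the final filtering step drops denials only when they are syntactically identical to a locally known one, which is a much cruder operation than Weaken.

```python
from itertools import product

# Hypothetical encoding: a denial is a tuple of literals (predicate, args...).
# `sources_of` maps a global predicate to the source predicates defining it,
# assuming every GaV rule has the form  g(x, y) <- s(x, y).

def unfold(denial, sources_of):
    """Replace each global atom by each source atom that defines it."""
    choices = []
    for lit in denial:
        pred, args = lit[0], lit[1:]
        if pred in sources_of:
            choices.append([(src, *args) for src in sources_of[pred]])
        else:
            choices.append([lit])  # built-ins such as "!=" stay as they are
    return [tuple(combo) for combo in product(*choices)]

phi = (("m", "x", "y"), ("m", "x", "z"), ("!=", "y", "z"))
phi1 = (("m1", "x", "y"), ("m1", "x", "z"), ("!=", "y", "z"))
phi2 = (("m2", "x", "y"), ("m2", "x", "z"), ("!=", "y", "z"))

check = unfold(phi, {"m": ["m1", "m2"]})
remaining = [d for d in check if d not in (phi1, phi2)]  # syntactic removal only

print(len(check), len(remaining))
# 4 2
```

The two remaining denials are the cross-source ones; they differ only by swapping y and z, so checking one of them is enough.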
Example 2
M = { f(i, t, r) ← m(i, t, y) ∧ r(i, r) }
ϕ1 = { ← m(i, t1, y1) ∧ m(i, t2, y2) ∧ t1 ≠ t2,
       ← m(i, t1, y1) ∧ m(i, t2, y2) ∧ y1 ≠ y2 }
ϕ1,2 = ← r(i, r) ∧ ¬m(i, t, y)
ϕ = { ← f(i, t1, r1) ∧ f(i, t2, r2) ∧ t1 ≠ t2,
      ← f(i, t1, r1) ∧ f(i, t2, r2) ∧ r1 ≠ r2 }
Global check: { ← m(i, t1, y1) ∧ r(i, r1) ∧ m(i, t2, y2) ∧ r(i, r2) ∧ t1 ≠ t2,
                ← m(i, t1, y1) ∧ r(i, r1) ∧ m(i, t2, y2) ∧ r(i, r2) ∧ r1 ≠ r2 }
              = Simp^M_{ϕ1 ∧ ϕ1,2}(ϕ)
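A data-level Python sketch of the unfolding used in Example 2: checking ϕ on the materialized view f gives the same verdict as checking the two unfolded denials directly on the sources m and r. The function names and the sample tuples are invented for the illustration; only the view definition and the constraints follow the slide. The sketch does not perform the further simplification with respect to ϕ1 and ϕ1,2.

```python
def view_f(m_rel, r_rel):
    """Materialize  f(i, t, r) <- m(i, t, y) ∧ r(i, r)  from the two sources."""
    return {(i, t, r) for (i, t, _y) in m_rel for (j, r) in r_rel if i == j}

def global_violation(f_rel):
    """True if either denial of ϕ has an answer on the global relation f."""
    return any(
        i1 == i2 and (t1 != t2 or r1 != r2)
        for (i1, t1, r1) in f_rel
        for (i2, t2, r2) in f_rel
    )

def unfolded_violation(m_rel, r_rel):
    """True if either denial of the global check has an answer on the sources."""
    return any(
        i1 == j1 == i2 == j2 and (t1 != t2 or r1 != r2)
        for (i1, t1, _y1) in m_rel
        for (j1, r1) in r_rel
        for (i2, t2, _y2) in m_rel
        for (j2, r2) in r_rel
    )

# Invented sample data: one m tuple, and r with one or two values for the same key.
m_rel = {(1, "a", 2000)}
r_consistent = {(1, "x")}
r_conflicting = {(1, "x"), (1, "y")}

for r_rel in (r_consistent, r_conflicting):
    print(global_violation(view_f(m_rel, r_rel)), unfolded_violation(m_rel, r_rel))
```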
Summary
• Express the data integration system in terms of a GaV mapping
• Reformulate the condition to check in terms of the sources by calculating a weakest precondition wrt the mapping
• Remove from it all conditions known to hold locally (plus possible cross-conditions)