
Analyzing, Comparing and Debugging Schema Mappings

Emanuel Sallinger, Vienna University of Technology, Institute of Information Systems, Database and Artificial Intelligence Group. DEIS'10, 11 November 2010.


  1. Debugging with Routes
     Route: $(I, \emptyset) \xrightarrow{\sigma_2, h} (I, \{t_4\}) \xrightarrow{\sigma_3, h'} (I, \{t_4, t_2\})$
     $s_2$: SuppCards(6689, 234, A. Long, California)
     $\sigma_2: \mathit{SuppCards}(an, s, n, a) \rightarrow \exists M, I\ \mathit{Clients}(s, n, M, I, a)$
     $t_4$: Clients(234, A. Long, $M_1$, $I_1$, California)
     $\sigma_4: \mathit{Clients}(s, n, m, i, a) \rightarrow \exists N, L\ \mathit{Accounts}(N, L, s)$
     $t_2$: Accounts($N_1$, 50K, 234)

  3. Debugging with Routes
     Route: $(I, \emptyset) \xrightarrow{\sigma_2, h} (I, \{t_4\}) \xrightarrow{\sigma_3, h'} (I, \{t_4, t_2\})$
     $s_1$: Cards(6689, 15K, 434, J. Long, Smith, 50K, Seattle)
     $s_2$: SuppCards(6689, 234, A. Long, California)
     $\sigma'_2: \mathit{Cards}(cn, l, s_1, n_1, m, sal, loc) \wedge \mathit{SuppCards}(cn, s_2, n_2, a) \rightarrow \exists M, I\ (\mathit{Clients}(s_2, n_2, M, I, a) \wedge \mathit{Accounts}(cn, l, s_2))$
     $t'_2$: Accounts(6689, 15K, 234)

  4. Computing Routes
     In general, a single route is not sufficient for analyzing and debugging.

  11. Computing Routes
      Route: $(\{s_1\}, \emptyset) \xrightarrow{\sigma_1, h} (\{s_1\}, \{t_1, t_3\})$
      $s_1$: Cards(6689, 15K, 434, J. Long, Smith, 50K, Seattle)
      $\sigma_1: \mathit{Cards}(cn, l, s, n, m, sal, loc) \rightarrow \exists A\ (\mathit{Accounts}(cn, l, s) \wedge \mathit{Clients}(s, m, m, sal, A))$
      $t_1$: Accounts(6689, 15K, 434)
      $t_3$: Clients(434, Smith, Smith, 50K, $A_1$)

  12. Computing Routes
      $s_1$: Cards(6689, 15K, 434, J. Long, Smith, 50K, Seattle)
      $\sigma_1: \mathit{Cards}(cn, l, s, n, m, sal, loc) \rightarrow \exists A\ (\mathit{Accounts}(cn, l, s) \wedge \mathit{Clients}(s, m, m, sal, A))$
      $t_1$: Accounts(6689, 15K, 434)
      $t_3$: Clients(434, Smith, Smith, 50K, $A_1$)
      The homomorphism $h$ is built step by step:
      1. Map an atom from $\psi(\vec{x}, \vec{y})$ to $t$. Add it to $h$.
         $h = \{cn \mapsto 6689,\ l \mapsto 15K,\ s \mapsto 434\}$
      2. Map $\varphi(\vec{x})h$ to $I/J$. Add it to $h$.
         $h = \{cn \mapsto 6689,\ l \mapsto 15K,\ s \mapsto 434,\ n \mapsto \text{J. Long},\ m \mapsto \text{Smith},\ sal \mapsto 50K,\ loc \mapsto \text{Seattle}\}$
      3. Map $\psi(\vec{x}, \vec{y})h$ to $J$. Add it to $h$.
         $h = \{cn \mapsto 6689,\ l \mapsto 15K,\ s \mapsto 434,\ n \mapsto \text{J. Long},\ m \mapsto \text{Smith},\ sal \mapsto 50K,\ loc \mapsto \text{Seattle},\ A \mapsto A_1\}$
      4. Return $h$.

  19. Computing Routes
      Algorithm [CT06] (sketch): FindHom($I$, $J$, $t$, $\sigma$)
      1. Map an atom from $\psi(\vec{x}, \vec{y})$ to $t$. Add it to $h$.
      2. Map $\varphi(\vec{x})h$ to $I/J$. Add it to $h$.
      3. Map $\psi(\vec{x}, \vec{y})h$ to $J$. Add it to $h$.
      4. Return $h$.
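The four steps of the FindHom sketch can be illustrated in Python. This is a minimal sketch under stated assumptions, not the algorithm of [CT06] itself: it handles only a single s-t tgd (so the antecedent is mapped into $I$, never $J$), it assumes every term in the tgd is a variable, and the names `match`, `extend` and `find_hom` are hypothetical helpers introduced here.

```python
from typing import Dict, Iterator, List, Optional, Tuple

Atom = Tuple[str, Tuple[str, ...]]  # (relation, terms); every term is a variable
Fact = Tuple[str, Tuple[str, ...]]  # (relation, values)
Hom = Dict[str, str]


def match(atom: Atom, fact: Fact, h: Hom) -> Optional[Hom]:
    """Extend h so that atom maps onto fact; return None on a clash."""
    (rel, terms), (frel, vals) = atom, fact
    if rel != frel or len(terms) != len(vals):
        return None
    out = dict(h)
    for term, val in zip(terms, vals):
        if out.setdefault(term, val) != val:
            return None
    return out


def extend(atoms: List[Atom], facts: List[Fact], h: Hom) -> Iterator[Hom]:
    """Yield every extension of h that maps all atoms into facts."""
    if not atoms:
        yield h
        return
    for fact in facts:
        h2 = match(atoms[0], fact, h)
        if h2 is not None:
            yield from extend(atoms[1:], facts, h2)


def find_hom(I, J, t, phi, psi) -> Optional[Hom]:
    """Find h with phi*h in I, psi*h in J and t in psi*h (s-t tgd only)."""
    for atom in psi:                       # step 1: map a conclusion atom to t
        h1 = match(atom, t, {})
        if h1 is None:
            continue
        for h2 in extend(phi, I, h1):      # step 2: map the antecedent into I
            for h3 in extend(psi, J, h2):  # step 3: map the conclusion into J
                return h3                  # step 4: return h
    return None


# The running example: s1, t1, t3 and sigma_1 from the slides.
I = [("Cards", ("6689", "15K", "434", "J. Long", "Smith", "50K", "Seattle"))]
J = [("Accounts", ("6689", "15K", "434")),
     ("Clients", ("434", "Smith", "Smith", "50K", "A1"))]
phi = [("Cards", ("cn", "l", "s", "n", "m", "sal", "loc"))]
psi = [("Accounts", ("cn", "l", "s")), ("Clients", ("s", "m", "m", "sal", "A"))]

h = find_hom(I, J, J[0], phi, psi)  # explain tuple t1
```

Backtracking over candidate facts keeps the sketch short; it makes no attempt at the efficiency a real route-computation tool would need.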

  20. Outline
      Analyzing and Debugging
      – Debugging with Routes
      – Computing Routes
      Optimizing with Logical Equivalence
      Optimizing with Relaxed Notions of Equivalence
      Comparing Schema Mappings

  22. Comparing and Optimizing
      Optimization: finding a "better" schema mapping that is still "equivalent".
      [Diagram: mappings $\Sigma$ and $\Sigma'$, each from S to T, related by $\equiv_{\log}$]
      Definition [FKNP08]: $M$ and $M'$ are logically equivalent if, for every instance $(I, J)$,
      $(I, J) \models M \Leftrightarrow (I, J) \models M'$

  24. Optimality Criteria
      Relations: Lecture(title, year, prof) in the source schema S; Course(title, prof-area) and Equal-Year(course1, course2) in the target schema T.
      Example:
      $L(x_1, x_2, x_3) \rightarrow \exists y_1, y_2\ (C(y_1, y_2) \wedge C(x_1, y_2))$
      $L(x_1, x_2, x_3) \wedge L(x_4, x_5, x_6) \rightarrow \exists y_2\ C(x_1, y_2)$
      Optimality criteria:
      – Minimize the number of atoms in each conclusion
      – Minimize the number of existentially quantified variables
      – Minimize the number of atoms in each antecedent
      – Minimize the number of dependencies

  30. Optimality Criteria
      Relations: Lecture(title, year, prof) in the source schema S; Course(title, prof-area) and Equal-Year(course1, course2) in the target schema T.
      Example: consider the set $\Sigma$ of s-t tgds:
      $L(x_1, x_2, x_3) \rightarrow \exists y\ C(x_1, y)$
      $L(x_1, x_2, x_3) \wedge L(x_4, x_2, x_5) \rightarrow E(x_1, x_4)$
      Equivalent set of s-t tgds $\Sigma'$:
      $L(x_1, x_2, x_3) \wedge L(x_4, x_2, x_5) \rightarrow \exists y\ (C(x_1, y) \wedge E(x_1, x_4))$

  32. Example (continued)
      Consider the set $\Sigma$ of s-t tgds:
      $L(x_1, x_2, x_3) \rightarrow \exists y\ C(x_1, y)$
      $L(x_1, x_2, x_3) \wedge L(x_4, x_2, x_5) \rightarrow E(x_1, x_4)$
      Equivalent set of s-t tgds $\Sigma'$:
      $L(x_1, x_2, x_3) \wedge L(x_4, x_2, x_5) \rightarrow \exists y\ (C(x_1, y) \wedge E(x_1, x_4))$
      Observation (canonical universal solution):
      – for $\Sigma$: one tuple in the C-relation per tuple in the L-relation
      – for $\Sigma'$: in total, quadratically many tuples in the C-relation
      Optimality criterion: splitting should be applied whenever possible.
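The size difference in the observation can be checked on a small instance. The sketch below assumes that the canonical universal solution is produced by firing each s-t tgd once per match of its antecedent, with fresh labelled nulls for the existential variables; the helper names `homs`, `chase` and `count_c`, and the sample data, are hypothetical.

```python
from itertools import count


def homs(atoms, facts, h=None):
    """Yield every assignment of variables that maps all atoms into facts."""
    h = {} if h is None else h
    if not atoms:
        yield h
        return
    (rel, terms), rest = atoms[0], atoms[1:]
    for frel, vals in facts:
        if frel != rel or len(vals) != len(terms):
            continue
        h2 = dict(h)
        if all(h2.setdefault(t, v) == v for t, v in zip(terms, vals)):
            yield from homs(rest, facts, h2)


def chase(source, tgds):
    """Fire every tgd once per antecedent match, inventing fresh nulls."""
    fresh = count(1)
    target = set()
    for phi, psi, exist_vars in tgds:
        for h in homs(phi, source):
            h = dict(h)
            for y in exist_vars:
                h[y] = f"N{next(fresh)}"  # labelled null
            for rel, terms in psi:
                target.add((rel, tuple(h[t] for t in terms)))
    return target


def count_c(target):
    return sum(1 for rel, _ in target if rel == "C")


L1 = ("L", ("x1", "x2", "x3"))
L2 = ("L", ("x4", "x2", "x5"))
sigma = [([L1], [("C", ("x1", "y"))], ["y"]),
         ([L1, L2], [("E", ("x1", "x4"))], [])]
sigma_prime = [([L1, L2], [("C", ("x1", "y")), ("E", ("x1", "x4"))], ["y"])]

# Four lectures held in the same year (made-up data).
source = [("L", (f"course{i}", "2010", f"prof{i}")) for i in range(4)]

c_sigma = count_c(chase(source, sigma))              # one per L-tuple
c_sigma_prime = count_c(chase(source, sigma_prime))  # one per pair of L-tuples
```

With four lectures in the same year, the split form yields four C-tuples while the unsplit form yields sixteen, one per pair of matching L-tuples.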

  35. Optimality Criteria
      Splitting: splitting should be applied whenever possible.
      Optimization goals:
      – cardinality-minimality: the number of dependencies shall be minimal
      – antecedent-minimality: the total size of the antecedents shall be minimal
      – conclusion-minimality: the total size of the conclusions shall be minimal
      – variable-minimality: the total number of existentially quantified variables shall be minimal
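The slides do not spell out the splitting rule. A common reading, assumed here, is that two conclusion atoms must stay in the same tgd exactly when they are connected through shared existentially quantified variables, and that each connected component becomes its own tgd with the original antecedent. The function name `split_conclusion` is hypothetical.

```python
def split_conclusion(atoms, exist_vars):
    """Group conclusion atoms into components linked by existential variables."""
    components = []  # list of (atoms, existential variables used by them)
    for atom in atoms:
        _, terms = atom
        vars_here = {t for t in terms if t in exist_vars}
        merged_atoms, merged_vars = [atom], set(vars_here)
        remaining = []
        for comp_atoms, comp_vars in components:
            if comp_vars & vars_here:
                # The new atom shares a variable with this component: merge.
                merged_atoms = comp_atoms + merged_atoms
                merged_vars |= comp_vars
            else:
                remaining.append((comp_atoms, comp_vars))
        components = remaining + [(merged_atoms, merged_vars)]
    return [comp_atoms for comp_atoms, _ in components]


# Conclusion of Sigma' from the example: C(x1, y) and E(x1, x4) share no
# existential variable, so they end up in separate tgds.
parts = split_conclusion([("C", ("x1", "y")), ("E", ("x1", "x4"))], {"y"})
```

Each returned group is then paired with the original antecedent, which later rules may simplify further.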

  37. Optimizing Schema Mappings
      Rewrite system for s-t tgds [GPS09]:
      1. Simplification of the conclusion (core computation)
      2. Simplification of the antecedent (core computation)
      3. Splitting
      4. Deletion of an s-t tgd (implication test)
      5. Simplification of the conclusion using other tgds (implication test)
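Rule 1 can be illustrated with a brute-force core computation. This is only a sketch for tiny conclusions, not the procedure of [GPS09]: it tries every mapping of the existential variables to terms of the conclusion and keeps the smallest image that is still a subset of the original atoms. The name `conclusion_core` is hypothetical, and the variables $y_i$ are assumed to be existentially quantified.

```python
from itertools import product


def conclusion_core(atoms, exist_vars):
    """Return a smallest subset of atoms that the whole conclusion maps into."""
    atom_set = set(atoms)
    terms = sorted({t for _, ts in atoms for t in ts})
    ys = sorted(exist_vars)
    best = atom_set
    for image in product(terms, repeat=len(ys)):
        h = dict(zip(ys, image))  # universal variables and constants stay fixed
        mapped = {(rel, tuple(h.get(t, t) for t in ts)) for rel, ts in atoms}
        if mapped <= atom_set and len(mapped) < len(best):
            best = mapped
    return best


# Conclusion P(x1, y1, 3), R(y1, x2, 3), R(y1, x2, y2) with y1, y2 existential.
core = conclusion_core(
    [("P", ("x1", "y1", "3")), ("R", ("y1", "x2", "3")), ("R", ("y1", "x2", "y2"))],
    {"y1", "y2"},
)
```

Mapping $y_2$ to the constant 3 folds $R(y_1, x_2, y_2)$ onto $R(y_1, x_2, 3)$, so the last atom is redundant. The enumeration is exponential in the number of existential variables, which is acceptable only for illustration.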

  38. Optimizing Schema Mappings
      $L(x_1, x_2, x_3) \rightarrow P(x_1, y_1, 3) \wedge R(y_1, x_2, 3) \wedge R(y_1, x_2, y_2)$
      $L(x_1, x_1, x_1) \rightarrow P(x_1, y_1, y_2) \wedge Q(y_2, y_3, x_1) \wedge R(y_1, x_1, y_2)$
      $L(x_1, x_2, x_2) \wedge L(x_1, x_2, x_3) \rightarrow P(x_1, y_2, y_1) \wedge Q(y_1, y_3, x_2) \wedge Q(3, y_3, x_2) \wedge R(x_2, y_4, x_3)$
      $L(x_1, x_2, x_2) \wedge L(x_1, x_2, x_3) \rightarrow R(x_2, y_4, x_3)$
      Rule applied next: 1. Simplification of the conclusion (core computation)

  40. Optimizing Schema Mappings
      $L(x_1, x_2, x_3) \rightarrow P(x_1, y_1, 3) \wedge R(y_1, x_2, 3)$
      $L(x_1, x_1, x_1) \rightarrow P(x_1, y_1, y_2) \wedge Q(y_2, y_3, x_1) \wedge R(y_1, x_1, y_2)$
      $L(x_1, x_2, x_2) \wedge L(x_1, x_2, x_3) \rightarrow P(x_1, y_2, y_1) \wedge Q(y_1, y_3, x_2) \wedge Q(3, y_3, x_2) \wedge R(x_2, y_4, x_3)$
      $L(x_1, x_2, x_2) \wedge L(x_1, x_2, x_3) \rightarrow R(x_2, y_4, x_3)$
      Rule applied next: 3. Splitting

  44. Optimizing Schema Mappings
      $L(x_1, x_2, x_3) \rightarrow P(x_1, y_1, 3) \wedge R(y_1, x_2, 3)$
      $L(x_1, x_1, x_1) \rightarrow P(x_1, y_1, y_2) \wedge Q(y_2, y_3, x_1) \wedge R(y_1, x_1, y_2)$
      $L(x_1, x_2, x_2) \wedge L(x_1, x_2, x_3) \rightarrow P(x_1, y_2, y_1) \wedge Q(y_1, y_3, x_2) \wedge Q(3, y_3, x_2)$
      $L(x_1, x_2, x_2) \wedge L(x_1, x_2, x_3) \rightarrow R(x_2, y_4, x_3)$
      Rule applied next: 2. Simplification of the antecedent (core computation)

  47. Optimizing Schema Mappings
      $L(x_1, x_2, x_3) \rightarrow P(x_1, y_1, 3) \wedge R(y_1, x_2, 3)$
      $L(x_1, x_1, x_1) \rightarrow P(x_1, y_1, y_2) \wedge Q(y_2, y_3, x_1) \wedge R(y_1, x_1, y_2)$
      $L(x_1, x_2, x_2) \rightarrow P(x_1, y_2, y_1) \wedge Q(y_1, y_3, x_2) \wedge Q(3, y_3, x_2)$
      $L(x_1, x_2, x_2) \wedge L(x_1, x_2, x_3) \rightarrow R(x_2, y_4, x_3)$
      Rule applied next: 5. Simplification of the conclusion using other tgds (implication test)

  49. Optimizing Schema Mappings
      $L(x_1, x_2, x_3) \rightarrow P(x_1, y_1, 3) \wedge R(y_1, x_2, 3)$
      $L(x_1, x_1, x_1) \rightarrow P(x_1, y_1, y_2) \wedge Q(y_2, y_3, x_1) \wedge R(y_1, x_1, y_2)$
      $L(x_1, x_2, x_2) \rightarrow Q(3, y_3, x_2)$
      $L(x_1, x_2, x_2) \wedge L(x_1, x_2, x_3) \rightarrow R(x_2, y_4, x_3)$
      Rule applied next: 4. Deletion of an s-t tgd (implication test)
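The implication test behind rules 4 and 5 can be sketched with the standard chase argument for s-t tgds: freeze the antecedent of the candidate tgd into a source instance, chase it with the other tgds, and check whether the candidate's conclusion maps into the result. Whether [GPS09] implements the test this way is not stated in the slides. The sketch assumes that a term is a variable exactly when it starts with a letter; all function names are hypothetical.

```python
from itertools import count


def is_var(term):
    return term[0].isalpha()  # assumption: "x1", "y2" are variables, "3" is not


def homs(atoms, facts, h):
    """Yield every extension of h that maps all atoms into facts."""
    if not atoms:
        yield h
        return
    (rel, terms), rest = atoms[0], atoms[1:]
    for frel, vals in facts:
        if frel != rel or len(vals) != len(terms):
            continue
        h2 = dict(h)
        ok = True
        for t, v in zip(terms, vals):
            if is_var(t):
                ok = h2.setdefault(t, v) == v
            else:
                ok = t == v  # constants must match exactly
            if not ok:
                break
        if ok:
            yield from homs(rest, facts, h2)


def apply(atoms, h):
    return {(rel, tuple(h.get(t, t) for t in terms)) for rel, terms in atoms}


def chase(source, tgds):
    """Fire each s-t tgd once per antecedent match, with fresh nulls."""
    fresh = count(1)
    target = set()
    for phi, psi in tgds:
        for h in homs(phi, source, {}):
            h = dict(h)
            for _, terms in psi:
                for t in terms:
                    if is_var(t) and t not in h:  # existential variable
                        h[t] = f"_N{next(fresh)}"
            target |= apply(psi, h)
    return target


def implies(tgds, tau):
    """Check whether the set tgds logically implies the s-t tgd tau."""
    phi, psi = tau
    frozen = {t: f"#{t}" for _, terms in phi for t in terms if is_var(t)}
    target = chase(apply(phi, frozen), tgds)
    return next(homs(psi, target, dict(frozen)), None) is not None


t1 = ([("L", ("x1", "x2", "x3"))], [("P", ("x1", "y1", "3")), ("R", ("y1", "x2", "3"))])
t2 = ([("L", ("x1", "x1", "x1"))],
      [("P", ("x1", "y1", "y2")), ("Q", ("y2", "y3", "x1")), ("R", ("y1", "x1", "y2"))])
t3 = ([("L", ("x1", "x2", "x2"))], [("Q", ("3", "y3", "x2"))])
t4 = ([("L", ("x1", "x2", "x2")), ("L", ("x1", "x2", "x3"))], [("R", ("x2", "y4", "x3"))])

second_is_redundant = implies([t1, t3, t4], t2)  # justifies deleting the second tgd
```

The same function gives a check for logical equivalence of two sets of s-t tgds: each tgd of one set must be implied by the other set, and vice versa.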

  52. Optimizing Schema Mappings
      $L(x_1, x_2, x_3) \rightarrow P(x_1, y_1, 3) \wedge R(y_1, x_2, 3)$
      $L(x_1, x_2, x_2) \rightarrow Q(3, y_3, x_2)$
      $L(x_1, x_2, x_2) \wedge L(x_1, x_2, x_3) \rightarrow R(x_2, y_4, x_3)$
      Result of the rewrite system:
      – among all logically equivalent split-reduced mappings, it is cardinality-, antecedent-, conclusion- and variable-minimal
      – it is a unique normal form

  54. Outline
      Analyzing and Debugging
      – Debugging with Routes
      – Computing Routes
      Optimizing with Logical Equivalence
      – Optimality Criteria
      – Optimization and Normalization
      Optimizing with Relaxed Notions of Equivalence
      Comparing Schema Mappings

  56. Equivalence
      [Diagram: mappings $\Sigma$ and $\Sigma'$, each from S to T, related by $\equiv_{\log}$]
      Definition [FKNP08]: $M$ and $M'$ are logically equivalent if, for every source instance $I$,
      $\mathrm{Sol}(I, M) = \mathrm{Sol}(I, M')$

  57. Equivalence
      $\Sigma$: $S(x) \rightarrow T(x)$, $T'(x, y) \rightarrow T'(y, x)$
      $\Sigma'$: $S(x) \rightarrow T(x)$
      $\Sigma \not\equiv_{\log} \Sigma'$
      Observation: if we are interested in typical data exchange, i.e. the universal solutions, $M'$ is "just as good as" $M$, and has smaller cardinality.

  60. Relaxed Notions of Equivalence
      DE equivalence: data-exchange (DE) equivalence does not distinguish mappings which behave in the same way for data exchange.
      [Diagram: mappings $\Sigma$ and $\Sigma'$, each from S to T, related by $\equiv_{DE}$]
      Definition [FKNP08]: $M$ and $M'$ are data-exchange equivalent if, for every source instance $I$,
      $\mathrm{UnivSol}(I, M) = \mathrm{UnivSol}(I, M')$

  62. Relaxed Notions of Equivalence
      CQ equivalence: conjunctive-query (CQ) equivalence does not distinguish mappings which behave similarly for answering conjunctive queries.
      [Diagram: mappings $\Sigma$ and $\Sigma'$, each from S to T, related by $\equiv_{CQ}$]
      Definition [FKNP08]: $M$ and $M'$ are conjunctive-query equivalent if, for every source instance $I$ and every CQ $q$, either
      $\mathrm{Sol}(I, M) = \mathrm{Sol}(I, M') = \emptyset$ or $\mathrm{cert}(q, I, M) = \mathrm{cert}(q, I, M')$

  64. Relaxed Notions of Equivalence
      [Diagram: mappings $\Sigma$ and $\Sigma'$, each from S to T, related by $\equiv_{CQ}$]
      Proposition [FKNP08]: assume that the following holds for every source instance $I$:
      $\mathrm{Sol}(I, M) \neq \emptyset \Rightarrow \mathrm{UnivSol}(I, M) \neq \emptyset$
      Then $M$ and $M'$ are conjunctive-query equivalent if, for every source instance $I$, either
      $\mathrm{Sol}(I, M) = \mathrm{Sol}(I, M') = \emptyset$ or $\mathrm{core}(I, M) = \mathrm{core}(I, M')$

  65. Hierarchy of Equivalences
      Proposition [FKNP08]: let $M = (S, T, \Sigma)$ and $M' = (S, T, \Sigma')$ be two schema mappings. Then
      $M \equiv_{\log} M' \Rightarrow M \equiv_{DE} M' \Rightarrow M \equiv_{CQ} M'$
      [Diagram with labels log, DE, CQ]

  66. Hierarchy of Equivalences
      But is this hierarchy of optimization potential proper, or does it collapse?
      [Two diagrams, each with labels log, DE, CQ]
      This of course depends on the class of schema mappings.

  67. Hierarchy of Equivalences
      (s-t tgds and target tgds)
      $\Sigma$: $T(x, y) \rightarrow T(y, x)$
      $\Sigma'$: $\emptyset$
      $\Sigma \equiv_{DE} \Sigma'$, but $\Sigma \not\equiv_{\log} \Sigma'$
      Observation: $\mathrm{UnivSol}(I, M) = \mathrm{UnivSol}(I, M') = \{\emptyset\}$. However, for any $I$, the instance $J = \{T(a, b)\}$ is a solution under the mapping without dependencies, but not under the mapping containing $T(x, y) \rightarrow T(y, x)$, because $T(b, a)$ is missing.

  70. Hierarchy of Equivalences
      (s-t tgds and target tgds)
      $\Sigma$: $S(x) \rightarrow T(x, x)$, $T(x, y) \wedge T(y, x) \rightarrow T(x, x)$
      $\Sigma'$: $S(x) \rightarrow T(x, x)$, $T(x, y) \wedge T(y, z) \wedge T(z, x) \rightarrow T(x, x)$
      $\Sigma \equiv_{CQ} \Sigma'$, but $\Sigma \not\equiv_{DE} \Sigma'$
      Observation: this is a universal solution for $I = \{S(1)\}$ under $M$, but not under $M'$:
      $J = \{T(1, 1), T(x, y), T(y, z), T(z, x)\}$
      While $J$ is universal for $M$ and $M'$, $J$ is no solution for $I$ under $M'$.
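The claim that $J$ satisfies the two-cycle rule but violates the three-cycle rule can be checked mechanically. The sketch below tests satisfaction of a full tgd (no existential variables) over a set of facts; the function names are hypothetical, and the rule variables are renamed to u, v, w so that they are not confused with the nulls x, y, z occurring in $J$.

```python
def homs(atoms, facts, h=None):
    """Yield every assignment of variables that maps all atoms into facts."""
    h = {} if h is None else h
    if not atoms:
        yield h
        return
    (rel, terms), rest = atoms[0], atoms[1:]
    for frel, vals in facts:
        if frel != rel or len(vals) != len(terms):
            continue
        h2 = dict(h)
        if all(h2.setdefault(t, v) == v for t, v in zip(terms, vals)):
            yield from homs(rest, facts, h2)


def satisfies(facts, body, head):
    """True if every match of body in facts also has its head atoms in facts."""
    return all(
        (rel, tuple(h[t] for t in terms)) in facts
        for h in homs(body, facts)
        for rel, terms in head
    )


J = {("T", ("1", "1")), ("T", ("x", "y")), ("T", ("y", "z")), ("T", ("z", "x"))}
two_cycle = ([("T", ("u", "v")), ("T", ("v", "u"))], [("T", ("u", "u"))])
three_cycle = ([("T", ("u", "v")), ("T", ("v", "w")), ("T", ("w", "u"))],
               [("T", ("u", "u"))])

ok_under_m = satisfies(J, *two_cycle)          # only T(1,1) forms a 2-cycle
ok_under_m_prime = satisfies(J, *three_cycle)  # x, y, z form a 3-cycle without T(x,x)
```

The same check confirms that $\{T(a, b)\}$ violates $T(x, y) \rightarrow T(y, x)$, since $T(b, a)$ is missing.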

  73. Hierarchy of Equivalences
      s-t tgds:
      – no additional optimization power
      – all three equivalences decidable
      s-t tgds and target tgds:
      – additional optimization power
      – DE- and CQ-equivalence undecidable
      [Diagrams with labels log, DE, CQ for each class]

  74. CQ-Equivalence to s-t tgds
      Theorem [FKNP08]: if $M$ is specified by full s-t tgds and full target tgds, then the following statements are equivalent:
      – $M$ has bounded parallel chase
      – There is an $M' \equiv_{CQ} M$ specified by full s-t tgds
      – There is an $M' \equiv_{CQ} M$ specified by s-t tgds
      – There is an $M' \equiv_{CQ} M$ specified by an SO tgd

  75. CQ-Equivalence to s-t tgds
      [Diagram: mappings $\Sigma$ and $\Sigma'$, each from S to T, related by $\equiv_{CQ}$]
      $\exists f\ \big( \forall x, y\ (S(x, y) \rightarrow T(x, y)) \wedge \forall x\ (S(x, x) \rightarrow T(x, f(x))) \wedge \forall x\ (S(x, x) \wedge x = f(x) \rightarrow W(x)) \big)$
