Logical foundations of databases, day 5
ESSLLI 2016, Bolzano, Italy
Diego Figueira, Gabriele Puppis (CNRS, LaBRI)
Recap
• Acyclic Conjunctive Queries
• Join Trees
• Evaluation of ACQ (LOGCFL-complete)
• Ears, GYO algorithm for testing acyclicity
• Tree decomposition, tree-width of CQ
• Evaluation of bounded tree-width CQs (LOGCFL-complete)
• Bounded variable fragment of FO, evaluation in PTIME
Ehrenfeucht-Fraïssé games
Duplicator: "S_1 and S_2 are n-equivalent!" Spoiler: "No, they're NOT!"
They play for n rounds on the board (S_1, S_2). At each round i:
• Spoiler chooses a node x_i from S_1 (resp. y_i from S_2)
• Duplicator answers with a node y_i from S_2 (resp. x_i from S_1), trying to maintain an isomorphism between S_1|{x_i}_i and S_2|{y_i}_i
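The rules of the game can be sketched as a brute-force search. This is only an illustration for very small graphs, not an efficient procedure; the names duplicator_survives, is_partial_iso and cycle are hypothetical helpers, and structures are assumed to be graphs given as a node list plus a set of ordered edge pairs.

```python
def is_partial_iso(E1, E2, xs, ys):
    """Check that x_i -> y_i preserves equality and edges on the marked nodes."""
    for i in range(len(xs)):
        for j in range(len(xs)):
            if (xs[i] == xs[j]) != (ys[i] == ys[j]):
                return False
            if ((xs[i], xs[j]) in E1) != ((ys[i], ys[j]) in E2):
                return False
    return True


def duplicator_survives(V1, E1, V2, E2, n, xs=(), ys=()):
    """True iff Duplicator can survive n more rounds from position (xs, ys)."""
    if not is_partial_iso(E1, E2, xs, ys):
        return False
    if n == 0:
        return True
    # Spoiler plays in S1: Duplicator needs some good answer in S2.
    for x in V1:
        if not any(duplicator_survives(V1, E1, V2, E2, n - 1, xs + (x,), ys + (y,))
                   for y in V2):
            return False
    # Spoiler plays in S2: Duplicator needs some good answer in S1.
    for y in V2:
        if not any(duplicator_survives(V1, E1, V2, E2, n - 1, xs + (x,), ys + (y,))
                   for x in V1):
            return False
    return True


def cycle(m):
    """Undirected cycle on m nodes, as (nodes, symmetric edge set)."""
    E = set()
    for i in range(m):
        E.add((i, (i + 1) % m))
        E.add(((i + 1) % m, i))
    return list(range(m)), E


V3, E3 = cycle(3)
V4, E4 = cycle(4)
print(duplicator_survives(V3, E3, V4, E4, 1))  # one round is always survivable here
print(duplicator_survives(V3, E3, V4, E4, 2))  # Spoiler picks two non-adjacent nodes of the 4-cycle
```

On a triangle versus a 4-cycle, Spoiler wins in two rounds by marking two distinct non-adjacent nodes of the 4-cycle, which the triangle cannot match.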
Ehrenfeucht-Fraïssé games
On non-isomorphic finite structures, Spoiler wins eventually… Why?
…and he often wins very quickly:
[Figure: two structures, labelled "2 n - 1 nodes" and "2 n nodes", with the moves of rounds 1 and 2 marked.]
But there are non-isomorphic infinite structures where Duplicator can survive for arbitrarily many rounds (not necessarily forever!). Any idea?
[Figure: ℤ and ℤ ⊎ ℤ, with the moves of rounds 1 and 2 marked.]
Given n, on ℤ and ℤ ⊎ ℤ: at each round i = 1, …, n, pairs of marked nodes in S_1 and S_2 must be either at equal distance or at distance ≥ 2^{n-i}.
Ehrenfeucht-Fraïssé games
Theorem [Fraïssé '50, Ehrenfeucht '60]. S_1 and S_2 are n-equivalent iff Duplicator has a strategy to survive n rounds in the EF game on S_1 and S_2.
Proof ideas for the if-direction (from Duplicator's winning strategy to n-equivalence):
Consider φ with quantifier rank n. Suppose S_1 ⊨ φ and Duplicator survives n rounds on S_1, S_2. We need to prove that S_2 ⊨ φ.
A new game to evaluate formulas…
The semantics game
Assume w.l.o.g. that φ is in negation normal form (push negations inside: ¬∀ φ ⇝ ∃ ¬φ, ¬∃ φ ⇝ ∀ ¬φ, ¬(φ ⋀ ψ) ⇝ ¬φ ⋁ ¬ψ, …).
Whether S ⊨ φ can be decided by a new game between two players, True and False:
• φ = E(x, y) → True wins if the nodes marked x and y are connected by an edge, otherwise he loses
• φ = ∃x φ'(x) → True moves by marking a node x in S, the game continues with φ'
• φ = ∀y φ'(y) → False moves by marking a node y in S, the game continues with φ'
• φ = φ_1 ⋁ φ_2 → True moves by choosing φ_1 or φ_2, the game continues with what he chose
• φ = φ_1 ⋀ φ_2 → False moves by choosing φ_1 or φ_2, the game continues with what he chose
• …
Lemma. S ⊨ φ iff True wins the semantics game.
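The game can be read as a recursive procedure: True's moves become "there is a good choice" and False's moves become "every choice is good for True". A minimal sketch, assuming formulas in negation normal form are encoded as nested tuples (the encoding and the name true_wins are illustrative, not from the slides):

```python
def true_wins(V, E, phi, marks=None):
    """Decide whether True wins the semantics game for phi on the graph (V, E).

    phi is a nested tuple in negation normal form:
      ("E", x, y), ("notE", x, y),
      ("exists", x, sub), ("forall", x, sub),
      ("or", sub1, sub2), ("and", sub1, sub2)
    marks maps variable names to the nodes currently marked.
    """
    marks = marks or {}
    op = phi[0]
    if op == "E":
        return (marks[phi[1]], marks[phi[2]]) in E
    if op == "notE":
        return (marks[phi[1]], marks[phi[2]]) not in E
    if op == "exists":   # True marks a node: one good node suffices
        return any(true_wins(V, E, phi[2], {**marks, phi[1]: v}) for v in V)
    if op == "forall":   # False marks a node: True must win for every node
        return all(true_wins(V, E, phi[2], {**marks, phi[1]: v}) for v in V)
    if op == "or":       # True chooses a disjunct
        return any(true_wins(V, E, sub, marks) for sub in phi[1:])
    if op == "and":      # False chooses a conjunct
        return all(true_wins(V, E, sub, marks) for sub in phi[1:])
    raise ValueError(f"unknown connective: {op!r}")


# Directed 3-cycle: every node has a successor, but no node points to all nodes.
V = [0, 1, 2]
E = {(0, 1), (1, 2), (2, 0)}
every_node_has_successor = ("forall", "x", ("exists", "y", ("E", "x", "y")))
some_node_points_to_all = ("exists", "x", ("forall", "y", ("E", "x", "y")))
print(true_wins(V, E, every_node_has_successor))
print(true_wins(V, E, some_node_points_to_all))
```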
Ehrenfeucht-Fraïssé games (proof idea, continued)
S_1 ⊨ φ means that True wins the semantics game on S_1; S_2 ⊨ φ means that True wins the semantics game on S_2.
So: turn a winning strategy for True in S_1 into a winning strategy for True in S_2…
Ehrenfeucht-Fraïssé games (proof idea, continued)
[Figure: the semantics games on S_1 and S_2 connected through the EF game; players True (T), False (F), Spoiler (S), Duplicator (D).]
A move of False in S_2 is replayed as a move of Spoiler in the EF game; Duplicator's answer in S_1 is replayed as the move of False in S_1. A move of True in S_1 (given by his winning strategy) is replayed as a move of Spoiler; Duplicator's answer in S_2 is the move of True in S_2. Since Duplicator maintains an isomorphism between the marked nodes, True wins on S_2 as well.
Definability in FO
Theorem [Fraïssé '50, Ehrenfeucht '60]. S_1 and S_2 are n-equivalent iff Duplicator has a strategy to survive n rounds in the EF game on S_1 and S_2.
Corollary. A property P is not definable in FO iff ∀n ∃S_1 ∈ P ∃S_2 ∉ P such that Duplicator can survive n rounds on S_1 and S_2.
Example: P = {connected graphs}. Given n, take S_1 ∈ P large enough and S_2 = S_1 ⊎ S_1 ∉ P.
[Figure: S_1 and S_2 = S_1 ⊎ S_1, with the moves of rounds 1 and 2 marked.]
Ehrenfeucht-Fraïssé games
Several properties can be proved to be not FO-definable:
• connectivity (previous slide)
• even / odd size. Your turn now! …given n, take S_1 = large even structure, S_2 = large odd structure…
• 2-colorability. Given n, take S_1 = large even cycle, S_2 = large odd cycle
• finiteness
• acyclicity
0-1 Law
A different perspective: a coarser view on expressiveness…
What percentage of graphs satisfy a given FO sentence?
0-1 Law
μ_n(P) = "probability that property P holds in a random graph with n nodes"
Uniform distribution (each pair of nodes has an edge with probability ½)
C_n = {graphs with n nodes}, |C_n| = 2^{n^2}
μ_n(P) = |{G ∈ C_n | G ⊨ P}| / |C_n|
E.g. for P = "the graph is complete": μ_3(P) = 1 / |C_3| = 1 / 2^{3^2}
μ_∞(P) = lim_{n → ∞} μ_n(P)
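For tiny n the definition of μ_n can be checked by plain enumeration. The sketch below assumes that a graph on n nodes is an arbitrary binary relation on the nodes (so that there are 2^(n^2) of them, loops included); mu and is_complete are illustrative names.

```python
from fractions import Fraction
from itertools import product


def mu(n, prop):
    """Fraction of graphs on n nodes satisfying prop, by brute-force enumeration.

    Assumption: a graph is any set of ordered pairs over {0, ..., n-1},
    so there are 2**(n*n) graphs in total.
    """
    pairs = [(i, j) for i in range(n) for j in range(n)]
    good = 0
    total = 0
    for bits in product([0, 1], repeat=len(pairs)):
        E = {p for p, b in zip(pairs, bits) if b}
        total += 1
        if prop(n, E):
            good += 1
    return Fraction(good, total)


def is_complete(n, E):
    """P = 'the graph is complete': every ordered pair is an edge."""
    return len(E) == n * n


print(mu(2, is_complete))
print(mu(3, is_complete))
```

Enumeration is only feasible for very small n, since the number of graphs grows as 2^(n^2).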
0-1 Law
Theorem [Glebskii et al. '69, Fagin '76]. For every FO sentence φ, μ_∞(φ) is either 0 or 1.
Examples:
• φ = "there is a triangle": μ_3(φ) = 1 / |C_3| and μ_{3n}(φ) ≥ 1 − (1 − 1 / |C_3|)^n → 1
• φ_H = "there is an occurrence of H as induced sub-graph": μ_∞(φ_H) = 1
• φ = "there is no 5-clique": μ_∞(φ) = 0
• φ = "even number of edges": μ_∞(φ) = 1/2
Your turn!
• φ = "even number of nodes": μ_∞(φ) is not even defined
• φ = "more edges than nodes": μ_∞(φ) = 1 (yet not FO-definable!)
0-1 Law
For every FO sentence φ, μ_∞(φ) is either 0 or 1.
Let k = quantifier rank of φ, and let
δ_k = ∀x_1, …, x_k ∀y_1, …, y_k ( ⋀_{i,j} x_i ≠ y_j ⇒ ∃z ⋀_{i,j} ( E(x_i, z) ⋀ ¬E(y_j, z) ) )   (Extension Formula/Axiom)
Fact 1: If G ⊨ δ_k and H ⊨ δ_k then Duplicator survives k rounds on G, H.
Fact 2: μ_∞(δ_k) = 1 (δ_k is almost surely true).
Two cases:
a) There is G such that G ⊨ δ_k ⋀ φ
   ⇒ (by Fact 1 and the EF theorem) ∀H: if H ⊨ δ_k then H ⊨ φ. Thus μ_∞(δ_k) ≤ μ_∞(φ)
   ⇒ (by Fact 2) μ_∞(δ_k) = 1, hence μ_∞(φ) = 1
b) There is no G such that G ⊨ δ_k ⋀ φ
   ⇒ (by Fact 2) there is G ⊨ δ_k
   ⇒ G ⊨ δ_k ⋀ ¬φ
   ⇒ (by case a) μ_∞(¬φ) = 1, hence μ_∞(φ) = 0
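On a finite graph the extension axiom can be tested directly. The sketch below is an illustration under stated assumptions: the graph is undirected and loop-free, given as a dict from nodes to neighbour sets, and z is required to be a fresh node (different from all x_i and y_j), which is the usual reading of the axiom. The function name satisfies_extension is hypothetical.

```python
from itertools import combinations


def satisfies_extension(adj, k):
    """Check delta_k on a finite undirected graph given as {node: set_of_neighbours}.

    For all disjoint non-empty node sets X, Y of size at most k, look for a
    node z outside X and Y that is adjacent to every x in X and to no y in Y.
    (Tuples x_1..x_k with repetitions correspond to sets of size 1..k.)
    """
    nodes = list(adj)
    for a in range(1, k + 1):
        for X in combinations(nodes, a):
            rest = [v for v in nodes if v not in X]
            for b in range(1, k + 1):
                for Y in combinations(rest, b):
                    found = any(
                        z not in X and z not in Y
                        and all(x in adj[z] for x in X)
                        and all(y not in adj[z] for y in Y)
                        for z in nodes
                    )
                    if not found:
                        return False
    return True


# 5-cycle: satisfies delta_1, but not delta_2 (two adjacent nodes have no common neighbour).
c5 = {i: {(i - 1) % 5, (i + 1) % 5} for i in range(5)}
print(satisfies_extension(c5, 1))
print(satisfies_extension(c5, 2))
```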
0-1 Law
For every FO sentence φ, μ_∞(φ) is either 0 or 1, and this depends on whether RADO ⊨ φ.
RADO = the unique (countable, up to isomorphism) graph that satisfies δ_k for all k. Two ways to obtain it:
• nodes are the natural numbers, and each pair of nodes i, j is connected if the i-th bit of j is 1
• each pair of nodes i, j is connected with probability 1/2
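The bit-based construction makes the extension property easy to see: a witness z can be written down explicitly. A small sketch, assuming nodes are natural numbers and the edge relation is made symmetric by testing the bit of the larger number at the position of the smaller one; rado_edge and extension_witness are illustrative names.

```python
def rado_edge(i, j):
    """Edge of the Rado graph: for i < j, connected iff the i-th bit of j is 1."""
    if i == j:
        return False
    lo, hi = sorted((i, j))
    return (hi >> lo) & 1 == 1


def extension_witness(X, Y):
    """Return a node z adjacent to every node in X and to no node in Y.

    X and Y are disjoint finite sets of natural numbers, X non-empty or not.
    z has bit x set for each x in X, plus one extra high bit so that z is
    larger than every node in X and Y.
    """
    m = max(X | Y) + 1
    return sum(1 << x for x in X) + (1 << m)


X, Y = {0, 3}, {1, 2}
z = extension_witness(X, Y)
print(z, all(rado_edge(x, z) for x in X), any(rado_edge(y, z) for y in Y))
```

Because z exceeds every node of X and Y, adjacency to each of them is read off the bits of z alone, which is why the explicit witness works for any pair of disjoint finite sets.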
0-1 Law
Theorem [Grandjean '83]. The problem of deciding whether an FO sentence is almost surely true (μ_∞ = 1) is PSPACE-complete.
[Figure: FO sentences split into almost surely false formulas (containing the unsatisfiable formulas) and almost surely true formulas (containing the valid formulas). The boundaries of the unsatisfiable and of the valid formulas are marked "undecidable"; the boundary between almost surely false and almost surely true is marked "PSPACE".]
Query evaluation on large databases: don't bother evaluating an FO query, it's either almost surely true or almost surely false!
0-1 Law
Does the 0-1 Law apply to real-life databases?
Not quite: database constraints easily spoil the Extension Axiom. Consider:
• the functional constraint ∀x, x', y, y' ( E(x, y) ⋀ E(x, y') ⇒ y = y' ) ⋀ ( E(x, y) ⋀ E(x', y) ⇒ x = x' )   (E is a permutation)
• the FO query φ = ¬∃x E(x, x)
Probability that a permutation E satisfies φ = !n / n! → e^{-1} = 0.3679…
The 0-1 Law only applies to unconstrained databases…
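The limit e^{-1} can be checked numerically by counting permutations without fixed points (derangements, counted by !n). A minimal sketch by brute-force enumeration; prob_no_fixed_point is an illustrative name.

```python
import math
from fractions import Fraction
from itertools import permutations


def prob_no_fixed_point(n):
    """Exact probability that a uniformly random permutation of n elements
    has no fixed point, i.e. satisfies the query 'no x with E(x, x)'."""
    total = 0
    good = 0
    for p in permutations(range(n)):
        total += 1
        if all(p[i] != i for i in range(n)):
            good += 1
    return Fraction(good, total)


for n in (3, 4, 8):
    p = prob_no_fixed_point(n)
    print(n, p, float(p))
print("1/e =", 1 / math.e)
```

The values approach 1/e very quickly, so the limit is neither 0 nor 1.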
Another technique: Locality
Idea: first-order logic can only express "local" properties.
Local = properties of nodes which are close to one another.
Hanf locality
Definition. The Gaifman graph of a structure S = (V, R_1, …, R_m) is the undirected graph G_S = (V, E) where
E = { (u, v) | ∃ (…, u, …, v, …) ∈ R_i for some i }
Note: the Gaifman graph of a graph G is the underlying undirected graph.
Example database:

Car           Country
Aston Martin  UK
Cadillac      USA
Mercedes      Germany
BMW           Germany

Agent  Name          Drives
007    James Bond    Aston Martin
200    Mr Smith      Cadillac
201    Mrs Smith     Mercedes
3      Jason Bourne  BMW

[Figure: Gaifman graph of the example database, with nodes 007, 200, 201, 3, James Bond, Mr Smith, Mrs Smith, Jason Bourne, Aston Martin, Cadillac, Mercedes, BMW, UK, USA, Germany.]
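The definition translates directly into code: two elements are linked whenever they occur together in some tuple. A minimal sketch on the example database; the relation names "Car" and "Agent" and the function gaifman_edges are assumptions made for illustration (the slides only show column headers), and self-loops are ignored.

```python
from itertools import combinations


def gaifman_edges(relations):
    """Edges of the Gaifman graph of a structure.

    relations maps a relation name to a list of tuples; two distinct
    elements are connected iff they appear together in some tuple.
    """
    edges = set()
    for tuples in relations.values():
        for t in tuples:
            for u, v in combinations(set(t), 2):
                edges.add(frozenset((u, v)))
    return edges


# Example database (relation names are hypothetical).
db = {
    "Car": [("Aston Martin", "UK"), ("Cadillac", "USA"),
            ("Mercedes", "Germany"), ("BMW", "Germany")],
    "Agent": [("007", "James Bond", "Aston Martin"),
              ("200", "Mr Smith", "Cadillac"),
              ("201", "Mrs Smith", "Mercedes"),
              ("3", "Jason Bourne", "BMW")],
}

edges = gaifman_edges(db)
print(frozenset(("007", "James Bond")) in edges)  # same tuple: connected
print(frozenset(("James Bond", "UK")) in edges)   # only linked through Aston Martin
```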
Hanf locality
• dist(u, v) = distance between u and v in the Gaifman graph
• S[u, r] = sub-structure induced by { v | dist(u, v) ≤ r } = ball around u of radius r
[Figure: the example database and its Gaifman graph, with balls around several nodes u highlighted.]
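The node set of a ball S[u, r] can be computed by a breadth-first search that stops at depth r. A minimal sketch, assuming the Gaifman graph is given as a dict from nodes to neighbour sets; the function name ball and the small adjacency fragment below (one row of each example table) are illustrative.

```python
from collections import deque


def ball(adj, u, r):
    """Nodes at distance at most r from u in the graph adj ({node: neighbours})."""
    dist = {u: 0}
    queue = deque([u])
    while queue:
        v = queue.popleft()
        if dist[v] == r:
            continue  # do not explore beyond radius r
        for w in adj.get(v, ()):
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return set(dist)


# Fragment of the example Gaifman graph around James Bond.
adj = {
    "007": {"James Bond", "Aston Martin"},
    "James Bond": {"007", "Aston Martin"},
    "Aston Martin": {"007", "James Bond", "UK"},
    "UK": {"Aston Martin"},
}

print(ball(adj, "UK", 1))
print(ball(adj, "UK", 2))
```

The ball S[u, r] itself is the sub-structure induced by this node set, i.e. the tuples of each relation whose elements all lie in it.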
Hanf locality
Definition. Two structures S_1 and S_2 are Hanf(r, t)-equivalent iff for each structure B, the two numbers
  #u s.t. S_1[u, r] ≅ B
  #v s.t. S_2[v, r] ≅ B
are either the same or both ≥ t.
Example. S_1, S_2 are Hanf(1, 1)-equivalent iff they have the same balls of radius 1 (up to isomorphism).
Hanf locality
Example. K_n, K_{n+1} are not Hanf(1, 1)-equivalent: every ball of radius 1 in K_n is K_n itself, so for B = K_n there are n such balls in K_n and none in K_{n+1}.