Repairing Entities using Star Constraints in Multi-relational Graphs
Peng Lin¹, Qi Song¹, Yinghui Wu²,³, Jiaxing Pi⁴
Erroneous entities: how to capture?
• Multi-relational graphs: a labeled graph with attributes on nodes
• Entity errors: incorrect node attributes
• Semantics: relevant paths from a center node
  "For a stadium and a facility relevant to a player ($v_1$) from the Premier League, if they have the same owner, then they should be located in the same city."
[Figure: Graph G, a football database. Node types: Player (name: VanPersie; name: Rooney), Coach (name: Wenger), Club (name: AFC; name: MU), Stadium, Facility. Edge labels: playsFor, teammate, coachedBy, operates, trainsAt, worksAt. Stadium and Facility nodes as (name, owner, city): (ATC, AHP, LDN), (EM, AHP, BZ), (OT, MUP, MAN), (AON, MUP, LD).]
Regular path queries
• Regular expressions $R$ over edge labels $l$: built with repetition (as in $\text{teammate}^{\le 1}$), concatenation ($R \cdot R$), and union ($R \cup R$)
• Paths from Player to Stadium: $R_1 = (\text{playsFor} \cdot \text{operates}) \cup (\text{coachedBy} \cdot \text{worksAt})$
• Paths from Player to Facility: $R_2 = (\text{playsFor} \cdot \text{operates}) \cup (\text{teammate}^{\le 1} \cdot \text{trainsAt})$
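As a rough illustration of how such path queries could be evaluated, the Python sketch below treats a query as a union of label sequences, where each step carries a repetition range. The edge list and node types are a hypothetical reading of the football graph, the superscript on teammate is read as "at most one occurrence" (an assumption), and `follow`, `eval_query`, and `of_type` are illustrative names rather than part of the presented system.

```python
from typing import Dict, List, Set, Tuple

Edge = Tuple[str, str, str]    # (source, label, target)
Step = Tuple[str, int, int]    # (label, min repeats, max repeats)
PathQuery = List[List[Step]]   # union of concatenations of steps


def follow(edges: List[Edge], frontier: Set[str], label: str) -> Set[str]:
    """Nodes reachable from `frontier` over one edge carrying `label`."""
    return {t for (s, l, t) in edges if l == label and s in frontier}


def eval_query(edges: List[Edge], start: str, query: PathQuery) -> Set[str]:
    """Nodes reachable from `start` along any alternative of `query`."""
    result: Set[str] = set()
    for alternative in query:
        frontier = {start}
        for label, lo, hi in alternative:
            for _ in range(lo):           # mandatory repetitions
                frontier = follow(edges, frontier, label)
            reached = set(frontier)
            for _ in range(hi - lo):      # optional extra repetitions
                frontier = follow(edges, frontier, label)
                reached |= frontier
            frontier = reached
        result |= frontier
    return result


def of_type(nodes: Set[str], node_type: Dict[str, str], wanted: str) -> Set[str]:
    """Keep only the nodes whose type label equals `wanted`."""
    return {n for n in nodes if node_type[n] == wanted}


# Hypothetical edges and types: an assumed reading of the football graph.
EDGES: List[Edge] = [
    ("VanPersie", "playsFor", "AFC"), ("VanPersie", "playsFor", "MU"),
    ("VanPersie", "teammate", "Rooney"), ("VanPersie", "coachedBy", "Wenger"),
    ("AFC", "operates", "EM"), ("MU", "operates", "OT"),
    ("Wenger", "worksAt", "EM"),
    ("VanPersie", "trainsAt", "ATC"), ("Rooney", "trainsAt", "AON"),
]
NODE_TYPE = {
    "VanPersie": "Player", "Rooney": "Player", "Wenger": "Coach",
    "AFC": "Club", "MU": "Club",
    "EM": "Stadium", "OT": "Stadium", "ATC": "Facility", "AON": "Facility",
}

R1: PathQuery = [[("playsFor", 1, 1), ("operates", 1, 1)],
                 [("coachedBy", 1, 1), ("worksAt", 1, 1)]]
R2: PathQuery = [[("playsFor", 1, 1), ("operates", 1, 1)],
                 [("teammate", 0, 1), ("trainsAt", 1, 1)]]

stadiums = of_type(eval_query(EDGES, "VanPersie", R1), NODE_TYPE, "Stadium")
facilities = of_type(eval_query(EDGES, "VanPersie", R2), NODE_TYPE, "Facility")
```

The type filter is applied after path evaluation because a path expression alone may reach nodes of other types; here the first alternative of R2 also reaches stadiums.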
Contributions
• StarFDs: star functional dependencies, new constraints for graphs
• Entity repair problem: minimum editing cost; NP-hard and APX-hard
• StarRepair framework: a feasible framework with provable guarantees whenever possible
  - Input: graph $G$ and StarFDs $\Sigma$ ($G$ does not satisfy $\Sigma$)
  - Steps: error detection, then repair
  - Output: repair $G'$ ($G'$ satisfies $\Sigma$)
• Repair workflow
  - Is approximable? No: heuristic solution. Yes: continue to the next check.
  - Is optimal repairable? Yes: optimal solution. No: approximation solution.
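The repair workflow reads as a two-question decision procedure. The sketch below encodes that reading in Python; the function name and the two boolean inputs are hypothetical stand-ins for the checks the framework performs, and the branch layout is inferred from the flattened flowchart.

```python
def choose_repair_strategy(is_approximable: bool, is_optimal_repairable: bool) -> str:
    """Pick a repair strategy following the two checks of the workflow.

    Both flags are assumed to be computed elsewhere; this only shows the
    order in which the checks are consulted.
    """
    if not is_approximable:
        return "heuristic"
    if is_optimal_repairable:
        return "optimal"
    return "approximation"
```

Note that the second check is only consulted for approximable instances, so a non-approximable instance always falls through to the heuristic solution.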
Star constraints
StarFDs: $\varphi = (P(v_c), X \rightarrow Y)$
• Star pattern $P(v_c)$:
  - A two-level tree with center node $v_c$
  - Each branch is a regular expression
• Value constraints $X \rightarrow Y$:
  - $X$ and $Y$ are two sets of literals
  - Literals: $v.A = c$, or $v.A = v'.A'$
Example: center node $v_c$ (Player), with branch $R_1$ to $v_1$ (Stadium) and branch $R_2$ to $v_2$ (Facility)
  $R_1 = (\text{playsFor} \cdot \text{operates}) \cup (\text{coachedBy} \cdot \text{worksAt})$
  $R_2 = (\text{playsFor} \cdot \text{operates}) \cup (\text{teammate}^{\le 1} \cdot \text{trainsAt})$
  $X$: $v_c.\text{league} = \text{EPL}$, $v_1.\text{owner} = v_2.\text{owner}$
  $Y$: $v_1.\text{city} = v_2.\text{city}$
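To make the two literal forms concrete, here is a minimal Python sketch that checks a value constraint X → Y on a single match of the star pattern. The `Literal` class, the `satisfies` helper, and the pattern-node names `vc`, `v1`, `v2` are illustrative choices; treating a missing attribute as "literal does not hold" is an assumption, and the `league` value is made up for the example.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

Attrs = Dict[str, str]
Match = Dict[str, Attrs]   # pattern node -> attributes of the matched graph node


@dataclass(frozen=True)
class Literal:
    node: str
    attr: str
    const: Optional[str] = None        # set for constant literals v.A = c
    other_node: Optional[str] = None   # set for variable literals v.A = v'.A'
    other_attr: Optional[str] = None

    def holds(self, match: Match) -> bool:
        left = match[self.node].get(self.attr)
        if left is None:               # assumption: missing value never matches
            return False
        if self.const is not None:
            return left == self.const
        return left == match[self.other_node].get(self.other_attr)


def satisfies(match: Match, X: List[Literal], Y: List[Literal]) -> bool:
    """X -> Y holds on a match unless all of X hold and some literal of Y fails."""
    return not all(l.holds(match) for l in X) or all(l.holds(match) for l in Y)


X = [Literal("vc", "league", const="EPL"),
     Literal("v1", "owner", other_node="v2", other_attr="owner")]
Y = [Literal("v1", "city", other_node="v2", other_attr="city")]

# Owner and city values borrowed from the football example; league is assumed.
bad_match = {"vc": {"league": "EPL"},
             "v1": {"owner": "AHP", "city": "BZ"},
             "v2": {"owner": "AHP", "city": "LDN"}}
ok_match = {"vc": {"league": "EPL"},
            "v1": {"owner": "AHP", "city": "LDN"},
            "v2": {"owner": "AHP", "city": "LDN"}}
```

A match where the owners differ satisfies the constraint vacuously, since X does not hold.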
Star constraints
• Matching semantics: maximum set matched by star pattern
  - In graph $G$, the center node of the pattern matches one Player node, and the Stadium and Facility nodes of the pattern each match two nodes
• Inconsistencies $\mathcal{V}$: matches for which $X$ holds but $Y$ does not hold
Star pattern $P(v_c)$ with $X$: $v_c.\text{league} = \text{EPL}$, $v_1.\text{owner} = v_2.\text{owner}$ and $Y$: $v_1.\text{city} = v_2.\text{city}$
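Given the match sets of each branch, inconsistencies can be enumerated by pairing one matched node per branch and testing X and Y. The Python sketch below does this for the football example; the branch match sets, the `league` attribute, and the function names are assumptions made for illustration, while owner and city values are taken from the graph shown on the slide.

```python
from itertools import product
from typing import Callable, Dict, List, Set

Attrs = Dict[str, Dict[str, str]]   # graph node -> attribute values
Match = Dict[str, str]              # pattern node -> graph node


def detect(center: str,
           branch_matches: Dict[str, Set[str]],
           attrs: Attrs,
           x_holds: Callable[[Match, Attrs], bool],
           y_holds: Callable[[Match, Attrs], bool]) -> List[Match]:
    """Return every match of the pattern where X holds but Y does not."""
    names = sorted(branch_matches)
    found: List[Match] = []
    for combo in product(*(sorted(branch_matches[n]) for n in names)):
        match = {"vc": center, **dict(zip(names, combo))}
        if x_holds(match, attrs) and not y_holds(match, attrs):
            found.append(match)
    return found


def x_holds(m: Match, a: Attrs) -> bool:
    return (a[m["vc"]].get("league") == "EPL"
            and a[m["v1"]]["owner"] == a[m["v2"]]["owner"])


def y_holds(m: Match, a: Attrs) -> bool:
    return a[m["v1"]]["city"] == a[m["v2"]]["city"]


ATTRS: Attrs = {
    "VanPersie": {"name": "VanPersie", "league": "EPL"},  # league is assumed
    "ATC": {"owner": "AHP", "city": "LDN"},
    "EM": {"owner": "AHP", "city": "BZ"},
    "OT": {"owner": "MUP", "city": "MAN"},
    "AON": {"owner": "MUP", "city": "LD"},
}
# Assumed branch matches: v1 is the Stadium branch, v2 the Facility branch.
BRANCHES = {"v1": {"EM", "OT"}, "v2": {"ATC", "AON"}}

inconsistencies = detect("VanPersie", BRANCHES, ATTRS, x_holds, y_holds)
```

Under these assumptions the sketch flags the pairs (EM, ATC) and (OT, AON): each pair shares an owner but lists different cities.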
Summary of results

Problem: Satisfiability
  Input: $\Sigma$
  Description: decide whether there exists a graph $G$ that satisfies $\Sigma$
  Hardness: NP-complete

Problem: Implication
  Input: $\Sigma$ and $\varphi$
  Description: decide whether every $G$ that satisfies $\Sigma$ also satisfies $\varphi$
  Hardness: coNP-hard

Problem: Error detection
  Input: $G$ and $\Sigma$
  Output: all inconsistencies $\mathcal{V}$
  Hardness: PTIME
  Solution: evaluate regular path queries, then validate values; time polynomial in $|\Sigma|$, $|V|$, and $|E|$

Problem: Repair
  Input: $\Sigma$ and $G$ that does not satisfy $\Sigma$
  Output: $G'$ that satisfies $\Sigma$ with least repair cost
  Hardness: NP-hard, APX-hard
  Solution:
  - Approximable cases (PTIME checkable): approximation ratio and time complexity both expressed in terms of $|\mathcal{V}|$ and $|\Sigma|$
  - Optimal cases: time complexity $O(|\mathcal{V}||\Sigma|)$
  - Heuristic cases: same time complexity as the approximable cases; bounded repairable, with cost $\le |\mathcal{V}|$

Notations
  $G$: graph; $V$: nodes; $E$: edges
  $\Sigma$: a set of StarFDs; $\varphi$: a single StarFD; $\mathcal{V}$: all inconsistencies
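Updates and repairs
• Updates $O$: a set of operators $o$, each editing the value of a node attribute $v.A$, with editing cost $\text{cost}(O) = \sum_{o \in O} \text{cost}(o)$
• Repair: a set of updates $O$ such that applying $O$ to $G$ yields a graph $G'$ that satisfies $\Sigma$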
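The additive cost model can be sketched in a few lines of Python. The `Update` record, the unit costs, and the choice of corrected city values below are all hypothetical; the slide does not define how the cost of a single operator is set.

```python
from dataclasses import dataclass
from typing import Dict, Iterable

Attrs = Dict[str, Dict[str, str]]   # graph node -> attribute values


@dataclass(frozen=True)
class Update:
    node: str
    attr: str
    new_value: str
    cost: float   # assumed per-operator cost


def apply_updates(attrs: Attrs, updates: Iterable[Update]) -> Attrs:
    """Return a repaired copy of `attrs`; the input is left unchanged."""
    repaired = {node: dict(values) for node, values in attrs.items()}
    for u in updates:
        repaired[u.node][u.attr] = u.new_value
    return repaired


def total_cost(updates: Iterable[Update]) -> float:
    """Editing cost of a set of updates: the sum of the operator costs."""
    return sum(u.cost for u in updates)


attrs: Attrs = {
    "EM": {"owner": "AHP", "city": "BZ"},
    "AON": {"owner": "MUP", "city": "LD"},
}
# Illustrative repair only: the target values are assumed, not from the slides.
updates = [Update("EM", "city", "LDN", 1.0), Update("AON", "city", "MAN", 1.0)]

repaired = apply_updates(attrs, updates)
cost = total_cost(updates)
```

Working on a copy keeps the original graph available, which makes it easy to compare candidate repairs by cost before committing to one.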