Evaluating Reachability Queries over Large Social Graphs
Imen BEN DHIA
Advisors: Talel ABDESSALEM, Mauro SOZIO
Télécom ParisTech

Outline
Introduction to Reachability and Applications
Existing Approaches
Evaluating Access Control Reachability Queries
• Reachability backbone discovery
• 2-hop index construction
• Answering queries
Ongoing Work

Introduction to reachability
Use cases:
[Figure: paths between users U and V over edges labeled "colleague", "friend", and "babysitter", captioned "Privacy preference" and "Constrained reachability query".]
Evaluating privacy policies means evaluating constrained reachability queries:
• 2 to 3 different labels
• Distance (up to 4) according to real-world scenarios

Applications
• Social networks
• Bioinformatics

Constrained Reachability Problem
The problem: given two vertices u and v in a directed graph G, is v reachable from u via a given path?
A path is a sequence of constraints on label order and distance.
[Figure: directed graph with nodes 1 to 13 and edge labels a to g.]
Example queries on this graph:
• Query(1, a\a\b, 11): Yes
• Query(3, a\a\b, 9): No
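
A minimal sketch of such a query, assuming the path constraint is an exact sequence of edge labels (so its length also fixes the distance). The edge list is a small hypothetical graph, not the one drawn on the slide, and the names `build_graph` and `query` are illustrative.

```python
from collections import defaultdict


def build_graph(edges):
    """Index (source, label, target) triples as adj[source][label] -> set of targets."""
    adj = defaultdict(lambda: defaultdict(set))
    for src, label, dst in edges:
        adj[src][label].add(dst)
    return adj


def query(adj, u, labels, v):
    """Return True if some path from u to v spells exactly the given label sequence."""
    frontier = {u}
    for label in labels:
        # Follow only edges carrying the next required label.
        frontier = {w for x in frontier for w in adj.get(x, {}).get(label, ())}
        if not frontier:
            return False
    return v in frontier


# Hypothetical edges chosen for illustration only.
edges = [(1, "a", 3), (3, "a", 6), (6, "b", 11), (3, "c", 4), (4, "b", 9)]
graph = build_graph(edges)

print(query(graph, 1, ["a", "a", "b"], 11))
print(query(graph, 3, ["a", "a", "b"], 9))
```

Each step keeps only the nodes reachable under the label prefix read so far, so the cost grows with the size of the frontiers rather than with the number of distinct paths.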

Naïve Solutions
Pre-compute and store the transitive closure (reachability for all possible pairs of nodes)
• Then, answer any query in constant time: O(1)
• What are the space requirements for an n-node graph? O(n^2)
Online search (BFS/DFS)
• Answer each query with a single-source shortest path algorithm
• Minimal additional space required: O(n+m)
• What is the time complexity to answer a query? O(n+m)
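
The two extremes above can be sketched as follows, assuming an unlabeled directed graph stored as an adjacency dictionary. The function names and the toy graph are illustrative.

```python
from collections import deque


def descendants(adj, source):
    """BFS from source; returns every node reachable from it (including itself)."""
    seen = {source}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in adj.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def bfs_reachable(adj, u, v):
    """Online search: no index, O(n + m) time per query."""
    return v in descendants(adj, u)


def transitive_closure(adj):
    """Pre-computation: O(n^2) space in the worst case, constant-time lookups afterwards."""
    nodes = set(adj) | {w for targets in adj.values() for w in targets}
    return {s: descendants(adj, s) for s in nodes}


# Toy graph: 4 -> 1 -> 2 -> 3
adj = {1: [2], 2: [3], 3: [], 4: [1]}
closure = transitive_closure(adj)

print(bfs_reachable(adj, 1, 3), 3 in closure[1])
print(bfs_reachable(adj, 3, 1), 1 in closure[3])
```

The closure answers a query with one set-membership test but stores a set per node; the online search stores nothing extra but revisits the graph on every query.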

Challenge
Goal: find a compromise between time and space consumption when answering reachability queries.
Find a compact representation of the transitive closure:
• whose size is comparable to the data size
• that supports connection tests (almost) as fast as the naïve transitive closure lookup
• that can be built efficiently for large datasets

Related Work
Two main categories of approaches:
Using spanning structures (chains and trees)
• Path-tree (Jin et al. '08)
• Label-constraint reachability queries (Jin et al. '10)
Using the 2-hop strategy
• 2-hop labeling (Cohen et al. '02)
• Fast graph pattern matching (Wang et al. '08)

Shortcomings
• Not distance-aware.
• Constraints on label order are not respected.
• Constraints on node properties are not considered.
• Reach a bottleneck when graphs are large.

Our Approach
Evaluating access control reachability queries consists of three main steps:
1. Reachability backbone discovery
2. Two-hop index construction
3. Reachability query evaluation over the reachability backbone

Reachability backbone discovery
Remark:
• Multi-graph (with multiple labels) => a set of single-labeled graphs.
Determining a subset of nodes that covers two-hop paths:
• Shortest two-hop paths sampling.
• Determining the degree threshold.
[Figure: example graph with nodes a to q and backbone nodes w1 and w2.]
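
The remark about splitting a multi-graph can be illustrated with a short sketch, assuming edges are given as (source, label, target) triples. The function name `split_by_label` and the sample edges are hypothetical; the sampling and degree-threshold steps are not covered here.

```python
from collections import defaultdict


def split_by_label(edges):
    """Turn a multi-labeled edge list into one adjacency map per label."""
    graphs = defaultdict(lambda: defaultdict(set))
    for src, label, dst in edges:
        graphs[label][src].add(dst)
    return graphs


# Hypothetical social edges for illustration.
edges = [
    ("u", "friend", "w"),
    ("w", "friend", "v"),
    ("u", "colleague", "v"),
]
graphs = split_by_label(edges)

print(sorted(graphs))
print(dict(graphs["friend"]))
```

Each resulting graph carries a single label, so later steps can treat it as an ordinary unlabeled directed graph.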

Main Idea: 2-Hop Cover & 2-Hop Labeling
A 2-hop cover is a set of hops (u, v) such that every connected pair is covered by 2 hops.
For each node x, we maintain two sets of labels (which are simply lists of nodes): L_in(x) and L_out(x).
If L_out(u) ∩ L_in(v) ≠ ∅, then u can reach v.
[Figure: u reaches v through a center node w that belongs to both L_out(u) and L_in(v).]
(Cohen et al., SODA 2002)
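
The intersection test can be sketched as below. The labels are written by hand for a hypothetical chain u -> w -> v, under the assumed convention that every node appears in its own L_in and L_out; the names `l_out`, `l_in`, and `two_hop_reachable` are illustrative.

```python
def two_hop_reachable(l_out, l_in, u, v):
    """u reaches v if and only if L_out(u) and L_in(v) share at least one center."""
    return not l_out[u].isdisjoint(l_in[v])


# Hand-written labels for the chain u -> w -> v, with w as the shared center.
l_out = {"u": {"u", "w"}, "w": {"w"}, "v": {"v"}}
l_in = {"u": {"u"}, "w": {"w"}, "v": {"w", "v"}}

print(two_hop_reachable(l_out, l_in, "u", "v"))
print(two_hop_reachable(l_out, l_in, "v", "u"))
```

A query touches only the two label sets involved, so its cost depends on label size rather than on graph size.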

2-hop Covers
Goal:
• Find a cover that minimizes the number of centers w_i
The problem is NP-hard
• => Approximation is required
Two main ingredients of the 2-hop cover algorithm:
• Set cover algorithm.
• Densest subgraph algorithm.
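
As a rough illustration of the set cover ingredient, here is the classic greedy approximation applied to center selection. It is a simplification and not the algorithm from the talk: it assumes each candidate center covers all of its pairs at once, and it leaves out the densest subgraph step entirely. The chain a -> b -> c and all names are hypothetical.

```python
def greedy_set_cover(universe, candidates):
    """Repeatedly pick the candidate that covers the most still-uncovered elements."""
    uncovered = set(universe)
    chosen = []
    while uncovered:
        # Sorting the keys makes tie-breaking deterministic.
        best = max(sorted(candidates), key=lambda c: len(candidates[c] & uncovered))
        gain = candidates[best] & uncovered
        if not gain:
            raise ValueError("candidates do not cover the universe")
        chosen.append(best)
        uncovered -= gain
    return chosen


# Chain a -> b -> c: the connected pairs that must be covered.
pairs = {("a", "b"), ("a", "c"), ("b", "c")}
# Pairs (x, y) such that x reaches the center and the center reaches y.
covered_by = {
    "a": {("a", "b"), ("a", "c")},
    "b": {("a", "b"), ("a", "c"), ("b", "c")},
    "c": {("a", "c"), ("b", "c")},
}

print(greedy_set_cover(pairs, covered_by))
```

On this toy input a single center, b, covers every connected pair, which is the kind of compact cover the goal above asks for.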

Answering queries
Reachability computation via the reachability backbone
• Performing two local BFS searches to access the reachability backbone
• Reachability join test
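
One possible reading of these two steps is sketched below, under several assumptions that do not come from the slides: the local searches are bounded by a hypothetical `radius` parameter, the backbone is given as a set of nodes, and `backbone_reach` stands in for whatever index (for example a 2-hop labeling) answers reachability between backbone nodes. The answer is complete only if the backbone covers every path longer than the local radius.

```python
from collections import deque


def local_bfs(adj, start, backbone, radius):
    """Explore at most `radius` hops from start; return (visited, backbone nodes hit)."""
    seen = {start}
    hits = {start} if start in backbone else set()
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        if dist == radius:
            continue
        for nxt in adj.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                if nxt in backbone:
                    hits.add(nxt)
                queue.append((nxt, dist + 1))
    return seen, hits


def reverse(adj):
    """Reverse every edge so a BFS from v follows incoming edges."""
    rev = {}
    for src, targets in adj.items():
        for dst in targets:
            rev.setdefault(dst, set()).add(src)
    return rev


def reachable(adj, backbone, backbone_reach, u, v, radius=2):
    fwd_seen, fwd_hits = local_bfs(adj, u, backbone, radius)
    if v in fwd_seen:  # v is close enough to be found locally
        return True
    _, bwd_hits = local_bfs(reverse(adj), v, backbone, radius)
    # Join test: some backbone node near u reaches some backbone node near v.
    return any(backbone_reach(b1, b2) for b1 in fwd_hits for b2 in bwd_hits)


# Hypothetical path u -> x -> b1 -> m -> n -> b2 -> y -> v with backbone {b1, b2}.
adj = {"u": {"x"}, "x": {"b1"}, "b1": {"m"}, "m": {"n"},
       "n": {"b2"}, "b2": {"y"}, "y": {"v"}}
backbone = {"b1", "b2"}
backbone_pairs = {("b1", "b1"), ("b1", "b2"), ("b2", "b2")}


def backbone_reach(a, b):
    return (a, b) in backbone_pairs


print(reachable(adj, backbone, backbone_reach, "u", "v"))
print(reachable(adj, backbone, backbone_reach, "v", "u"))
```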