Fast Reliability Search in Uncertain Graphs


  1. Fast Reliability Search in Uncertain Graphs. Arijit Khan, Francesco Bonchi, Aristides Gionis, Francesco Gullo. Systems Group, ETH Zurich; Yahoo Labs, Spain; Aalto University, Finland.

  2. Uncertain Graphs. Examples: social network, traffic network, ad-hoc mobile network, protein-interaction network. [Figure: an uncertain graph over the nodes S, T, U, V, W with arc probabilities 0.1, 0.2, 0.3, 0.5, 0.5, 0.6, 0.6, 0.7.]

  3. Motivation.
     - Mobile ad-hoc network: find the set of sink nodes to which a source node can deliver a packet with high probability.
     - Traffic network: find a set of target locations reachable from a source location with high probability.
     - Social network: find a set of users who could be influenced with high probability by a target user.
     [Figure: packet delivery probability in a mobile ad-hoc network, shown on the example uncertain graph.]

  5. Reliability in Uncertain Graphs. [Figure: sampling the edges of the uncertain graph yields a certain graph (a possible world).]

  6. Reliability in Uncertain Graphs. [Figure: sampling the edges of the uncertain graph yields a certain graph (a possible world); annotation: Identity Function.]

  7. Reliability Search in Uncertain Graphs. Given an uncertain graph G, a probability threshold η ∈ (0, 1), and a source node S in G, find all nodes in G that are reachable from S with probability greater than or equal to η. Complexity: #P-complete.
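The query on this slide can be made concrete with a brute-force sketch that enumerates every possible world. This is only an illustration of the definition, not the method of the talk: it takes time exponential in the number of arcs, which is consistent with the problem being #P-complete. The toy graph is the four-arc example from the verification slide (S-U 0.5, U-V 0.2, S-W 0.7, W-V 0.3); treating the arcs as directed and independent is an assumption, and the function names are hypothetical.

```python
from itertools import product

# Toy uncertain graph (tail, head, probability), taken from the
# verification slide's example; arcs are assumed directed and independent.
EDGES = [("S", "U", 0.5), ("U", "V", 0.2), ("S", "W", 0.7), ("W", "V", 0.3)]


def reachable(source, arcs):
    """Return the set of nodes reachable from source in a certain graph."""
    adj = {}
    for u, v in arcs:
        adj.setdefault(u, []).append(v)
    seen, stack = {source}, [source]
    while stack:
        u = stack.pop()
        for v in adj.get(u, []):
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return seen


def exact_reliability(source, edges):
    """Sum the probability of every possible world in which a node is reachable."""
    rel = {}
    for keep in product([False, True], repeat=len(edges)):
        prob, world = 1.0, []
        for (u, v, p), present in zip(edges, keep):
            prob *= p if present else 1.0 - p
            if present:
                world.append((u, v))
        for node in reachable(source, world):
            rel[node] = rel.get(node, 0.0) + prob
    return rel


def reliability_search(source, edges, eta):
    """Nodes (other than the source) reachable with probability >= eta."""
    rel = exact_reliability(source, edges)
    return {n for n, r in rel.items() if n != source and r >= eta}
```

On this toy graph the two S-to-V paths share no arc, so the reliability of V is 1 − (1 − 0.10)(1 − 0.21) = 0.289, and a query with η = 0.6 returns only W.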

  8. Related Work. Two-terminal reliability; all-terminal reliability; K-terminal reliability; Monte Carlo (MC) sampling; distance-constraint reliability with RHT sampling (Jin et al., VLDB 2011).

  9. Baseline: MC Simulation + BFS. [Figure: MC sampling turns the uncertain graph into a certain graph (a possible world), and BFS is run on it from S; labels: MC Sampling + BFS, Number of Samples.]
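A minimal sketch of the baseline named on this slide, assuming independent directed arcs: draw a possible world by keeping each arc with its probability, run BFS from the source, and estimate each node's reliability as the fraction of samples in which it was reached. The toy graph is the four-arc example from the verification slide, and the function name and the `seed` parameter are hypothetical.

```python
import random
from collections import deque

# Toy uncertain graph (tail, head, probability) from the verification slide.
EDGES = [("S", "U", 0.5), ("U", "V", 0.2), ("S", "W", 0.7), ("W", "V", 0.3)]


def mc_reliability(source, edges, num_samples, seed=0):
    """Estimate reachability probabilities from source by MC sampling + BFS."""
    rng = random.Random(seed)
    hits = {}
    for _ in range(num_samples):
        # Step 1: sample a possible world (a certain graph).
        adj = {}
        for u, v, p in edges:
            if rng.random() < p:
                adj.setdefault(u, []).append(v)
        # Step 2: BFS from the source in the sampled world.
        seen, queue = {source}, deque([source])
        while queue:
            u = queue.popleft()
            for v in adj.get(u, []):
                if v not in seen:
                    seen.add(v)
                    queue.append(v)
        for node in seen:
            hits[node] = hits.get(node, 0) + 1
    # Step 3: the hit frequency is the reliability estimate.
    return {node: count / num_samples for node, count in hits.items()}


def mc_reliability_search(source, edges, eta, num_samples=1000, seed=0):
    """Nodes whose estimated reliability is at least eta."""
    est = mc_reliability(source, edges, num_samples, seed)
    return {n for n, r in est.items() if n != source and r >= eta}
```

Every sample touches the whole graph, so the cost grows with the number of samples times the graph size, which is the expense the index in the following slides tries to avoid.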

  10. Can We Be More Efficient? Given a source node S and a probability threshold η ∈ (0, 1), can we quickly determine the nodes that are certainly not reachable from S with probability greater than or equal to η? Approach: indexing (offline), then filtering + verification (online). [Figure: the example uncertain graph with η = 0.5.]

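  11. RQ-Tree Index (η = 0.5). [Figure: the example uncertain graph and its RQ-tree index. Clusters containing S, from root to leaf, with their upper bounds: {S, U, W, V, T} with U_out(S, *) = 0; {S, U, W} with U_out(S, *) = 0.496; {S, W} with U_out(S, *) = 0.8; {S} with U_out(S, *) = 0.8. Remaining clusters: {V, T}, {U}, {V}, {T}, {W}.]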

  12. RQ-Tree: Filtering. Max-flow/min-cut based upper bound: each arc $a$ gets the capacity $c(a) = -\log(1 - p(a))$; compute the max-flow $f$ from S to the outside of cluster C; then $U_{out}(S, C) = 1 - \exp(-f)$. [Figure: RQ-tree index for η = 0.5, with U_out(S, *) = 0 for {S, U, W, V, T}, 0.496 for {S, U, W}, 0.8 for {S, W}, and 0.8 for {S}.]

  13. RQ-Tree: Filtering. Max-flow/min-cut based upper bound: each arc $a$ gets the capacity $c(a) = -\log(1 - p(a))$; compute the max-flow $f$ from S to the outside of cluster C; then $U_{out}(S, C) = 1 - \exp(-f)$. [Figure: RQ-tree index for η = 0.5, with U_out(S, *) = 0 for {S, U, W, V, T}, 0.496 for {S, U, W}, 0.8 for {S, W}, and 0.8 for {S}.] Benefits:
     - No false negatives (recall = 1).
     - Computation is limited to the inside of cluster C.
     - Incremental max-flow computation.
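The bound on this slide follows from a cut argument: S can reach a node outside C only if at least one arc of some S-to-outside cut exists, and for independent arcs that probability is 1 − ∏(1 − p(a)) = 1 − exp(−Σ c(a)) over the cut, so the minimum cut (equal to the max-flow f) gives the tightest such bound. The sketch below computes it with a plain Edmonds-Karp max-flow; the choice of Edmonds-Karp, the super-sink construction, the function name, and the requirement that every arc probability is strictly below 1 are assumptions for illustration, and the incremental computation mentioned in the slide is not reproduced. The toy graph is the four-arc example from the verification slide, so the values differ from those in the slide's figure.

```python
import math
from collections import deque

# Toy uncertain graph (tail, head, probability) from the verification slide.
EDGES = [("S", "U", 0.5), ("U", "V", 0.2), ("S", "W", 0.7), ("W", "V", 0.3)]


def upper_bound_out(source, cluster, edges):
    """Upper bound on the probability that source reaches any node outside cluster."""
    sink = "__outside__"  # hypothetical super-sink merging all nodes outside C
    cap, adj = {}, {}

    def add_arc(u, v, c):
        cap[(u, v)] = cap.get((u, v), 0.0) + c
        cap.setdefault((v, u), 0.0)  # residual (reverse) arc
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    for u, v, p in edges:
        if u in cluster:  # only arcs starting inside C are needed
            add_arc(u, v if v in cluster else sink, -math.log(1.0 - p))

    flow = 0.0
    while True:
        # BFS for a shortest augmenting path in the residual graph.
        parent, queue = {source: None}, deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v in adj.get(u, ()):
                if v not in parent and cap[(u, v)] > 1e-12:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            break
        # Find the bottleneck capacity along the path, then augment.
        bottleneck, v = math.inf, sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, cap[(parent[v], v)])
            v = parent[v]
        v = sink
        while parent[v] is not None:
            cap[(parent[v], v)] -= bottleneck
            cap[(v, parent[v])] += bottleneck
            v = parent[v]
        flow += bottleneck
    return 1.0 - math.exp(-flow)
```

For C = {S, W} on the toy graph, the arcs leaving C are S-U (0.5) and W-V (0.3), so the bound is 1 − (1 − 0.5)(1 − 0.3) = 0.65, while the true probability of leaving C is 1 − 0.5 × (1 − 0.21) = 0.605. With η = 0.7 the bound is below the threshold, so no node outside {S, W} can qualify and the rest of the graph is pruned without any sampling.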

  14. RQ-Tree: Verification.
     - Sampling-based verification: MC sampling + BFS over the subgraph formed by the candidate set. Pros: high precision, high recall. Cons: verification could still be relatively expensive.
     - Lower-bound-based verification: most-likely path. Pros: precision = 1, high efficiency. Cons: lower recall.
     - Example (arcs S-U 0.5, U-V 0.2, S-W 0.7, W-V 0.3): Pr(S-U-V) = 0.5 × 0.2 = 0.10 and Pr(S-W-V) = 0.7 × 0.3 = 0.21, so the most-likely path is S-W-V.
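The most-likely path in the example above can be found with a shortest-path search: maximizing a product of arc probabilities is the same as minimizing the sum of −log p(a), so Dijkstra's algorithm applies. The sketch below is one way to do this under that standard transformation; the function names are hypothetical, and the slides do not say which shortest-path routine is actually used. Because a single path existing already implies reachability, the returned probability is a lower bound on the reliability, which is why accepting a node on this basis never produces a false positive.

```python
import heapq
import math

# The example from this slide: (tail, head, probability).
EDGES = [("S", "U", 0.5), ("U", "V", 0.2), ("S", "W", 0.7), ("W", "V", 0.3)]


def most_likely_paths(source, edges):
    """Dijkstra on weights -log(p); returns (path probabilities, parent links)."""
    adj = {}
    for u, v, p in edges:
        if p > 0.0:
            adj.setdefault(u, []).append((v, -math.log(p)))
    dist, parent = {source: 0.0}, {source: None}
    heap, done = [(0.0, source)], set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue  # stale heap entry
        done.add(u)
        for v, w in adj.get(u, []):
            nd = d + w
            if nd < dist.get(v, math.inf):
                dist[v], parent[v] = nd, u
                heapq.heappush(heap, (nd, v))
    return {n: math.exp(-d) for n, d in dist.items()}, parent


def path_to(node, parent):
    """Rebuild the most-likely path ending at node from the parent links."""
    path = []
    while node is not None:
        path.append(node)
        node = parent[node]
    return path[::-1]


def lower_bound_verify(source, edges, candidates, eta):
    """Accept candidates whose most-likely-path probability is at least eta."""
    probs, _ = most_likely_paths(source, edges)
    return {n for n in candidates if probs.get(n, 0.0) >= eta}
```

On the slide's example this recovers S-W-V with probability 0.21 for node V. The true reliability of V is 0.289, so with a threshold such as η = 0.25 the lower bound would wrongly reject V, which illustrates the lower recall listed under the cons.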

  15. RQ-Tree: Online Complexity. Methods compared: MC Sampling; Recursive Sampling [VLDB '11]; RQ-Tree + MC-Sampling-based Verification [Our Method]; RQ-Tree + Lower-Bound-based Verification [Our Method]. Complexity expressions as extracted (tilde symbols and the exponents 2 and d were displaced): O ( m n K ( m n )) O ( m n ) O ( K ( m n )) O ( n ). Legend:
     - K = number of samples
     - m = number of edges
     - n = number of nodes
     - ñ = number of nodes in the candidate set
     - m̃ = number of edges induced by the candidate nodes
     - d = diameter of the graph

  16. RQ-Tree Index Construction. Hierarchical clustering: minimum-cut balanced bi-partition using METIS. Edge weight: $w(a) = -\log(1 - p(a))$. [Figure: the example uncertain graph and the resulting RQ-tree index, with levels {S, U, W, V, T}; {S, U, W}, {V, T}; {S, W}, {U}, {V}, {T}; {S}, {W}.]

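  17. Experimental Results. Dataset characteristics:

     | Dataset | # Nodes   | # Edges    | Arc prob.: mean ± SD | Arc prob.: quartiles |
     |---------|-----------|------------|----------------------|----------------------|
     | DBLP    | 684 911   | 4 569 982  | 0.14 ± 0.11          | {0.09, 0.09, 0.18}   |
     | Flickr  | 78 322    | 20 343 018 | 0.09 ± 0.06          | {0.06, 0.07, 0.09}   |
     | BioMine | 1 008 201 | 13 445 048 | 0.27 ± 0.21          | {0.12, 0.22, 0.36}   |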

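  18. Accuracy Results.

     Precision:

     | Dataset | RQ-Tree-MC, η = 0.4 | RQ-Tree-MC, η = 0.6 | RQ-Tree-MC, η = 0.8 | RQ-Tree-LB, η = 0.4 | RQ-Tree-LB, η = 0.6 | RQ-Tree-LB, η = 0.8 |
     |---------|------|------|------|------|------|------|
     | DBLP    | 0.96 | 0.99 | 0.99 | 1    | 1    | 1    |
     | Flickr  | 0.97 | 0.98 | 0.98 | 1    | 1    | 1    |
     | BioMine | 0.95 | 0.96 | 0.97 | 1    | 1    | 1    |

     Recall:

     | Dataset | RQ-Tree-MC, η = 0.4 | RQ-Tree-MC, η = 0.6 | RQ-Tree-MC, η = 0.8 | RQ-Tree-LB, η = 0.4 | RQ-Tree-LB, η = 0.6 | RQ-Tree-LB, η = 0.8 |
     |---------|------|------|------|------|------|------|
     | DBLP    | 0.99 | 0.99 | 1.00 | 0.75 | 0.87 | 0.91 |
     | Flickr  | 0.98 | 0.99 | 0.99 | 0.76 | 0.79 | 0.83 |
     | BioMine | 0.97 | 0.98 | 0.98 | 0.77 | 0.81 | 0.85 |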

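  19. Efficiency Results. Online query-processing time (sec):

     | Dataset | RQ-Tree-MC, η = 0.4 | RQ-Tree-MC, η = 0.6 | RQ-Tree-MC, η = 0.8 | RQ-Tree-LB, η = 0.4 | RQ-Tree-LB, η = 0.6 | RQ-Tree-LB, η = 0.8 | MC, all η |
     |---------|------|------|------|------|------|------|--------|
     | DBLP    | 43   | 40   | 36   | 1.50 | 0.60 | 0.60 | 588    |
     | Flickr  | 60   | 59   | 55   | 0.21 | 0.20 | 0.17 | 114    |
     | BioMine | 6062 | 5417 | 4974 | 1.00 | 0.50 | 0.50 | 25 608 |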

  20. [Figures: Pruning Capacity of Filtering Phase; Precision of Filtering Phase.]

  21. RQ-Tree in Influence Maximization. RQ-tree index in multi-source reliability queries and in influence maximization. [Figures: Expected Spread (Last.FM); Top-k Seed Finding Time (Last.FM).]

  22. Conclusion. An indexing method for answering online reliability queries efficiently and effectively. The RQ-tree works very well with lower arc probabilities and with higher probability thresholds. In future work, we shall study reliability search queries when the arc probabilities are not independent.

  23. Questions?
