Path Query Routing in Unstructured Peer-to-Peer Networks


Path Query Routing in Unstructured Peer-to-Peer Networks, by Nicolas Bonnel, Gildas Ménier and Pierre-François Marteau (Laboratoire Valoria, Université de Bretagne Sud). Sections: Introduction, Related work, Content-based routing of path queries, Conclusion.


  1. Path Query Routing in Unstructured Peer-to-Peer Networks
     Nicolas Bonnel, Gildas Ménier, Pierre-François Marteau
     Laboratoire Valoria, Université de Bretagne Sud
     August 29, 2007

  2. Outline
     1. Introduction: Context, Overview
     2. Related work: P2P architecture, Bloom filters
     3. Content-based routing of path queries: Multi Level EDBF, Clustering, Preliminary results
     4. Conclusion

  3. Context
     Context:
     - Indexing a very large database
     - Semi-structured information (XML)
     - Need to index the structure of documents
     Queries the system needs to answer:
     - Exact queries: article/title
     - Approximate queries with 1 unknown element: article/?/paragraph
     - Approximate queries with 0 or more unknown elements: */paragraph
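The three query forms can be illustrated with a small matcher. This is only a sketch: the function name `matches` is hypothetical, and it assumes that a query must match a whole root-to-leaf path, which the slides do not state.

```python
from functools import lru_cache


def matches(query: str, path: str) -> bool:
    """Return True if the XML path satisfies the path query.

    '?' stands for exactly one unknown element,
    '*' stands for zero or more unknown elements.
    Assumption: the query is matched against the whole path.
    """
    q = [step.strip() for step in query.split("/")]
    p = path.split("/")

    @lru_cache(maxsize=None)
    def go(i: int, j: int) -> bool:
        if i == len(q):
            return j == len(p)
        if q[i] == "*":
            # Either '*' matches nothing, or it swallows one more element.
            return go(i + 1, j) or (j < len(p) and go(i, j + 1))
        if j == len(p):
            return False
        return (q[i] == "?" or q[i] == p[j]) and go(i + 1, j + 1)

    return go(0, 0)


print(matches("article/?/paragraph", "article/section/paragraph"))  # True
print(matches("*/paragraph", "article/title"))                      # False
```

A centralised matcher like this only defines what a correct answer is; the routing problem addressed in the talk is finding the peers that hold such paths.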

  4. Architecture overview
     Distributed XML database:
     - The system constrains the location and replication of data
     Peer-to-peer architecture:
     - Fault tolerance
     - Scalability
     Resource scavenging:
     - Allows more computers to be used, at a low cost
     - Example: SETI

  5. Outline: section 2, Related work (P2P architecture, Bloom filters)

  6. Structured P2P network
     Advantage:
     - Easy to retrieve rare items
     Limitations:
     - Approximate and range queries are very costly
     - Load balancing problems
     Examples: Chord, CAN, Tapestry, ...

  7. Unstructured P2P network
     Advantages:
     - Highly replicated items can be retrieved at a low cost
     - Data placement can be controlled
     Limitation:
     - Very costly to retrieve rare items
     Examples: Gnutella [Clip2, 2002], ...

  8. Bloom Filters [Bloom, 70]
     Definition:
     - $A$: array of $m$ bits
     - $h_i$, $0 \le i < k$: $k$ hash functions
     - insert(x): $\forall i : A[h_i(x)] = 1$
     - query(x): true if $\forall i : A[h_i(x)] = 1$
     False positives:
     - False positives are possible, but false negatives are not
     - Probability of a false positive with $n$ inserted elements: $\left(1 - \left(1 - \frac{1}{m}\right)^{kn}\right)^{k}$
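The definition above can be sketched in a few lines of Python. The slides do not say which hash functions are used, so deriving the k functions from salted SHA-256 digests is an assumption made for illustration, as are the class and function names.

```python
import hashlib


class BloomFilter:
    """Minimal Bloom filter: an array A of m bits and k hash functions."""

    def __init__(self, m: int, k: int):
        self.m, self.k = m, k
        self.bits = [0] * m

    def _positions(self, x: str):
        # Assumption: h_i(x) is taken from SHA-256 of "i:x".
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{x}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def insert(self, x: str) -> None:
        for pos in self._positions(x):
            self.bits[pos] = 1

    def query(self, x: str) -> bool:
        return all(self.bits[pos] for pos in self._positions(x))


def false_positive_probability(m: int, k: int, n: int) -> float:
    """(1 - (1 - 1/m)^(kn))^k for n inserted elements."""
    return (1 - (1 - 1 / m) ** (k * n)) ** k


bf = BloomFilter(m=8192, k=32)
bf.insert("article")
print(bf.query("article"))  # True: an inserted element is never missed
print(false_positive_probability(8192, 32, 100))
```

The values m = 8192 and k = 32 are the ones listed in the experiment settings; n = 100 is an arbitrary illustration.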

  9. Exponentially Decaying Bloom Filter [Kumar, 2005]
     EDBF:
     - $\theta(x) = |\{\, i : A[h_i(x)] = 1 \,\}|$
     - Can be used to encode stochastic routing tables
     - $\theta(x)/k$: probability of finding $x$ along a specific link
     - At $n$ hops from the element: $\theta(x)/k = 1/d^{n}$
     Update:
     - Copy the filter of each neighbour
     - Bits of the copy are set to 0 with probability $(1 - 1/d)$
     - OR the result with the local filter
     Example: propagation with a decay $d = 2$
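The update rule can be sketched as follows. This is an illustration under assumptions rather than the implementation from [Kumar, 2005]: the hash functions (salted SHA-256) and the names `theta`, `decayed_copy` and `update` are hypothetical.

```python
import hashlib
import random

M, K = 8192, 32  # filter size in bits and number of hash functions


def positions(x):
    """K bit positions for x (assumption: salted SHA-256)."""
    return [int.from_bytes(hashlib.sha256(f"{i}:{x}".encode()).digest()[:8], "big") % M
            for i in range(K)]


def insert(bits, x):
    for p in positions(x):
        bits[p] = 1


def theta(bits, x):
    """Number of hash functions whose bit is set; theta / K is the routing estimate."""
    return sum(bits[p] for p in positions(x))


def decayed_copy(bits, d, rng):
    """Copy a filter, resetting each bit to 0 with probability 1 - 1/d."""
    keep = 1.0 / d
    return [b if rng.random() < keep else 0 for b in bits]


def update(local, neighbour_filters, d, rng):
    """OR the local filter with a decayed copy of each neighbour's filter."""
    merged = list(local)
    for nb in neighbour_filters:
        for i, b in enumerate(decayed_copy(nb, d, rng)):
            merged[i] |= b
    return merged


rng = random.Random(0)
source = [0] * M
insert(source, "article")
one_hop = update([0] * M, [source], d=2, rng=rng)
print(theta(source, "article") / K, theta(one_hop, "article") / K)
```

With d = 2, roughly half of the bits of an element survive each hop, so theta(x) / K shrinks with the distance to the node that holds x.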

  10. Multi Level Bloom Filter [Koloniari, 2004]
      Breadth Bloom Filter:
      - Set of Bloom filters
      - Each element at level $i$ is stored in $BBF_i$
      Depth Bloom Filter:
      - Set of Bloom filters
      - Each XML path of length $i$ is stored in $DBF_i$
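A rough sketch of how the two multi-level structures might be filled from a list of root-to-leaf paths. The helper names, the salted SHA-256 hashing, and the choice to store every run of i consecutive elements in DBF_i are assumptions for illustration, not details taken from [Koloniari, 2004].

```python
import hashlib

M, K = 8192, 32  # assumed filter size in bits and number of hash functions


def positions(x):
    return [int.from_bytes(hashlib.sha256(f"{i}:{x}".encode()).digest()[:8], "big") % M
            for i in range(K)]


def insert(bits, x):
    for p in positions(x):
        bits[p] = 1


def query(bits, x):
    return all(bits[p] for p in positions(x))


def build_bbf(paths, levels):
    """BBF_i holds every element found at level i (index 0 is level 1)."""
    bbf = [[0] * M for _ in range(levels)]
    for path in paths:
        for level, element in enumerate(path.split("/")[:levels]):
            insert(bbf[level], element)
    return bbf


def build_dbf(paths, levels):
    """DBF_i holds every run of i consecutive elements (index 0 is length 1)."""
    dbf = [[0] * M for _ in range(levels)]
    for path in paths:
        elems = path.split("/")
        for length in range(1, levels + 1):
            for start in range(len(elems) - length + 1):
                insert(dbf[length - 1], "/".join(elems[start:start + length]))
    return dbf


paths = ["article/section/paragraph", "article/title"]
bbf = build_bbf(paths, 3)
dbf = build_dbf(paths, 3)
print(query(bbf[1], "title"))              # True: title occurs at level 2
print(query(dbf[1], "section/paragraph"))  # True: a stored path of length 2
```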

  11. Outline: section 3, Content-based routing of path queries (Multi Level EDBF, Clustering, Preliminary results)

  12. Multi Level EDBF
      Breadth EDBF:
      - Exponentially decaying version of BBF
      - Additional filter (RBBF) to store elements in reverse order
      Querying:
      - Product of probabilities
      - BBF can answer E/* queries
      - RBBF can answer */E queries
      - Both filters together can answer E/*/E queries
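One possible reading of "product of probabilities" is sketched below. The slides do not give the exact scoring rule, so everything here is an assumption: `link_score` is a hypothetical name, known elements before the first `*` are looked up in the BBF level matching their position, known elements after the last `*` are looked up in the RBBF level counted from the end, and anything between two `*` is ignored.

```python
import hashlib

M, K = 8192, 32  # assumed filter size in bits and number of hash functions


def positions(x):
    """K bit positions for x (assumption: salted SHA-256)."""
    return [int.from_bytes(hashlib.sha256(f"{i}:{x}".encode()).digest()[:8], "big") % M
            for i in range(K)]


def insert(bits, x):
    for p in positions(x):
        bits[p] = 1


def theta(bits, x):
    return sum(bits[p] for p in positions(x))


def link_score(query, bbf, rbbf):
    """Estimated probability that a link leads to a match for the query."""
    elems = [e.strip() for e in query.split("/")]
    stars = [i for i, e in enumerate(elems) if e == "*"]
    head = elems[:stars[0]] if stars else elems
    tail = elems[stars[-1] + 1:] if stars else []
    score = 1.0
    for level, e in enumerate(head):            # forward order: BBF
        if e != "?" and level < len(bbf):
            score *= theta(bbf[level], e) / K
    for level, e in enumerate(reversed(tail)):  # reverse order: RBBF
        if e != "?" and level < len(rbbf):
            score *= theta(rbbf[level], e) / K
    return score


bbf = [[0] * M for _ in range(3)]
rbbf = [[0] * M for _ in range(3)]
insert(bbf[0], "article")
insert(rbbf[0], "paragraph")
print(link_score("article/*/paragraph", bbf, rbbf))  # 1.0: both elements fully set
```

A router following this sketch would forward the query along the link with the highest score.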

  13. Data clustering
      Agent:
      - An agent carries an indexed XML path chosen at random
      - It moves randomly on the network
      - If it finds a better node, it moves the indexed XML path to this node
      - Comparison function: number of the XML path's elements found in the filter of the node and its neighbourhood
      Example:
      - $BBF_1$ contains A
      - $BBF_2$ contains B
      - $RBBF_1$ contains C
      - $RBBF_2$ contains B
      - The path A/B/C has a score of 4
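The comparison function and one agent move can be sketched as below. To keep the example deterministic, plain Python sets stand in for the filters; the names `path_score` and `agent_step` and the data layout are hypothetical.

```python
import random


def path_score(path, bbf, rbbf):
    """Count path elements found at their level in BBF (forward) and RBBF (reverse)."""
    elems = path.split("/")
    forward = sum(1 for level, e in zip(bbf, elems) if e in level)
    backward = sum(1 for level, e in zip(rbbf, reversed(elems)) if e in level)
    return forward + backward


def agent_step(path, current, neighbours, filters, rng):
    """Visit one random neighbour; return the node that should hold the path."""
    candidate = rng.choice(neighbours[current])
    if path_score(path, *filters[candidate]) > path_score(path, *filters[current]):
        return candidate  # better node: the indexed path is moved there
    return current


# Slide example: BBF_1 = {A}, BBF_2 = {B}, RBBF_1 = {C}, RBBF_2 = {B}
bbf = [{"A"}, {"B"}, set()]
rbbf = [{"C"}, {"B"}, set()]
print(path_score("A/B/C", bbf, rbbf))  # 4

empty = ([set(), set(), set()], [set(), set(), set()])
filters = {"n1": empty, "n2": (bbf, rbbf)}
neighbours = {"n1": ["n2"], "n2": ["n1"]}
print(agent_step("A/B/C", "n1", neighbours, filters, random.Random(0)))  # n2
```

Moving paths towards nodes whose filters already contain their elements groups similar paths together, which is the clustering effect the slide describes.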

  14. Experiment settings
      Network topology:
      - 200 nodes
      - Random graph
      - Node degree between 3 and 8
      Settings:
      - 260 000 XML documents from Wikipedia (1.5 GByte)
      - Filter size: 8192 ($2^{13}$), 3 filters in each set ($BBF_1$, $BBF_2$, $BBF_3$, $RBBF_1$, $RBBF_2$, $RBBF_3$)
      - Number of hash functions: 32
      - 1000 queries generated at random

  15. Preliminary results
      [Figure: filter occupation (%), paths indexed per node, and paths moved per node in 1 h, plotted against elapsed time (h).]

  16. Preliminary results: 2-element queries
      [Figure: queries answered (%) against hop count limit, for SQR and RW (random walk), with no unknown element and with 1 unknown element.]
      - No unknown element: article/title, section/paragraph, ...
      - 1 unknown element: ?/abstract, article/?, ...

  17. Preliminary results: 3-element queries
      [Figure: queries answered (%) against hop count limit, for SQR and RW, with no unknown element and with 1 unknown element.]
      - No unknown element: article/section/paragraph, ...
      - 1 unknown element: article/?/paragraph, ?/section/paragraph, ...

  18. Preliminary results: ancestor-descendant queries
      [Figure: queries answered (%) against hop count limit, for SQR and RW.]
      - Example queries: article/*/paragraph, article/*/abstract

  19. Outline: section 4, Conclusion

  20. Conclusion
      Contribution:
      - Routing of approximate XML path queries
      - Data clustering of path indexes
      Experiments:
      - Better routing performance than random walk
      - Good performance with rare elements
      Future work:
      - Larger network
      - Information replication
      - Taking element attributes into account

  21. Acknowledgements
      This research was supported by Region Bretagne.

  22. References
      - Burton H. Bloom. Space/time trade-offs in hash coding with allowable errors. Communications of the ACM, 13(7):422–426, 1970.
      - Clip2. The Gnutella protocol specification v0.4, 2002.
      - G. Koloniari and E. Pitoura. Content-based routing of path queries in peer-to-peer systems. In Proceedings of the EDBT'04 International Conference, Heraklion, Crete, Greece, 2004.
      - Abhishek Kumar, Jun Xu, and Ellen W. Zegura. Efficient and scalable query routing for unstructured peer-to-peer networks. In Proc. of IEEE Infocom, 2005.
