
Content Based Architectures for Networking, Aaditeshwar Seth



  1. Content Based Architectures for Networking. Aaditeshwar Seth, Department of Computer Science, IIT Delhi. Joint work with A. Ruhela, R. Tripathy, A. Mahla, D. Martin, I. Ahuja, Q. Niyaz, A. Dubey, S. Brahmi, A. Subramaniam, Z. Koradia, A. Singh and A. Mahanti, S. Ardon, S. Triukose, H. Saran, A. Bagchi. November 2011

  2. Gen 1: Phone numbers carry path information

  3. Gen 2: Endpoints have addresses, nodes switch packets [Baran, 1964]

  4. Today

  5. Internet ~ Content transfer [Cisco, 2009; Moon et al., IMC 2007]

  6. Content delivery networks, flattening Internet [Internet Atlas, NANOG, 2009]

  7. Content sharing via social networking websites [Pew Internet, 2008]

  8. Gen 3: Semantic content based networks
     • Users care about content, not where it is available
     • Treat content objects as first class entities in the network
     • Push/pull content objects
     • Content lookup servers
     • Routers can cache content
     • Semantic cache replacement, pre-fetching policies
     • Utilize content metadata
     • Utilize OSN signals about content metadata
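
A minimal sketch of the "routers can cache content" idea, assuming requests carry a content name rather than a host address. The class `ContentRouter`, the `upstream` callback and the example names are hypothetical and not taken from the slides; eviction here is plain least-recently-used, standing in for the semantic policies the slide mentions.

```python
from collections import OrderedDict
from typing import Callable, Optional


class ContentRouter:
    """Toy router that resolves requests by content name and caches objects."""

    def __init__(self, capacity: int, upstream: Callable[[str], Optional[bytes]]):
        self.capacity = capacity      # maximum number of cached objects
        self.upstream = upstream      # hypothetical next hop or lookup server
        self.cache: "OrderedDict[str, bytes]" = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get(self, name: str) -> Optional[bytes]:
        """Return the object called `name`, serving from cache when possible."""
        if name in self.cache:
            self.hits += 1
            self.cache.move_to_end(name)        # mark as most recently used
            return self.cache[name]
        self.misses += 1
        data = self.upstream(name)              # pull from the next hop
        if data is not None:
            self.cache[name] = data
            if len(self.cache) > self.capacity:
                self.cache.popitem(last=False)  # evict least recently used
        return data


# Made-up origin store, used only for illustration.
origin = {"/videos/paddy-sowing": b"video-bytes", "/news/block-12": b"news-bytes"}
router = ContentRouter(capacity=1, upstream=origin.get)
router.get("/videos/paddy-sowing")   # miss, fetched from upstream
router.get("/videos/paddy-sowing")   # hit, served from the router cache
```

The requester never names a server: any router holding a copy can answer, which is what makes in-network caching possible.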

  9. OSN aided content distribution

  10. Network architectures
     a. CDN guided by data from online social networking websites
     b. P2P gossip on social network overlay

  11. Dataset
     • 7M users
     • 196M tweets
     • Duration: June 11, 2009 to Sept 1, 2009
     • OpenCalais to identify tweet topics
     • 6M topics, reduced to 0.9M topics having at least 15 users
     • Sampled 4K topics for detailed analysis
     • Yahoo geocoding API to identify user locations
     • 4M users with locations
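
The reduction from 6M to 0.9M topics can be pictured as a filter on distinct users per topic. The sketch below assumes the topic tagger emits (user, topic) pairs; the function name and the sample data are made up for illustration and do not reflect the actual pipeline.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple


def topics_with_min_users(
    tagged: Iterable[Tuple[str, str]], min_users: int = 15
) -> Dict[str, Set[str]]:
    """Map each topic to its distinct users, keeping topics with >= min_users."""
    users_by_topic: Dict[str, Set[str]] = defaultdict(set)
    for user_id, topic in tagged:
        users_by_topic[topic].add(user_id)   # repeat tweets count once per user
    return {t: u for t, u in users_by_topic.items() if len(u) >= min_users}


# Tiny made-up input; the threshold is lowered to 2 so the effect is visible.
sample = [("u1", "monsoon"), ("u2", "monsoon"), ("u1", "monsoon"), ("u3", "cricket")]
kept = topics_with_min_users(sample, min_users=2)
```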

  12. Topic spread across geographies
     • Can use traffic spikes in originating region to predict spikes in other regions
     • LDA for topic identification, CF and follower-count for country similarity

  13. Popular topics have a large spread, unpopular topics confined to few countries
     • High degree of spatial locality can be useful for content placement and caching
     • Explore at city/region level too

  14. Does initiator popularity predict topic popularity?

  15. Tracking giant component growth can help
     • Dominant giant component in popular topics, not as dominant in less popular topics
     • But growth of the giant component always seems to coincide with popularity growth. Methods to track giant component growth dynamically?
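
One possible way to track the giant component dynamically is an incremental union-find structure, sketched below. It assumes the graph is built over users who have mentioned a topic, with an edge added whenever two such users turn out to be connected in the follower graph; the slides do not spell out this construction, and the class and method names are hypothetical. The sketch handles insertions only, not users or edges leaving the graph.

```python
from typing import Dict, Hashable


class GiantComponentTracker:
    """Union-find that keeps the size of the largest component up to date."""

    def __init__(self) -> None:
        self.parent: Dict[Hashable, Hashable] = {}
        self.size: Dict[Hashable, int] = {}
        self.largest = 0

    def _find(self, x: Hashable) -> Hashable:
        if x not in self.parent:             # first time this user is seen
            self.parent[x] = x
            self.size[x] = 1
            self.largest = max(self.largest, 1)
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        while self.parent[x] != root:        # path compression
            self.parent[x], x = root, self.parent[x]
        return root

    def add_edge(self, u: Hashable, v: Hashable) -> int:
        """Connect u and v; return the current largest component size."""
        ru, rv = self._find(u), self._find(v)
        if ru != rv:
            if self.size[ru] < self.size[rv]:
                ru, rv = rv, ru              # attach the smaller tree to the larger
            self.parent[rv] = ru
            self.size[ru] += self.size[rv]
            self.largest = max(self.largest, self.size[ru])
        return self.largest

    def giant_fraction(self) -> float:
        """Share of all seen users that sit in the largest component."""
        return self.largest / len(self.parent) if self.parent else 0.0


tracker = GiantComponentTracker()
edges = [("a", "b"), ("c", "d"), ("b", "c"), ("e", "f")]
growth = [tracker.add_edge(u, v) for u, v in edges]   # [2, 2, 4, 4]
```

Plotting `growth` or `giant_fraction()` against time next to tweet volume would show whether the two curves rise together for a given topic.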

  16. Other interesting observations
     • Periodic topics
     • Ephemeral vs. stable
     • Sharp/slow growth and decay

  17. Next steps
     • Online event detection algorithms
     • Predictors for geographic spread of topics
     • Simulations to evaluate CDN vs. P2P content distribution architectures
     • Cache replacement policies
     • Pre-fetching
     • Centralized and distributed algorithms

  18. Content based networks for rural areas

  19. Community media in rural areas
     • Variety of mechanisms
     • Community radio
     • Community video
     • Wall newspapers
     • …

  20. Content volumes
     • Digital Green: 1500+ videos (5 states)
     • Community radio: 5GB new content per month
     • Rural news: 40,000+ calls per month per state

  21. Ideas and awareness for creating relevant programs
     Topic of the month
     • Employment
     • Right to Food
     • Water and sanitation
     • Maternal and child health
     Produce impactful programs
     • Civic activism
     • Political change

  22. Social networking and content sharing

  23. Digital Green dataset analysis

  24. A content distribution network for rural areas
     Constraints
     • Application use-cases: publish-subscribe, broadcast, multicast, browsing and content download
     • Local content production and consumption. Metadata can reveal access patterns
     • Applications are tolerant of delays
     • 2G coverage is not sufficient for large content transfers, but is ubiquitously available now
     Design principles
     • Content-based network. Content objects are first class entities; routers can cache content, examine metadata
     • Content transfer capabilities to/from local rendezvous points in villages
     • Delay tolerant data transfer
     • Always-on content channel for route initializations and content download/upload requests

  25. Network stack

  26. Simulation analysis
     • Topology layout
       • Block-block, block-district roads
       • Villages clustered around blocks
       • Village-village, village-block
     • Movement schedules
       • Village-block by ad hoc means of transport, once a day
       • Block-block, block to district, by bus, a few times a day
     • Algorithms
       • Unicast with caching, multicast, multicast with pre-fetching, optimal multicast
       • Cache replacement: LRU, seasonal preference

  27. Download requirements at gateway
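
The two cache replacement options can be contrasted in a few lines. This is a sketch under assumptions: "seasonal preference" is read here as "evict items whose relevance period does not include the current month first, then fall back to LRU", which the slides do not define precisely. The class name, video names and sizes are hypothetical.

```python
from collections import OrderedDict
from typing import List, Set, Tuple


class SeasonalLRUCache:
    """LRU cache that can evict out-of-season items before in-season ones."""

    def __init__(self, capacity_mb: int, seasonal: bool = True) -> None:
        self.capacity_mb = capacity_mb
        self.seasonal = seasonal
        # name -> (size in MB, months in which the item is relevant)
        self.items: "OrderedDict[str, Tuple[int, Set[int]]]" = OrderedDict()
        self.used_mb = 0

    def _victim(self, month: int) -> str:
        if self.seasonal:
            for name, (_, months) in self.items.items():   # oldest first
                if month not in months:
                    return name          # least recent out-of-season item
        return next(iter(self.items))    # plain LRU fallback

    def access(self, name: str, size_mb: int, months: Set[int], month: int) -> bool:
        """Record an access in `month`; return True on a cache hit."""
        if name in self.items:
            self.items.move_to_end(name)
            return True
        if size_mb > self.capacity_mb:
            return False                 # too large to cache at all
        while self.used_mb + size_mb > self.capacity_mb:
            victim = self._victim(month)
            self.used_mb -= self.items.pop(victim)[0]
        self.items[name] = (size_mb, months)
        self.used_mb += size_mb
        return False


def run(seasonal: bool) -> List[str]:
    """Replay three made-up requests in June and report what stays cached."""
    cache = SeasonalLRUCache(capacity_mb=200, seasonal=seasonal)
    cache.access("kharif-sowing", 100, {6, 7}, month=6)
    cache.access("rabi-irrigation", 100, {11, 12}, month=6)
    cache.access("vermicompost", 100, set(range(1, 13)), month=6)
    return list(cache.items)
```

With `seasonal=True` the out-of-season item is dropped and the in-season one survives; with `seasonal=False` the oldest item goes regardless of season.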

  28. Effects of network topology Short circuiting across villages helps in mesh-like topologies

  29. Effects of consumption patterns
     • Not much improvement with seasonal preference according to indicated relevance periods
       • DG screens videos throughout the year to sustain community interest
     • Not much improvement with cache sizes beyond 1GB
       • DG makes rounds of villages screening the same set of videos, then moves on to other videos
     • Next steps
       • More rigorous analysis of cache occupancy
       • Dataset and topology modeling to design generic policies
       • Small-scale field deployment

  30. Application framework for mobile devices with flaky Internet connections

  31. Mobile traffic [Cisco, 2011]

  32. Telcos are already putting caching proxies in their access networks [Figure label: App server]

  33. Offline application development
     • Applications run offline from a local cache
     • Key-value get/put API to data-store
     • Data-store synchronization provided by the middleware itself
     • Optimized transport layer
     • Control-data separation
     • Other features
       • Data summarization
       • Namespace subscriptions
       • Security & access control
       • Transactions
       • Consistency
     [Figure label: Middleware]
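
A rough sketch of what a key-value get/put API backed by a local cache might look like, with writes queued until a connection is available. This is not the framework's actual API: `OfflineStore`, `sync` and the `push` callback are hypothetical names, and conflict handling, transactions and access control are left out.

```python
from typing import Any, Callable, Dict, List, Tuple


class OfflineStore:
    """Local key-value cache that queues writes until they can be pushed."""

    def __init__(self) -> None:
        self.local: Dict[str, Any] = {}
        self.pending: List[Tuple[str, Any]] = []   # writes not yet pushed

    def get(self, key: str, default: Any = None) -> Any:
        return self.local.get(key, default)        # always served locally

    def put(self, key: str, value: Any) -> None:
        self.local[key] = value                    # visible to the app at once
        self.pending.append((key, value))          # synchronized later

    def sync(self, push: Callable[[str, Any], bool]) -> int:
        """Push queued writes in order; stop at the first failure."""
        sent = 0
        while self.pending:
            key, value = self.pending[0]
            if not push(key, value):               # link dropped, retry later
                break
            self.pending.pop(0)
            sent += 1
        return sent


store = OfflineStore()
store.put("survey/village-7", {"attendance": 31})  # works with no connectivity

server: Dict[str, Any] = {}                        # stand-in for the remote store


def push(key: str, value: Any) -> bool:
    server[key] = value
    return True


sent = store.sync(push)                            # called when a link comes up
```

The application only ever calls `get` and `put`; deciding when to call `sync` is the part the slide assigns to the middleware.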

  34. Evidence of traffic shaping in cellular data networks? [Figure: download on GPRS]

  35. Or, aggregate slot allocation on uplink? ACK bunching in the server trace; the client trace is clean, however. [Figure: download on GPRS]

  36. Non-uniform latencies on uplink [Figure: upload on GPRS; latency labels 300 ms, 800 ms, 200 ms, 700 ms, 150 ms]

  37. Next steps
     • Model traffic shaping and scheduling policies used in different cellular data networks
     • Optimize TCP for these conditions
     • Release application development framework for Android
     • Collect user data on WiFi mobility and content access patterns to determine delivery latencies and usability insights

  38. Key messages
     • Content based network architectures can improve performance in today’s Internet usage context
       • Semantic metadata
       • Social networking websites
     • Challenges present themselves at different layers
       • Architecture appropriateness
       • Prediction algorithms for pre-fetching
       • Tracking algorithms for event detection
       • Application development framework
       • Optimized transport layers
     Thanks for listening!


  40. A topic spreads to countries where its users have followers. LDA for topic identification; CF and follower-count for recommendation on country similarity
