
Information Centric Networking (ICN) for Delivering Big Data with Persistent Identifiers (PID)



  1. Information Centric Networking (ICN) for Delivering Big Data with Persistent Identifiers (PID). Andreas Karakannas. Research Project 2. Supervised by: Zhiming Zhao

  2. Background
     PIDs in IP Network (figure, steps 1-4): a user at a web browser requests http://www.resolver.org/<PID>, and www.resolver.org resolves the PID to a URL.
     • PID: ark:12345/CIA/DNS_1.pdf
     • URL: https://www.os3.nl/_media/2013-2014/courses/cia/dns_1.pdf
     Information Centric Networking
     • A new network concept, based on the idea that users are interested in accessing Digital Objects regardless of their locations.
     • No end-to-end communication.
     • Digital Objects are uniquely identified.
     • Requests for Objects are routed based on the Digital Object's unique name (NO IP ROUTING!!!).
     • Objects are cached in the path from source to destination (In-Network Caching).
     • In-Network Caching aims to achieve efficient & reliable distribution of the contents among the network infrastructure.

  3. Research Questions
     • How can PID types be mapped/resolved to ICNs' Object Identifiers?
     • What is the efficiency of ICNs' caching algorithms for delivering Big Data?

  4. Approach
     • Theoretical studies on the latest ICN projects and PID standards.
     • Propose a Mapping Architecture Design based on the theoretical study.
     • Evaluate In-Network Caching performance for Big Data Objects.

  5. ICN Approaches (figure; source: A Survey of Information-Centric Networking Research)

  6. ICN Approaches (figure; source: A Survey of Information-Centric Networking Research)

  7. Named Data Networking (NDN)
     • The most mature ICN approach.
     • The only approach with a published specification (Packet Format 0.1a2, published on March 27, 2014).
     • Most research on caching algorithms in ICN is based on NDN.
     • The only one with available open source simulators (ndnSIM, ccnSIM) for evaluating caching performance under different scenarios.

  8. Named Data Networking (NDN)
     • Names in NDN
       ◦ Based on URI syntax.
       ◦ Have a hierarchical structure (e.g. /NL/Amsterdam/UVA/ComputerScience/OS3/CIA/DNS.pdf).
       ◦ Names can be anything: a pdf file, a video, an endpoint, a command to turn on some lights.
       ◦ Names are used in the routing procedure.
     • 2 types of packets
       ◦ INTEREST (request) packets contain the Name of the Request, e.g. INTEREST(/NL/Amsterdam/UVA/ComputerScience/OS3/CIA/DNS.pdf)
       ◦ DATA (answer) packets contain the Name of the Request & the Data, e.g. DATA(/NL/Amsterdam/UVA/ComputerScience/OS3/CIA/DNS.pdf, <DATA>)
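The naming scheme and the two packet types can be illustrated with a minimal Python sketch. The class and function names (`Interest`, `Data`, `name_components`, `is_prefix`) are hypothetical and chosen for illustration; they are not part of any NDN library or of the packet format specification.

```python
from dataclasses import dataclass


def name_components(name: str) -> list[str]:
    """Split a hierarchical, URI-like name into its components."""
    return [c for c in name.split("/") if c]


def is_prefix(prefix: str, name: str) -> bool:
    """True if `prefix` matches `name` on whole components."""
    p, n = name_components(prefix), name_components(name)
    return n[:len(p)] == p


@dataclass(frozen=True)
class Interest:
    """Request packet: carries only the requested name."""
    name: str


@dataclass(frozen=True)
class Data:
    """Answer packet: carries the requested name and the content."""
    name: str
    content: bytes


name = "/NL/Amsterdam/UVA/ComputerScience/OS3/CIA/DNS.pdf"
interest = Interest(name)
data = Data(interest.name, b"<DATA>")
```

Matching on whole components is what makes `/NL/Amsterdam` a prefix of the example name while `/NL/Ams` is not.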

  9. Named Data Networking (NDN): Populating the Name Prefix (figure)

  10. Named Data Networking (NDN): Populating the Name Prefix (figure)

  11. Named Data Networking (NDN): Populating the Name Prefix (figure)

  12. Named Data Networking (NDN): Populating the Name Prefix (figure)

  13. Named Data Networking (NDN): Routing the INTEREST packet (figure)

  14. Named Data Networking (NDN): Routing the INTEREST packet (figure)

  15. Named Data Networking (NDN): Routing the INTEREST packet (figure)

  16. Named Data Networking (NDN): Routing the DATA packet (figure)

  17. Named Data Networking (NDN): Routing the DATA packet (figure)

  18. Named Data Networking (NDN): Routing the DATA packet (figure)

  19. Named Data Networking (NDN): Cache HIT (figure)

  20. Named Data Networking (NDN): Cache HIT (figure)
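The figure slides on prefix population, INTEREST routing, DATA routing and cache hits can be summarised in a simplified Python sketch of one content router. It assumes the usual NDN tables, which the slide text does not spell out: a FIB (name prefix to next hop), a PIT (pending Interests) and a Content Store. The class `ContentRouter`, its methods and the face labels are hypothetical, and the sketch caches every Data packet it forwards.

```python
class ContentRouter:
    """Toy NDN-style router: Content Store, PIT and FIB as plain dicts."""

    def __init__(self, fib):
        self.fib = fib   # name prefix -> next-hop face (filled by prefix announcements)
        self.pit = {}    # name -> set of faces waiting for the Data
        self.cs = {}     # name -> cached content

    def on_interest(self, name, in_face):
        if name in self.cs:                      # cache hit: answer locally
            return ("data", in_face, self.cs[name])
        if name in self.pit:                     # same name already requested upstream
            self.pit[name].add(in_face)
            return ("aggregated", None, None)
        out_face = self._longest_prefix_match(name)
        if out_face is None:                     # no route for this name
            return ("drop", None, None)
        self.pit[name] = {in_face}
        return ("forward", out_face, None)

    def on_data(self, name, content):
        """Send Data back along the reverse path recorded in the PIT."""
        faces = self.pit.pop(name, set())
        if faces:                                # only solicited Data is cached
            self.cs[name] = content
        return sorted(faces)

    def _longest_prefix_match(self, name):
        parts = [c for c in name.split("/") if c]
        for k in range(len(parts), 0, -1):
            prefix = "/" + "/".join(parts[:k])
            if prefix in self.fib:
                return self.fib[prefix]
        return None


router = ContentRouter({"/NL/Amsterdam": "face-up"})
wanted = "/NL/Amsterdam/UVA/ComputerScience/OS3/CIA/DNS.pdf"
first = router.on_interest(wanted, "client-a")    # forwarded upstream
second = router.on_interest(wanted, "client-b")   # aggregated in the PIT
returned_to = router.on_data(wanted, b"<DATA>")   # Data follows the reverse path
third = router.on_interest(wanted, "client-c")    # served from the Content Store
```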

  21. Persistent Identifiers (PIDs)
     • A name with a specific syntax that uniquely identifies an object for a long-lasting period, regardless of its location and lifespan.
     • Different PID types are available for naming digital objects.
     • Each PID has three parts:
       ◦ PID Type: a unique identifier of the PID type (e.g. urn:, ark:).
       ◦ Authority: a unique identifier of the Authority (e.g. isbn, ietf). Further delegation to sub-Authorities is possible.
       ◦ Name of Digital Object: a unique identifier of the Digital Object (e.g. 0-7645-2641-3).
     • Example: urn:isbn:0-7645-2641-3
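The three-part structure can be made concrete with a short Python sketch that splits a PID into type, authority and object name. The function `split_pid` is hypothetical and deliberately simplified: it assumes the authority ends at the first `:` or `/` after the type prefix, which fits the URN and ARK examples in these slides but not every PID syntax (URLs, for instance).

```python
def split_pid(pid: str) -> tuple[str, str, str]:
    """Split a PID into (pid_type, authority, object_name).

    Simplifying assumption: the authority ends at the first ':' or '/'
    that follows the '<type>:' prefix.
    """
    pid_type, sep, rest = pid.partition(":")
    if not sep or not rest:
        raise ValueError(f"not a '<type>:<rest>' PID: {pid!r}")
    for i, ch in enumerate(rest):
        if ch in ":/":
            return pid_type, rest[:i], rest[i + 1:]
    raise ValueError(f"no authority/name separator in: {pid!r}")


urn_parts = split_pid("urn:isbn:0-7645-2641-3")
ark_parts = split_pid("ark:12345/CIA/DNS_1.pdf")
```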

  22. Persistent Identifiers (PIDs): Most well-known PID Types
     Each entry lists PID Standard: PID Type Identifier, Authority, Name.
     • URL: url:, <protocol><host>:<port>, [/<path>[?<searchpart>]]
     • URN: urn:, <NID>:, <NSS>
     • ARK: ark:, <NAAN>, /<Name>[<Qualifier>]
     • HANDLE: handle:, <Handle Naming Authority>, /<Handle Local Name>
     • PURL: purl:, <protocol><resolver address>, /<name>
     • DOI: doi:, 10.<Naming Authority>, /<doi name syntax>

  23. Mapping Architecture Design Goals
     • Generic Mapping
     • Extensible Architecture Design
     • Scalable
     • Easy to Implement, Manage & Administrate

  24. Mapping Architecture: Name-Space Implementation
     • Root PID Layer: the Root PID Server, reachable under <Root PID NDN Name>.
     • PID Type Layer: one server per PID type (URN, Handle, Doi, Ark, ...).
     • Authority PID Layer: one server per Authority (e.g. IETF, ISBN, 12345, 56789, ...). Further delegation is possible.
     • Each Authority PID Server holds a table mapping PID to NDN-Name, e.g. urn:isbn:0-7645-2641-3 maps to /UvA/NaturalScience/CS/CIA/DNS.pdf.

  25. Iterative Resolution of PIDs to NDN names
     Participants: the Client's User Interface, the PID Resolver (<PID Resolver NDN Name>), the Root PID Server (<Root PID Server NDN Name>), the PID Type Server (<PID Type Server NDN Name>), the Authority PID Server (<Authority PID Server NDN Name>) and the NDN Content Routers.
     1. Client to PID Resolver: INTEREST(<PID Resolver NDN Name><PID>)
     2. PID Resolver to Root PID Server: INTEREST(<Root PID Server NDN Name><PID>)
     3. Root PID Server to PID Resolver: DATA(<Root PID Server NDN Name><PID>,<Answer>)
     4. PID Resolver to PID Type Server: INTEREST(<PID Type Server NDN Name><PID>)
     5. PID Type Server to PID Resolver: DATA(<PID Type Server NDN Name><PID>,<Answer>)
     6. PID Resolver to Authority PID Server: INTEREST(<Authority PID Server NDN Name><PID>)
     7. Authority PID Server to PID Resolver: DATA(<Authority PID Server NDN Name><PID>,<Answer>)
     8. PID Resolver to Client: DATA(<PID Resolver NDN Name><PID>,<Answer>)
     9. Client to NDN, via the Content Routers: INTEREST(<PID's NDN Name>)
     10. NDN to Client, via the Content Routers: DATA(<PID's NDN Name>,<Data>)
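The resolver's part of this exchange (steps 2 to 7) can be sketched in Python as three successive lookups, each answer naming the next server to ask. Everything here is an illustrative assumption: the server names under `/pid-root`, the in-memory `SERVERS` table that stands in for the INTEREST/DATA exchanges, and the URN-style `type:authority:name` split.

```python
ROOT = "/pid-root"  # hypothetical NDN name of the Root PID Server


def pid_type(pid):
    return pid.split(":", 2)[0]


def authority(pid):
    return pid.split(":", 2)[1]


def whole_pid(pid):
    return pid


# Each server: (which part of the PID it looks at, its answer table).
SERVERS = {
    ROOT: (pid_type, {"urn": "/pid-root/urn"}),
    "/pid-root/urn": (authority, {"isbn": "/pid-root/urn/isbn"}),
    "/pid-root/urn/isbn": (
        whole_pid,
        {"urn:isbn:0-7645-2641-3": "/UvA/NaturalScience/CS/CIA/DNS.pdf"},
    ),
}


def ask(server_name, pid):
    """Stand-in for INTEREST(<server name><PID>) followed by DATA(..., <Answer>)."""
    key_of, table = SERVERS[server_name]
    return table[key_of(pid)]


def resolve(pid):
    """Iteratively resolve a PID to its NDN name, recording the servers asked."""
    asked = [ROOT]
    type_server = ask(ROOT, pid)                 # steps 2-3
    asked.append(type_server)
    authority_server = ask(type_server, pid)     # steps 4-5
    asked.append(authority_server)
    ndn_name = ask(authority_server, pid)        # steps 6-7
    return ndn_name, asked


ndn_name, asked = resolve("urn:isbn:0-7645-2641-3")
```

Because each layer only knows the names of the layer below it, a new PID type or authority can be added by registering one more table entry, which is in line with the extensibility goal of the design.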

  26. Caching Strategies
     • Decision Algorithms (DA): which Content Router caches what? LCE, LCD, FIX(P), ProbCache
     • Replacement Algorithms (RA): how do Content Routers replace Objects in the Content Store? FIFO, RANDOM, LRU, LFU
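The split between decision and replacement can be sketched in Python as follows. The sketch rests on the commonly used definitions, which the slide does not restate: LCE (Leave Copy Everywhere) caches at every router on the return path, LCD (Leave Copy Down) caches only at the router one hop below the hit, and FIX(P) caches with fixed probability P. ProbCache is omitted. `LRUContentStore` and `should_cache` are hypothetical names, and capacity is counted in whole objects for simplicity.

```python
import random
from collections import OrderedDict


class LRUContentStore:
    """Replacement algorithm: evict the least recently used object."""

    def __init__(self, capacity):
        self.capacity = capacity      # number of objects the store can hold
        self.items = OrderedDict()    # oldest entry first

    def get(self, name):
        if name not in self.items:
            return None
        self.items.move_to_end(name)  # mark as most recently used
        return self.items[name]

    def put(self, name, content):
        self.items[name] = content
        self.items.move_to_end(name)
        while len(self.items) > self.capacity:
            self.items.popitem(last=False)  # drop the least recently used


def should_cache(algorithm, hops_below_hit, p=0.5, rng=random):
    """Decision algorithm: should the router `hops_below_hit` hops
    downstream of where the Data was found keep a copy?"""
    if algorithm == "LCE":
        return True
    if algorithm == "LCD":
        return hops_below_hit == 1
    if algorithm == "FIX":
        return rng.random() < p
    raise ValueError(f"unknown decision algorithm: {algorithm}")


store = LRUContentStore(capacity=2)
store.put("/a", b"A")
store.put("/b", b"B")
store.get("/a")        # "/a" becomes the most recently used entry
store.put("/c", b"C")  # evicts "/b"
```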

  27. Simulation Parameters
     Big Data Repository
     • R: Big Data Repository size. Value: 51.2 TBytes
     • |R|: number of Big Data Objects in R. Value: 150
     • B: size of a Big Data Object. Value: 350 GBytes
     • c: number of sub-Objects a Big Data Object consists of. Values: [1,2,4,6..20]
     • a: exponent of the Zipf distribution on which the popularity of Big Data sets is based, $P(x=i) = \frac{1/i^{a}}{C}$ with $C = \sum_{j=1}^{|R|} 1/j^{a}$ (here C is the normalising constant, not the Content Store size). Value: 1
     Content Router
     • C: Content Store size in each Content Router, expressed in units of the Big Data Object size B. Values: [0.5B, 1B, 2B, 4B, 8B, 16B]
     • CA: Caching Algorithm. Values: [LCE, LCD, FIX(0.5), FIX(0.25), ProbCache]
     • RA: Replacement Algorithm. Value: LRU
     Client
     • T: the number of Requests for a Big Data Object the Client has sent so far. Value: -
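The Zipf popularity model can be sketched in Python, assuming the normalising constant is the sum of 1/j^a over all |R| objects. The function name `zipf_probabilities` and the random seed are illustrative choices; the values 150 and a = 1 come from the parameter list.

```python
import random


def zipf_probabilities(num_objects, a=1.0):
    """P(x = i) = (1 / i**a) / C for ranks i = 1..num_objects,
    where C is the sum of 1 / j**a over all ranks."""
    weights = [1.0 / i ** a for i in range(1, num_objects + 1)]
    c = sum(weights)
    return [w / c for w in weights]


probs = zipf_probabilities(150, a=1.0)

# Draw a few object ranks the way a simulated client might request them.
rng = random.Random(0)
sample_requests = rng.choices(range(1, 151), weights=probs, k=10)
```

With a = 1 the most popular object is requested twice as often as the second and three times as often as the third, so a small cache can already absorb a noticeable share of requests.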

  28. Network Topologies
     [Figures: a String topology and a Binary Tree topology, with nodes labelled by level 0 to 4; the Binary Tree has one node at level 4, one at level 3, two at level 2, four at level 1 and eight at level 0.]
     In both network topologies the distance between the client and the Big Data Repository is 4 hops (Content Routers).

  29. Performance Metrics
     In ICN, in-network caching aims to:
     • From the Customer point of view: reduce the average time required to download the requested content.
     • From the Publisher point of view: reduce the number of requests the publisher needs to serve.
     • From the Network point of view: reduce the network traffic.
     The Average Number of Hops per simulation describes all of the above benefits.
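How the Average Number of Hops could be measured is sketched below in Python for a string topology with LCE and LRU. This is not the simulator used in the study (the slides name ndnSIM and ccnSIM). The assumptions are: caching routers at hops 1 to 3 and the repository at hop 4, cache capacity counted in whole objects, Zipf popularity with a = 1, and no warm-up phase. `simulate_string` and its parameters are hypothetical.

```python
import random
from collections import OrderedDict


def simulate_string(num_objects=150, cache_slots=2, requests=2000, a=1.0, seed=0):
    """Return the average number of hops per request on a 4-hop string."""
    rng = random.Random(seed)
    weights = [1.0 / i ** a for i in range(1, num_objects + 1)]  # Zipf popularity
    caches = [OrderedDict() for _ in range(3)]  # routers at hops 1, 2 and 3
    total_hops = 0
    for _ in range(requests):
        obj = rng.choices(range(num_objects), weights=weights)[0]
        hit_hop = 4  # default: served by the repository
        for hop, cache in enumerate(caches, start=1):
            if obj in cache:
                cache.move_to_end(obj)  # LRU: refresh on hit
                hit_hop = hop
                break
        total_hops += hit_hop
        # LCE: every router between the hit and the client stores a copy.
        for cache in caches[:hit_hop - 1]:
            cache[obj] = True
            cache.move_to_end(obj)
            if len(cache) > cache_slots:
                cache.popitem(last=False)  # LRU eviction
    return total_hops / requests


no_cache = simulate_string(cache_slots=0)
with_cache = simulate_string(cache_slots=16)
```

Without caching every request travels the full 4 hops, so any value below 4 is the combined gain for customer, publisher and network that the metric is meant to capture.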

  30. Collection of Measurements
     Collection of the Average Number of Hops for each simulation starts when the Average Number of Hops converges for at least 50T.
     [Figure: Average Number of Hops (y-axis, 0 to 4.2) versus T, the Client's Requests (x-axis, 1 to 528), for LCE, Fix(0.5), Fix(0.25), ProbCache, LCD and No Cache.]

  31. Results: String Network (1 Client)
     [Figure: Average Hops (y-axis, 1.4 to 4.2) versus Content Router Cache Size / Big Data Object Size, C:B (x-axis: 0.5, 1, 2, 4, 8, 16), for LCE, Fix(0.5), Fix(0.25), ProbCache and LCD. The figure also indicates the standard deviation over the different c values [1,2,4..20], where c is the number of sub-Objects a Big Data Object consists of.]
     • The number of sub-Objects (c) a Big Data Object consists of has negligible impact on the performance of the caching algorithms.
     • C:B ≤ 1: low caching algorithm performance.
     • C:B ≥ 2: significant benefits can be gained from this point onwards.
