Compressing IP Forwarding Tables: Towards Entropy Bounds and Beyond
Revised on Feb 10, 2014
Gábor Rétvári, János Tapolcai, Attila Kőrösi, András Majdán, Zalán Heszberger
Budapest Univ. of Technology and Economics, Dept. of Telecommunications and Media Informatics
{retvari,tapolcai,korosi,majdan,heszi}@tmit.bme.hu
SIGCOMM’13, August 12–16, 2013, Hong Kong, China
Motto: IP forwarding table compression is boring... but compressed data structures are beautiful!
Encoding Strings
• Suppose we want to encode the string “labanana”
• Just 4 symbols, so we can use 2 bits per symbol

  symbol  code        l  a  b  a  n  a  n  a
  a       00          10 00 01 00 11 00 11 00
  b       01
  l       10
  n       11

• Size is the information-theoretic limit: 16 bits
• Fast access to the symbol at any position, fast search, etc.
• But this format is not particularly memory efficient
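As a quick illustration of the fixed-width scheme above, here is a minimal Python sketch (the function names and packing layout are made up for this example) that packs “labanana” into 2 bits per symbol and reads back any position in constant time:

# Minimal sketch of a fixed-width (2 bits/symbol) encoding with O(1) access.
# The alphabet and codes are the ones from the slide; everything else is illustrative.

CODE = {'a': 0b00, 'b': 0b01, 'l': 0b10, 'n': 0b11}
SYM  = {v: k for k, v in CODE.items()}

def encode(s):
    """Pack each symbol into 2 bits of a single integer."""
    bits = 0
    for i, c in enumerate(s):
        bits |= CODE[c] << (2 * i)
    return bits, len(s)

def access(bits, i):
    """Return the i-th symbol (0-based) by shifting and masking: O(1)."""
    return SYM[(bits >> (2 * i)) & 0b11]

packed, n = encode("labanana")
assert n * 2 == 16                # 16 bits, the information-theoretic limit for 4 symbols
assert access(packed, 2) == 'b'   # fast random access to any position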
Huffman Coding
• Compression by encoding popular symbols on fewer bits
• Huffman tree sorted by symbol frequencies

          (8)
        0/   \1
      a(4)   (4)
            0/   \1
          n(2)   (2)
                0/  \1
              b(1)  l(1)
Huffman Coding
• Compression by encoding popular symbols on fewer bits
• Huffman tree sorted by symbol frequencies
• Use the tree prefix as the symbol code

  symbol  code        l   a  b   a  n  a  n  a
  a       0           111 0  110 0  10 0  10 0
  b       110
  l       111
  n       10

• Size is nH₀ bits, where n is the length and H₀ is the entropy
• Only 14 bits, minimal for a zero-order source
• But no fast access to symbols, no search!
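For reference, a small Python sketch of the standard Huffman construction on “labanana”. Depending on tie-breaking, the codewords may differ from the ones on the slide (a=0, n=10, b=110, l=111), but the total length is always the 14 bits claimed above:

# Sketch: build Huffman codes for "labanana" with the standard heap algorithm.
import heapq
from collections import Counter

def huffman_codes(s):
    freq = Counter(s)
    # Heap of (frequency, tie-breaker, {symbol: code-so-far}) entries.
    heap = [(f, i, {sym: ""}) for i, (sym, f) in enumerate(freq.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        f0, _, c0 = heapq.heappop(heap)
        f1, _, c1 = heapq.heappop(heap)
        merged = {sym: "0" + c for sym, c in c0.items()}   # left branch gets a 0 prefix
        merged.update({sym: "1" + c for sym, c in c1.items()})
        heapq.heappush(heap, (f0 + f1, tie, merged))
        tie += 1
    return heap[0][2]

codes = huffman_codes("labanana")
total_bits = sum(len(codes[c]) for c in "labanana")
assert total_bits == 14   # nH0 for this zero-order source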
Wavelet Trees
• Indexing and Huffman coding simultaneously
• A bitmap at each node of the Huffman tree
• Tells whether the symbol belongs to the left/right branch

  labanana
  10101010
   0/    \1
  aaaa   lbnn
         1100
         0/  \1
        nn    lb
              10
             0/ \1
             b   l
Wavelet Trees: Access
• Store the bitmaps in succinct bitstring indexes (e.g., RRR)
  • encode an n-bit bitmap in roughly n bits
  • support access/rank queries in O(1)
• E.g., accessing the 3rd position of “labanana”
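Before the step-by-step walkthrough, here is a much-simplified stand-in for such an index (illustrative only, not RRR: it stores the bitmap verbatim and only shows how precomputed block counts give constant-time rank):

# Plain bitmap plus precomputed block ranks: O(1) rank/access, but no compression.
# Real RRR additionally encodes the bitmap itself in compressed form.

BLOCK = 64

class BitIndex:
    def __init__(self, bits):            # bits: a string like "10101010"
        self.bits = bits
        self.block_rank = [0]            # number of 1s before each block boundary
        ones = 0
        for i, b in enumerate(bits):
            if i and i % BLOCK == 0:
                self.block_rank.append(ones)
            ones += (b == '1')

    def access(self, i):                 # 1-based positions, as on the slides
        return int(self.bits[i - 1])

    def rank1(self, i):                  # number of 1s in positions 1..i
        blk = (i - 1) // BLOCK
        ones = self.block_rank[blk]
        for j in range(blk * BLOCK, i):  # scan at most one block: O(1) for fixed BLOCK
            ones += (self.bits[j] == '1')
        return ones

root = BitIndex("10101010")
assert root.access(3) == 1 and root.rank1(3) == 2   # the queries used in the steps below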
Wavelet Trees: Access
1. “Which branch does the 3rd symbol belong to?”
   access(10101010, 3) = 1 at the root → the right branch (lbnn)
2. “How many symbols of this branch have occurred so far?”
   rank₁(10101010, 3) = 2 → it is the 2nd symbol of that branch
3. “Which branch does this symbol belong to?”
   access(1100, 2) = 1 → the right branch (lb)
4. “How many symbols of this branch have occurred so far?”
   rank₁(1100, 2) = 2 → it is the 2nd symbol of that branch
5. “Which of the remaining two symbols is the result?”
   access(10, 2) = 0 → the 0-branch
6. The 3rd symbol is b
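The six steps above amount to the following loop. This is an illustrative sketch in which plain string operations stand in for the O(1) rank/access queries of a succinct bitmap index:

def access(bm, i):
    return int(bm[i - 1])          # i-th bit, 1-based

def rank1(bm, i):
    return bm[:i].count('1')       # number of 1s in positions 1..i

# A node is (bitmap, left child, right child); a bare string is a leaf symbol.
# The tree shape follows the Huffman codes a=0, n=10, b=110, l=111.
wt = ("10101010", 'a',
      ("1100", 'n',
       ("10", 'b', 'l')))

def wt_access(node, i):                        # i is 1-based, as on the slides
    while not isinstance(node, str):
        bm, left, right = node
        if access(bm, i) == 1:                 # which branch does the i-th symbol belong to?
            i, node = rank1(bm, i), right      # its position among that branch's symbols
        else:
            i, node = i - rank1(bm, i), left   # rank0(bm, i) = i - rank1(bm, i)
    return node

assert wt_access(wt, 3) == 'b'                 # the 3rd symbol of "labanana"
assert "".join(wt_access(wt, i) for i in range(1, 9)) == "labanana"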
Wavelet Trees: Size
• We only store the bitmaps at each level

  level 1: 10101010   (labanana)
  level 2: 1100       (lbnn)
  level 3: 10         (lb)
  Huffman-coded string: 111 0 110 0 10 0 10 0

• Every symbol appears with its Huffman code
• Size is nH₀ bits (plus negligible overhead)
• But we still have efficient access
Compressed Data Structures
• Compression does not necessarily sacrifice fast access!
• Store information in entropy-bounded space and provide fast in-place access to it
  • take advantage of regularity, if any, to compress
  • data drifts closer to the CPU in the cache hierarchy
  • operations can be even faster than on the original, uncompressed form
• No space-time trade-off!
• This paper: advocate compressed data structures to the networking community
  • IP forwarding table compression as a use case
IP Forwarding Information Base
• The fundamental data structure used by IP routers to make forwarding decisions
• Stores more than 440K IP-prefix-to-next-hop mappings as of January 2013
  • consulted on a packet-by-packet basis at line speed
  • queries are complex: longest prefix match
  • updated a couple of hundred times per second
  • takes several MBytes of fast line-card memory, and counting
• May or may not become an Internet scalability barrier
Prefix Trees
• Tries are the most convenient way to store IP FIBs

  [Figure: an example FIB mapping the prefixes -/0, 0/1, 00/2, 001/3, 01/2, and 011/3 to next-hop labels, the corresponding prefix tree, and its prefix-free trie]
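For contrast with the compressed structures that follow, here is a sketch of a plain pointer-based binary trie with longest-prefix-match lookup. The prefixes and next-hop labels are illustrative, loosely following the example figure:

# Uncompressed binary trie with longest-prefix-match lookup.
# Prefixes are written as bit strings ("" is the default route).

class TrieNode:
    def __init__(self):
        self.child = [None, None]   # 0-branch, 1-branch
        self.label = None           # next hop stored at this prefix, if any

def insert(root, prefix, nexthop):
    node = root
    for bit in prefix:
        b = int(bit)
        if node.child[b] is None:
            node.child[b] = TrieNode()
        node = node.child[b]
    node.label = nexthop

def lookup(root, addr_bits):
    """Longest prefix match: remember the last label seen on the path."""
    node, best = root, root.label
    for bit in addr_bits:
        node = node.child[int(bit)]
        if node is None:
            break
        if node.label is not None:
            best = node.label
    return best

fib = TrieNode()
for prefix, nh in [("", 2), ("00", 3), ("001", 2), ("011", 1)]:
    insert(fib, prefix, nh)

assert lookup(fib, "0010") == 2   # longest match is 001/3
assert lookup(fib, "0101") == 2   # only the default route matches
assert lookup(fib, "0000") == 3   # longest match is 00/2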
FIB Space Bounds
• A FIB can be uniquely represented by a binary prefix-free trie T
• Let T have n leaves labeled from an alphabet of size δ with Shannon entropy H₀
• The information-theoretic lower bound to encode T is 2n + n·log₂δ bits
• The zero-order entropy of T is 2n + nH₀ bits
• The tree structure imposes an additive term 2n on top of the string size nH₀
Static Compressed FIBs: XBW-l
• Apply the state of the art in compressed data structures
  • convert the FIB to prefix-free form
  • serialize the prefix tree into a set of strings
  • compress using wavelet trees and RRR
• We call the resulting data structure XBW-l
  + realizes the zero-order entropy bound
  + in fact, also attains higher-order entropy
  + lookup runs in O(log n) time
  – but update is linear
  – lookup is too slow for practical applications
• The problem turns out to be that XBW-l is pointerless
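A simplified sketch of the serialization idea (not the exact XBW-l transform, whose details are in the paper): a level-order walk of the leaf-pushed binary trie yields a structure bitstring plus the string of leaf labels, and both strings can then be handed to wavelet-tree/RRR encoders.

# Level-order serialization of a leaf-pushed binary trie into
# (i) a structure bitstring marking internal nodes and (ii) the leaf-label string.
from collections import deque

# A node is either ('leaf', label) or ('node', left, right); labels are illustrative.
trie = ('node',
        ('node', ('leaf', 3), ('leaf', 2)),
        ('node', ('leaf', 2), ('leaf', 1)))

def serialize(root):
    structure, labels = [], []
    queue = deque([root])
    while queue:
        node = queue.popleft()
        if node[0] == 'leaf':
            structure.append('0')
            labels.append(node[1])
        else:
            structure.append('1')
            queue.append(node[1])
            queue.append(node[2])
    return "".join(structure), labels

bits, labels = serialize(trie)
assert bits == "1110000"        # 2n - 1 structure bits for n = 4 leaves
assert labels == [3, 2, 2, 1]   # the label string, to be wavelet-tree encoded

Navigation on such a representation is done purely with rank-style queries over the structure bits rather than with pointers, which is why lookups pay a price in constant factors.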
Dynamic FIBs: Trie-folding
• Practical FIB compression with a good old pointer machine
• Fold the trie into a prefix DAG (cf. DAFSA, DAWG, BDD)

  [Figure: the example trie folded into prefix DAGs with leaf-push barrier λ = 0 and λ = 2]

• For good compression, we need the tree to be in prefix-free form
• But prefix-free forms are expensive to update
• Balance the two with a parameter λ, called the leaf-push barrier
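A sketch of the folding step itself (leaf-pushing and the λ barrier omitted, all names illustrative): structurally identical subtrees are merged via a hash table, which is where the prefix DAG's sharing comes from.

# Merge structurally identical subtrees by hashing their canonical form,
# turning the trie into a prefix DAG.

def fold(node, table):
    """node: ('leaf', label) or ('node', left, right); returns a shared node."""
    if node[0] == 'node':
        node = ('node', fold(node[1], table), fold(node[2], table))
    key = node if node[0] == 'leaf' else ('node', id(node[1]), id(node[2]))
    return table.setdefault(key, node)   # reuse an existing identical subtree

trie = ('node',
        ('node', ('leaf', 1), ('leaf', 2)),
        ('node', ('leaf', 1), ('leaf', 2)))   # two identical subtrees

table = {}
dag_root = fold(trie, table)
assert dag_root[1] is dag_root[2]   # the two subtrees are now shared
assert len(table) == 4              # leaf 1, leaf 2, the shared child node, and the root

Roughly speaking, the leaf-push barrier λ controls where this kicks in: the levels above λ stay an ordinary, cheap-to-update trie, while the subtrees below the barrier are brought into prefix-free form and folded as above.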
Prefix DAG Size
• View the problem as string compression: encode a string S into a prefix DAG D(S)

  [Figure: an example string S over the alphabet {a, b, l, n} and the prefix DAG D(S) that encodes it]

• Theorem 1: D(S) needs at most 5n·log₂δ bits
• Theorem 2: D(S) can be squeezed into ∼7nH₀ bits in expectation
• Theorem 3: an update takes O((1 + 1/H₀)·log n) steps
Evaluation

  FIB        N     δ   H₀    I      E      XBW-l   pDAG    µ
  taz        410K  4   1.00  94KB   56KB   63KB    178KB   3.17
  access(d)  444K  28  1.06  206KB  90KB   100KB   369KB   4.1

• The entropy bound (E) is way smaller than the information-theoretic limit (I): IP FIBs contain high regularity!
• XBW-l attains the entropy bound very closely, with prefix DAGs (pDAG) off by only a factor µ of 2–4
• FIBs can be encoded in roughly 1–2 bits per prefix(!)
  • that's roughly 100–400 KBytes of memory
• Several million lookups per second both in HW and SW
  • faster than the uncompressed form
• pDAG tolerates more than 100,000 updates per second
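A back-of-the-envelope check of the bits-per-prefix claim from the XBW-l column (assuming N counts prefixes and a KB is 2^10 bytes):

# taz:       63 KB * 1024 * 8 / 410,000 prefixes ≈ 1.26 bits/prefix
# access(d): 100 KB * 1024 * 8 / 444,000 prefixes ≈ 1.85 bits/prefix
for name, size_kb, prefixes in [("taz", 63, 410_000), ("access(d)", 100, 444_000)]:
    bits_per_prefix = size_kb * 1024 * 8 / prefixes
    print(f"{name}: {bits_per_prefix:.2f} bits/prefix")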
Conclusions
• Compressed data structures are essential in information retrieval, computational biology, geometry, etc.
  • they let us sidestep notorious space-time trade-offs
  • as such, compression comes essentially for free
• FIB compression is a poster child for why the networking field is in sore need of good compression methods
  • permits reasoning about size, lookup, and update performance (analyzability)
  • allows stating theoretical storage size bounds (predictability)
  • faster operations than on the uncompressed form (efficiency)