Guarantee IP Lookup Performance with FIB Explosion
Tong Yang (ICT), Gaogang Xie (ICT), Yanbiao Li (HNU), Qiaobin Fu (ICT), Alex X. Liu (MSU), Qi Li (ICT), Laurent Mathy (ULG)
Performance Issue in IP Lookup
- FIBs keep growing: about 15% per year; FIB size has reached 512,000 entries.
- The 512k bug: in August 2014, Cisco warned that web browsing speeds could slow over the following week as old hardware was upgraded to handle FIBs beyond 512k entries.
Motivation
- On-chip vs. off-chip memory: on-chip is about 10 times faster, but limited in size, while FIBs keep growing.
- An ideal IP lookup algorithm combines:
  - Constant, fast lookup speed: low time complexity
  - Constant, small memory footprint for the FIB: fits in on-chip memory
State-of-the-art
- Achieving constant IP lookup time:
  - TCAM-based
  - Trie pipeline using FPGA
  - Full expansion
  - DIR-24-8
- Achieving small memory:
  - Bloom-filter based
  - Level compression, path compression
  - LC-trie
How to satisfy both constant lookup time and small on-chip memory usage?
SAIL Framework
Observation: almost all packets hit prefixes of length 0~24.
Two-dimensional splitting:
- Splitting the lookup process: finding the prefix length vs. finding the next hop
- Splitting the prefix length: 0~24 vs. 25~32

                       Finding prefix length   Finding next hop
Prefix length 0~24     On-chip                 Off-chip
Prefix length 25~32    Off-chip                Off-chip
Splitting
[Figure: the original trie is split at level 24 into short prefixes (levels 0~24) and long prefixes (levels 25~32); each level i keeps a bitmap array B_i and a next-hop array N_i.]
The level 0~24 bitmaps together need only Σ_{j=0}^{24} 2^j = 2^25 − 1 bits ≈ 4 MB, small enough for on-chip memory.
How to avoid searching both short and long prefixes?
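As a sketch, the per-level bitmap and next-hop arrays can be built like this. This is a minimal Python toy; `build_levels` and the `(value, length, hop)` tuple encoding are my own illustration, not the paper's code:

```python
def build_levels(prefixes, max_len):
    """Build per-level bitmap (B) and next-hop (N) arrays from a prefix list.

    prefixes: iterable of (value, length, next_hop) tuples, where `value`
    is the integer value of the prefix bits. Level i needs 2**i entries,
    so the level 0~24 bitmaps together take 2**25 - 1 bits (about 4 MB),
    small enough for on-chip memory.
    """
    B = [bytearray(1 << l) for l in range(max_len + 1)]
    N = [bytearray(1 << l) for l in range(max_len + 1)]
    for value, length, hop in prefixes:
        B[length][value] = 1   # mark a prefix (solid node) at this level
        N[length][value] = hop
    return B, N

# A small example FIB, prefixes up to length 4:
B, N = build_levels([(0b0, 0, 6), (0b1, 1, 4), (0b01, 2, 3),
                     (0b001, 3, 3), (0b111, 3, 7),
                     (0b0011, 4, 1), (0b1110, 4, 8)], 4)
```

A lookup at level l then only indexes B[l] and N[l] with the first l address bits, which is what lets the bitmaps and the next-hop arrays live in different memories.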
Pivot Pushing & Lookup
Pivot push: at the pivot level, a node whose subtrie contains longer prefixes is marked in the bitmap with next hop 0, so the search falls through to the long (off-chip) prefixes only when needed.
Example (pivot level 4): lookup 001010 → B_4[001010 >> 2] = 1 and N_4[2] = 0, so the search continues among the long prefixes.
[Figure: (a) an example FIB with prefixes */0→6, 1*/1→4, 01*/2→3, 001*/3→3, 111*/3→7, 0011*/4→1, 1110*/4→8, 11100*/5→2, 001011*/6→9; (b) the corresponding trie; (c) the per-level bitmap and next-hop arrays.]
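The pivot-pushed lookup can be sketched as follows, assuming 6-bit addresses and pivot level 4 as in the example. `lookup_long` is a stand-in helper for the real off-chip long-prefix structure, not the paper's implementation:

```python
# Toy per-level arrays for the example FIB (6-bit addresses, pivot level 4).
# At the pivot level, an internal node is marked B=1 with N=0, which
# redirects the search to the long (off-chip) prefixes.
B = [bytearray([1]), bytearray(2), bytearray(4), bytearray(8), bytearray(16)]
N = [bytearray([6]), bytearray(2), bytearray(4), bytearray(8), bytearray(16)]
B[1][0b1],    N[1][0b1]    = 1, 4   # 1*/1    -> 4
B[2][0b01],   N[2][0b01]   = 1, 3   # 01*/2   -> 3
B[3][0b001],  N[3][0b001]  = 1, 3   # 001*/3  -> 3
B[3][0b111],  N[3][0b111]  = 1, 7   # 111*/3  -> 7
B[4][0b0011], N[4][0b0011] = 1, 1   # 0011*/4 -> 1
B[4][0b0010] = 1                    # internal: 001011*/6 lives below
B[4][0b1110] = 1                    # internal: 1110*/4 and 11100*/5 below

# Long prefixes, sorted by increasing length (stand-in for off-chip tables).
LONG = [(0b1110, 4, 8), (0b11100, 5, 2), (0b001011, 6, 9)]

def lookup_long(addr):
    best = 0
    for value, plen, hop in LONG:       # later (longer) matches win
        if addr >> (6 - plen) == value:
            best = hop
    return best

def sail_b_lookup(addr, pivot=4):
    best = 0
    for l in range(pivot + 1):          # scan on-chip levels 0..pivot
        v = addr >> (6 - l)
        if B[l][v]:
            best = N[l][v] or best      # N=0 at the pivot marks an internal node
    v = addr >> (6 - pivot)
    if B[pivot][v] and N[pivot][v] == 0:
        best = lookup_long(addr) or best
    return best
```

Looking up 001010 keeps the best on-chip match (001*/3 → 3) because the off-chip search misses, while 001011 is resolved off-chip to 9, matching the slide's walk-through.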
Update of SAIL_B
Example updates: insert 10* → set B_2[10] = 1; delete 111* → set B_3[111] = 0.
[Figure: the same example FIB, trie, and per-level bitmap and next-hop arrays, after the two updates.]
Changing the next hop of 001*, or inserting 0010*, only needs updates to the off-chip tables.
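A minimal sketch of these updates, using the same arrays-of-arrays layout as the earlier toy; the next hop chosen for 10* is an illustrative value, since the slide does not give one:

```python
def sail_b_insert(value, plen, hop, B, N):
    """Insert a short prefix: a single on-chip write to the level-plen bitmap."""
    B[plen][value] = 1
    N[plen][value] = hop   # next-hop arrays live off-chip

def sail_b_delete(value, plen, B, N):
    """Delete a short prefix: again only one on-chip bit is cleared."""
    B[plen][value] = 0
    N[plen][value] = 0

# The slide's two updates: insert 10*, delete 111*.
B = [bytearray(1 << l) for l in range(5)]
N = [bytearray(1 << l) for l in range(5)]
B[3][0b111], N[3][0b111] = 1, 7
sail_b_insert(0b10, 2, 5, B, N)   # hop 5 is illustrative
sail_b_delete(0b111, 3, B, N)
```

Each short-prefix update touches exactly one bit of one on-chip bitmap, which is why SAIL_B needs only one on-chip memory access per update.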
Optimization
SAIL_B
- Lookup: 25 on-chip memory accesses in the worst case
- Update: 1 on-chip memory access
Lookup-Oriented Optimization (SAIL_L)
- Lookup: 2 on-chip memory accesses in the worst case
- Update: unbounded, but low average update complexity
Update-Oriented Optimization (SAIL_U)
- Lookup: 4 on-chip memory accesses in the worst case
- Update: 1 on-chip memory access
Extension: SAIL for Multiple FIBs (SAIL_M)
SAIL_L
Lookup flow: check B_16; if 0, the next hop is N_16 (level 16). Otherwise check B_24; if 0, the next hop is N_24 (level 24); if 1, fetch the next hop from N_32 (level 32, off-chip).
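This flow as a runnable sketch. The prefixes, the chunk-id array `C24`, and the single 256-entry off-chip chunk are illustrative assumptions; the real SAIL_L also packs bitmaps and next hops together to bound on-chip accesses:

```python
B16 = bytearray(1 << 16); N16 = bytearray(1 << 16)  # level 16 (on-chip)
B24 = bytearray(1 << 24); N24 = bytearray(1 << 24)  # level 24 bitmap on-chip
C24 = bytearray(1 << 24)                            # chunk ids for level 32
N32 = bytearray(256)                                # one off-chip chunk here

def ip(a, b, c, d):
    return (a << 24) | (b << 16) | (c << 8) | d

def sail_l_lookup(addr):
    v16 = addr >> 16
    if not B16[v16]:
        return N16[v16]          # 1 on-chip access
    v24 = addr >> 8
    if not B24[v24]:
        return N24[v24]          # 2 on-chip accesses: the on-chip worst case
    # Long prefix: one off-chip access into this /24 node's 256-entry chunk.
    return N32[(C24[v24] - 1) * 256 + (addr & 0xFF)]

# Illustrative prefixes: 10.0.0.0/16 -> 5, 192.168.2.0/24 -> 7,
# 192.168.1.128/25 -> 9 (pushed to level 32 as a chunk).
N16[ip(10, 0, 0, 0) >> 16] = 5
B16[ip(192, 168, 0, 0) >> 16] = 1    # longer prefixes exist below 192.168/16
N24[ip(192, 168, 2, 0) >> 8] = 7
B24[ip(192, 168, 1, 0) >> 8] = 1     # longer prefixes below 192.168.1/24
C24[ip(192, 168, 1, 0) >> 8] = 1
for i in range(128, 256):
    N32[i] = 9
```

Most traffic terminates at the first or second test, which is where the worst case of 2 on-chip accesses comes from.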
SAIL_U
- Pushing to levels 6, 12, 18, and 24.
- Consecutive pivot levels are 6 apart, so one update affects at most 2^6 = 64 consecutive bits in a bitmap array.
- Those 64 bits fit in one memory word, so at most one on-chip memory access is still enough for each update.
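A toy sketch of the 64-bit bound; only the bitmap writes are shown, and `push_update` is my own name for the operation:

```python
def push_update(value, plen, pivot, bitmap, bit=1):
    """Write every pivot-level bitmap slot covered by a length-plen prefix.

    With pivot levels 6, 12, 18, 24, consecutive pivots are 6 apart, so a
    prefix pushes to at most 2**6 = 64 consecutive slots -- which fit in a
    single 64-bit word, hence one on-chip memory access per update.
    """
    shift = pivot - plen
    base = value << shift
    for i in range(1 << shift):
        bitmap[base + i] = bit
    return 1 << shift        # number of bits touched (<= 64)

bm = bytearray(1 << 6)
touched = push_update(0b101, 3, 6, bm)   # insert 101*, pushed to level 6
```

Here inserting 101* fills the 8 level-6 slots it covers (indices 40~47), one contiguous run inside a single word.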
SAIL_M
Merging multiple FIBs for virtual routers: the tries of the individual FIBs are merged into one overlay trie, and each overlay node stores a vector of next hops, one per FIB.
[Figure: (a) two example tries (with prefixes such as A:00*, B:01*, C:10*, E:100*, F:101*, G:110*, H:111*); (b) and (c) their combined overlay trie.]
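A minimal sketch of the merging step, assuming each FIB is given as a `{(prefix_value, prefix_length): next_hop}` dict; the per-FIB next-hop vector is what the overlay trie's nodes would store:

```python
def build_overlay(fibs):
    """Merge several FIBs into one overlay table.

    Each overlay entry maps a prefix to a next-hop vector with one slot
    per FIB (0 where that FIB has no such prefix), so all virtual
    routers can share a single lookup structure.
    """
    overlay = {}
    for i, fib in enumerate(fibs):
        for prefix, hop in fib.items():
            overlay.setdefault(prefix, [0] * len(fibs))[i] = hop
    return overlay

# Two toy FIBs sharing the prefix 00*.
fib1 = {(0b00, 2): 1, (0b10, 2): 2}
fib2 = {(0b00, 2): 3, (0b01, 2): 4}
overlay = build_overlay([fib1, fib2])
```

A lookup then proceeds exactly once through the shared structure and selects the vector slot for the virtual router's FIB id.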
SAILs in the worst case

            On-chip memory   Lookup (on-chip)   Update (on-chip)
SAIL_B      = 4 MB           25                 1
SAIL_L      ≤ 2.13 MB        2                  Unbounded
SAIL_U      ≤ 2.03 MB        4                  1
SAIL_M      ≤ 2.13 MB        2                  Unbounded

Worst case: 2 off-chip memory accesses per lookup
Implementations
- FPGA: Xilinx ISE 13.2 IDE; Xilinx Virtex-7 device; 8.26 MB on-chip memory — SAIL_B, SAIL_U, and SAIL_L
- Intel CPU: Core(TM) i7-3520M, 2.9 GHz; 64 KB L1, 512 KB L2, 4 MB L3; 8 GB DRAM — SAIL_L and SAIL_M
- GPU: NVIDIA Tesla C2075 (1147 MHz, 5376 MB device memory, 448 CUDA cores) with an Intel Xeon E5-2630 (2.30 GHz, 6 cores) — SAIL_L
- Many-core: Tilera TLR4-03680, 36 cores, 256 KB L2 cache per core — SAIL_L
Evaluation
FIBs
- A real FIB from a tier-1 router in China
- 18 real FIBs from www.ripe.net
Traces
- Real packet traces from the same tier-1 router
- Randomly generated packet traces
- Packet traces generated according to the FIBs (prefix-based)
Compared with
- PBF [SIGCOMM 03]
- LC-trie [used in the Linux kernel]
- Tree Bitmap
- Lulea [SIGCOMM 97 best paper]
FPGA Simulation
[Figure: on-chip memory usage (0 ~ 1.2 MB) of SAIL_L and PBF across the FIBs rrc00, rrc01, rrc03~rrc07, and rrc10~rrc15.]

SAIL algorithm   Lookup speed   Throughput
SAIL_B           351 Mpps       112 Gbps
SAIL_U           405 Mpps       130 Gbps
SAIL_L           479 Mpps       153 Gbps
Intel CPU: real FIB and traces
[Figure: lookup speed (Mpps, 0~800) of LC-trie, Tree Bitmap, Lulea, and SAIL_L on 12 FIBs.]
Intel CPU: 12 FIBs using prefix-based and random traces
[Figure: lookup speed (Mpps, 0~500) with prefix-based and random traces across the 12 FIBs.]
Intel CPU: Update
[Figure: number of memory accesses per update (0~14) over a sequence of updates (x-axis in units of 500 updates) for rrc00, rrc01, and rrc03, together with their running averages.]
GPU: Lookup speed vs. batch size
[Figure: GPU lookup speed (Mpps, up to 650) for batch sizes 30, 60, and 90 on the FIBs rrc00, rrc01, rrc03~rrc07, and rrc10~rrc15.]
GPU: Lookup latency vs. batch size
[Figure: GPU lookup latency (microseconds, up to 240) for batch sizes 30, 60, and 90 on the same FIBs.]
Tilera GX-36: Lookup vs. number of cores
[Figure: lookup speed (pps, up to 700 M) as the number of cores grows from 2 to 35.]
Conclusion
- Two-dimensional splitting framework: SAIL
- Three optimization algorithms: SAIL_U, SAIL_L, SAIL_M
  - At most 2.13 MB of on-chip memory usage
  - At most 2 off-chip memory accesses per lookup
- Suitable for different platforms: FPGA, CPU, GPU, many-core
  - Lookup speeds of up to 673.22~708.71 Mpps
- Future work: extending SAIL to IPv6 lookup
Source code of SAIL, LC-trie, Tree Bitmap, and Lulea:
http://fi.ict.ac.cn/firg.php?n=PublicationsAmpTalks.OpenSource
Thanks
http://fi.ict.ac.cn