High-Throughput Sketch Update on a Low-Power Stream Processor

Yu-Kuen Lai
Dept. of Electrical Engineering, Chung-Yuan Christian Univ., Chung-Li, Taiwan

Greg Byrd
Dept. of Electrical and Computer Engineering, Center for Embedded Systems Research, NC State University, Raleigh, NC

ANCS 2006

Outline
- Motivation
- Data Stream Processing & Imagine Stream Architecture
- The Sketch Data Structure
- Implementation and Performance
- Conclusion & Future Work

Motivation
[Slide from "Querying and Mining Data Streams: You Only Get One Look", VLDB'02, by Garofalakis et al.]
- A growing number of applications operate on streams of data
  • Traffic measurement & analysis for
    - Infrastructure planning
    - Capacity forecasting, accounting
    - Security
- Application characteristics
  • Massive volumes of data (several terabytes)
  • Data arrives at a rapid rate
- How to process queries and compute statistics on data streams in real time?

Data Stream Model
- A data stream is a massive sequence of elements $\varphi = (a_1, a_2, \ldots)$
  • An input stream arrives sequentially, item by item
  • Each item $a_t = (k_t, u_t)$ consists of a key and an update
    - Key $k_t \in \{0, \ldots, n-1\}$ (the key space)
    - Update $u_t$
  • A time-varying signal A[ ] (an array of n buckets)
  • The arrival of each item causes the signal A[ ] to be updated:
      $A[k_t] = A[k_t] + u_t$
[Figure: update $u_t$ applied to bucket $k_t$ of the n-bucket array A[ ]]
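
As a small illustration of the data stream model above, the following Python sketch applies a sequence of (key, update) items to the signal A[ ]. The function name `apply_stream` and the sample values are hypothetical and chosen only for this example; they are not from the talk.

```python
def apply_stream(n, items):
    """Apply a stream of (k_t, u_t) items to a signal A with n buckets."""
    A = [0] * n
    for k, u in items:
        if not 0 <= k < n:
            raise ValueError(f"key {k} is outside the key space 0..{n - 1}")
        A[k] += u  # A[k_t] = A[k_t] + u_t
    return A


# Three items arrive one by one; key 0 is updated twice.
signal = apply_stream(4, [(0, 3), (2, 5), (0, 1)])
```

Storing A[ ] exactly needs one counter per key, which is what becomes impractical for large key spaces and motivates the sketch structures later in the talk.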

What Do We Need?
- Fast Counter Update
  • A smaller but faster SRAM
- Being Programmable
  • Accommodate different algorithms
  • Different thresholds/approximation accuracy for different applications
- Approximate Answers
  • May not be able to produce exact answers due to limited resources (memory and processing power)
- High Computation Power
  • The statistical operations are computationally intensive

Imagine Stream Processor
- Built by Dr. William J. Dally's research team at Stanford University
- Designed for stream media processing
- A programmable VLIW co-processor that supports the stream programming model in SIMD fashion
- Has a three-level memory hierarchy
- Key modules are organized around the Stream Register File (SRF) through stream buffers

Stream Processing
- What is a Stream?
  • An important data representation in the stream programming model
  • A collection of data records of variable length
  • Streams are inputs to kernels, where computation is performed on their elements

Programming Model
- Kernel Level
  • Computations are done locally within the kernel
  • Operates on streams as inputs and produces streams as outputs
  • No arbitrary memory references
  • Loops are the ONLY control flow operations
  • Conditionals are expressed as predicates
- Stream Level
  • Runs on the host processor
  • Kernel invocation
  • Orchestrating the flow of streams

Some Performance Highlights
- AES (2 Gbps)
  • Advanced Encryption Standard, Rijndael
  • In ECB and OCB modes with key agility
  • Our best/worst case is 32/76 cycles, 41 cycles for variable-sized packets (AIX-1054837521-1)
- MMH Message Authentication (7 Gbps)
  • Multilinear Modular Hashing
  • Achieves multi-gigabit throughput; ultra fast and unconditionally secure
  • No matter how much computing power the adversary has, the probability of a compromise is lower than a probability p
- Bloom filter based Content Inspection Engine (400 Mbps)
  • Matches 2000+ signatures at up to 400 Mbps with a 500 MHz system clock (1500-byte packets)
  • The false positive error rate is 9.8e-7

The Count-Min Sketch
[Cormode & Muthukrishnan, Dec 2003]
- It's a probabilistic, approximate algorithm
- The BEST existing sketch scheme [2005 G. M. Lee et al.]
- The operation is based on a two-dimensional array, count[d][w]
[Figure: a z-bit key k is hashed by H_1(k), H_2(k), ..., H_d(k) into d counter arrays count[1][ ], count[2][ ], ..., count[d][ ], each of width w]

The Count-Min Sketch (Cont.)
Update
- For each arrived item $a = (k, u)$
- Hash the key with d different independent hash functions
- Use those d hash values as indexes to add the update to each array:
    $count[j][h_j(k)] \mathrel{+}= u, \quad 1 \le j \le d$
Point Query Q(k)
- Given a key k, again we hash the key
- Use these hash values as indexes to look up the value stored in each array
- The answer to Q(k) is the minimum of these d values:
    $\hat{a}_k = \min_{1 \le j \le d} \{ count[j][h_j(k)] \}$
[Figure: a z-bit key k with update u is hashed by H_1(k), H_2(k), ..., H_d(k) into the d arrays count[1][ ], ..., count[d][ ] of width w]

Many Applications [Cormode & Muthu, SIAM ICDM05]
- Significant differences [Charikar et al, ICALP'02] and relative changes [Cormode & Muthu INFOCOM'04]
- Anomaly detection [Krishnamurthy et al, SIGCOMM'03]
- Heavy hitters [Cormode & Muthu ACM PDS 03]
- Top-K items [Manku & Motwani ICDM'05]
- Estimating frequent items [Manku & Motwani ICVLDB'02]
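
The update and point-query procedures of the Count-Min sketch described above can be sketched in a few lines of Python. This is a minimal illustration, not the Imagine kernel from the talk: the class name `CountMinSketch`, the seed parameter, and the choice of hash family ((a*k + b) mod p) mod w with a Mersenne prime p are assumptions made for the example.

```python
import random


class CountMinSketch:
    # Assumption: a prime much larger than any 32-bit key, used by the hash family.
    PRIME = (1 << 61) - 1

    def __init__(self, d, w, seed=0):
        rng = random.Random(seed)
        self.d, self.w = d, w
        # count[d][w]: d rows of w counters, all starting at zero.
        self.count = [[0] * w for _ in range(d)]
        # One (a, b) pair per row defines h_j(k) = ((a*k + b) mod p) mod w.
        self.coeffs = [
            (rng.randrange(1, self.PRIME), rng.randrange(self.PRIME))
            for _ in range(d)
        ]

    def _h(self, j, k):
        a, b = self.coeffs[j]
        return ((a * k + b) % self.PRIME) % self.w

    def update(self, k, u=1):
        """count[j][h_j(k)] += u for every row j."""
        for j in range(self.d):
            self.count[j][self._h(j, k)] += u

    def query(self, k):
        """Point query: the minimum of the d counters that k maps to."""
        return min(self.count[j][self._h(j, k)] for j in range(self.d))


cms = CountMinSketch(d=4, w=64)
cms.update(42, 5)
cms.update(42, 2)
estimate = cms.query(42)
```

With non-negative updates, collisions can only inflate a counter, so the query never underestimates the true value; taking the minimum over the d rows picks the least-inflated counter.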

Change Detection
- Monitoring the significant differences in traffic attributes over two observation intervals
- Attributes of interest
  • Number of packets
  • Flows
  • Total bytes
- Typical Approaches
  • Brute force (store & sort)
  • Sampling
  • Sketch

Sketch-based Change Detection
- Absolute difference based on the K-ary sketch by Krishnamurthy et al.
- K-ary sketch
  • Same sketch update process
  • Requires a 4-universal hash function
  • Different point query procedure
- Three major modules:
  • Sketch update
  • Forecasting
  • Detection
[Figure: pipeline over time. Kernel A (sketch update) turns keys into the observed sketches So(t), So(t+1), So(t+2); Kernel B (forecasting) produces the forecast sketches Sf(t+1), Sf(t+2); Kernel C (detection) uses the forecast error sketch Se(t+2) and keys to raise alarms]

Sketch-based Change Detection (cont.)

The Update Module
Sketch-based Change Detection (cont.)
- Same update procedure as that in the CM sketch
- Updating keys for a time interval $\Delta t$
- The update has to be quick enough to process the incoming packets at line rate
  • For every packet
    - Key extraction
    - Hashing
    - Updating the counters
  • Only 32 ns are available to process a minimum-sized IP packet at 10 Gbps
[Figure: Kernel A maps keys to the observed sketch So(t); a z-bit key is hashed by H_1(k), H_2(k), ..., H_d(k) into count[1][ ], ..., count[d][ ] of width w]
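
The 32 ns budget quoted for the update module follows from simple arithmetic, assuming the minimum-sized IP packet is 40 bytes (the packet size used in the performance figures later in the talk). A short worked check in Python, with a hypothetical helper name:

```python
def packet_time_ns(packet_bytes, line_rate_gbps):
    """Time on the wire for one packet: bits divided by Gbit/s gives nanoseconds."""
    return packet_bytes * 8 / line_rate_gbps


budget = packet_time_ns(40, 10)  # 320 bits at 10 Gbps
```

Key extraction, hashing, and all counter updates for a packet have to fit inside this window to sustain line rate.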

The Forecast Module
Sketch-based Change Detection (cont.)
- Forecast model based on the moving average
- The forecast sketch
    $S_f(t+1) = \frac{1}{w} \sum_{i=0}^{w-1} S_o(t-i)$
- Compute the forecast sketch incrementally
    $S_f(t+2) = S_f(t+1) + \frac{1}{w} \left( S_o(t+1) - S_o(t-w+1) \right)$
[Figure: initial state, Kernel B1 combines So(t), So(t-1), ..., So(t-w+1) into Sf(t+1); steady state at time t+2, Kernel B2 produces Sf(t+2), with figure labels Sf(t+1), So(t+1) and So(t-w+2)]

The Change Detection Module
Sketch-based Change Detection (cont.)
- Alarm threshold $T_A$
    $T_A = T \cdot [\mathrm{Estimate\_F_2}(S_e(t))]^{1/2}$
    $S_e(t) = S_o(t) - S_f(t)$
    $\mathrm{Estimate\_F_2}(S_e(t)) = \mathrm{median}_{i \in H} \{ F_2^{h_i} \}$
    $F_2^{h_i} = \frac{k}{k-1} \sum_{j \in [k]} (T[i][j])^2 - \frac{1}{k-1} (\mathrm{sum}(S))^2$
- Given a key, raise the alarm if $\mathrm{Estimate}(S_e(t), key) > T_A$
    $\mathrm{Estimate}(S_e(t), key) = \mathrm{median}_{i \in H} \left\{ \frac{T[i][h_i(key)] - \mathrm{sum}(S)/k}{1 - 1/k} \right\}$
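
The forecast and detection formulas above can be illustrated with a small Python sketch that treats a sketch as an H x k table of numbers (a list of rows). All function names are hypothetical, the hashed bucket indexes h_i(key) are passed in precomputed, and sum(S) is taken as the sum of row 0, assuming every row sums to the same total. It is a reading of the formulas under those assumptions, not the Imagine kernels B and C.

```python
from statistics import median


def add_scaled(a, b, s):
    """Element-wise a + s*b on two tables of equal shape."""
    return [[x + s * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def forecast_initial(window):
    """S_f(t+1) = (1/w) * sum of the w most recent observed sketches."""
    w = len(window)
    acc = [[0.0] * len(row) for row in window[0]]
    for sketch in window:
        acc = add_scaled(acc, sketch, 1.0 / w)
    return acc


def forecast_step(sf, newest, oldest, w):
    """S_f(t+2) = S_f(t+1) + (S_o(t+1) - S_o(t-w+1)) / w."""
    return add_scaled(sf, add_scaled(newest, oldest, -1.0), 1.0 / w)


def estimate_f2(table):
    """Median over rows of k/(k-1) * sum_j T[i][j]^2 - sum(S)^2 / (k-1)."""
    k = len(table[0])
    total = sum(table[0])  # sum(S), assumed identical for every row
    return median(
        k / (k - 1) * sum(v * v for v in row) - total * total / (k - 1)
        for row in table
    )


def estimate(table, buckets):
    """Median over rows of (T[i][h_i(key)] - sum(S)/k) / (1 - 1/k)."""
    k = len(table[0])
    total = sum(table[0])
    return median(
        (row[b] - total / k) / (1 - 1 / k) for row, b in zip(table, buckets)
    )


def is_alarm(error_table, buckets, T):
    """Raise the alarm if Estimate(S_e, key) > T_A = T * sqrt(Estimate_F2(S_e))."""
    # max(..., 0.0) only guards against tiny negative rounding errors.
    t_a = T * max(estimate_f2(error_table), 0.0) ** 0.5
    return estimate(error_table, buckets) > t_a
```

The incremental step touches each counter once per interval instead of re-summing all w sketches, which is why the steady-state kernel needs only the newest and the oldest observed sketch in the window.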

The Bottleneck
Sketch-based Change Detection (cont.)
- The sketch processing (forecasting and detection) is based on a time interval $\Delta t$
  • Usually the interval is set in minutes, say 1 or 5 minutes
  • Kernels B and C run once per $\Delta t$
- The bottleneck is in the sketch update module

Sketch Update Performance
- For hashing and updating a 32-bit key
  • 15 cycles (2-universal)
  • 33 cycles (4-universal)
- This corresponds to 10.6 Gbps and 4.8 Gbps, respectively, for processing 40-byte packets with a system clock of 500 MHz
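
The throughput figures above can be reproduced from the cycle counts, assuming one key is hashed and updated per 40-byte packet and that updates run back to back at the 500 MHz clock. The helper name below is hypothetical:

```python
def throughput_gbps(cycles_per_packet, clock_hz=500e6, packet_bytes=40):
    """Sustained line rate when each packet costs a fixed number of cycles."""
    packets_per_second = clock_hz / cycles_per_packet
    return packets_per_second * packet_bytes * 8 / 1e9


two_universal = throughput_gbps(15)   # about 10.67 Gbps
four_universal = throughput_gbps(33)  # about 4.85 Gbps
```

These values match the 10.6 Gbps and 4.8 Gbps on the slide when truncated to one decimal place.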