Charging from Sampled Network Usage
Nick Duffield, Carsten Lund, Mikkel Thorup
AT&T Labs-Research, Florham Park, NJ

Do Charging and Sampling Mix?
• Usage-sensitive charging
  - charge based on sampled network usage
• Is sampling necessary?
  - just count all packets/bytes in the network?
  - measure and export statistics for all traffic flows?
• Is sampled usage reliable enough?
  - risk of overcharging or undercharging

Why usage-sensitive charging?
• Compare charging on port size
  - coarse granularity: OC3 ⇒ OC12 ⇒ OC48 ⇒ OC192
• Implicit resource management
  - price disincentive to greedy use
• Differentiated services
  - will require differentiated charges

Fine: count all packets/bytes in the network?
• Mirror pricing policy in router configuration?
  - separate counter for each billable packet stream
• Scaling/dimensionality issues
  - potentially many determinants of pricing
    - ToS, application type, source/destination IP address, …
  - routers must support a large number of counters
• Configuration issues
  - change pricing policy ⇒ reconfigure counters
    - administrative cost

IP Flow Abstraction
[Figure: flows 1 to 4]
• IP flow abstraction
  - set of packets identified with the "same" address, ports, etc.
  - packets that are "close" together in time
  - possible protocol-based flow demarcation
    - e.g. terminate on TCP FIN
• IP flow summaries
  - reports of measured flows from routers
    - flow identifiers, total packets/bytes, router state
• Several flow definitions in commercial use

Measure/Export All Traffic Flows?
• Measure traffic flows as they occur
  - export flow summaries to billing system
• Flow volumes
  - one OC48 ⇒ several GB of flow summaries per hour
• Cost
  - network resources for transmission
  - storage/processing at billing system

Flow Sampling?
• Sampling
  - statistician's reflex action to large datasets
• Export selected flows
  - reduce transmission/storage/processing costs
• Sufficiently accurate for pricing?
  - risk of overcharging (⇒ irate customers)
  - risk of undercharging (⇒ irate shareholders)

Packet Sampling and Flow Sampling
• Packet sampling
  - when router can't form flows at line rate
    - scaling at a single router
• Flow sampling
  - managing volume of flow statistics
    - scaling across downstream measurement infrastructure
• Complementary
  - could combine
    - e.g. 1 in N packet sampling + flow sampling

Usage Estimation
• Each flow i has
  - "size" $x_i$
    - bytes or packets
  - "color" $c_i$
    - combination of IP address, port, ToS, etc. that maps to a billable stream (= customer + billing class)
• Goal
  - to estimate total usage X(c) in each color c:
    $X(c) = \sum_{i:\, c_i = c} x_i$
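
As a minimal sketch of the definition of X(c), assuming each flow summary is reduced to a (size, color) pair and using made-up color labels, the per-color totals can be accumulated like this:

```python
from collections import defaultdict

def usage_by_color(flows):
    """Return {color: total size} for an iterable of (size, color) pairs."""
    totals = defaultdict(int)
    for size, color in flows:
        totals[color] += size
    return dict(totals)

# Hypothetical flows; the color strings are illustrative only.
flows = [(1500, "cust-A/gold"), (40, "cust-B/basic"), (500, "cust-A/gold")]
usage = usage_by_color(flows)
```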

Basic Ideas
• Match sampling method to flow characteristics
  - high fraction of traffic found in small fraction of long flows
    - sample long flows more frequently than short flows
      - large contributions to usage more reliably estimated
• Manage sampling error through charging scheme
  - make charging insensitive to small usage
    - sampling error for small usage not reflected in charge to user
• Trade-off
  - allow small consistent undercount to reduce risk of overcharge
• Show how to relate sampling and charging parameters
  - simple rules to achieve desired accuracy

Size-independent flow sampling is bad
• Sample 1 in N flows
  - estimate total bytes by N times sampled bytes
• Problem:
  - long flow lengths
    - estimate sensitive to inclusion or omission of a single large flow
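
The sensitivity can be illustrated with a small calculation. This is a sketch under an assumption not stated on the slide: "1 in N" is modeled as keeping each flow independently with probability 1/N, in which case the estimate N × (sampled bytes) has variance (N − 1) Σ x_i². The flow sizes are hypothetical.

```python
import math

def one_in_n_std_dev(sizes, n):
    """Std. dev. of N * (sampled bytes) when each flow is kept independently w.p. 1/N."""
    return math.sqrt((n - 1) * sum(x * x for x in sizes))

# Hypothetical mix: 999 short flows of 100 bytes and one long flow of 1,000,000 bytes.
sizes = [100] * 999 + [1_000_000]
rel_with_long_flow = one_in_n_std_dev(sizes, 10) / sum(sizes)
rel_short_only = one_in_n_std_dev(sizes[:-1], 10) / sum(sizes[:-1])
```

With the long flow present, the standard deviation is more than twice the true total; with only the short flows it is below 10% of the total.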

Size-dependent flow sampling
• Sample flow summary of size x with probability p(x)
• Estimate usage X by
    $X' = \sum_{\text{sampled flows}} \frac{x}{p(x)}$
  - boost up size x by factor 1/p(x) in estimate X'
    - compensates for the chance that the flow is not sampled
• Choose p(x) to be increasing in x
  - longer flows more likely to be sampled
  - compare size-independent sampling: p(x) = 1/N
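
A minimal sketch of the estimator X', assuming flows are sampled independently and that `p` is any function returning a probability in (0, 1] for the sizes it is given. The function name, the `rng` parameter, and the example sizes are illustrative, not from the slides.

```python
import random

def estimate_usage(sizes, p, rng=random):
    """Sample each flow with probability p(x); return the sum of x / p(x) over sampled flows."""
    total = 0.0
    for x in sizes:
        prob = p(x)
        if rng.random() < prob:
            total += x / prob  # boost the sampled size by 1 / p(x)
    return total

# Hypothetical flows and an increasing p(x) = min(1, x / 1000).
sizes = [100] * 50 + [5000]
estimate = estimate_usage(sizes, lambda x: min(1.0, x / 1000))
```

A single run gives one random estimate; only the average over many runs approaches the true total of 10,000.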

Statistical Properties
• Fixed set of flow sizes $\{x_1, x_2, \ldots, x_n\}$
  - we only consider randomness of sampling
• X' is an unbiased estimator of actual usage $X = \sum_i x_i$
  - $\mathrm{E}[X'] = X$: averaging over all possible samplings
  - holds for all probability functions p(x) with $p(x_i) > 0$
• Proof:
  - $X' = \sum_i w_i x_i / p(x_i)$
    - $w_i$ random variable: $w_i = 1$ with prob. $p(x_i)$, 0 otherwise
    - $\mathrm{E}[w_i] = p(x_i)$, hence $\mathrm{E}[X'] = \mathrm{E}\big[\sum_i w_i x_i / p(x_i)\big] = \sum_i x_i = X$
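
The unbiasedness claim can be checked numerically for a tiny example by enumerating every sampling outcome. This sketch assumes independent sampling decisions per flow; the sizes and the choice of p are hypothetical.

```python
from itertools import product

def exact_expectation(sizes, p):
    """Average the estimate X' over all 2^n sampling outcomes, weighted by their probability."""
    expectation = 0.0
    for outcome in product([0, 1], repeat=len(sizes)):
        prob = 1.0
        estimate = 0.0
        for w, x in zip(outcome, sizes):
            prob *= p(x) if w else 1.0 - p(x)
            if w:
                estimate += x / p(x)
        expectation += prob * estimate
    return expectation

sizes = [3, 40, 500]
expected = exact_expectation(sizes, lambda x: min(1.0, x / 100))
```

Up to floating-point rounding, `expected` equals the true total 543 for this and any other p with p(x) > 0.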

What is the best choice of p(x)?
• Trade-off accuracy vs. number of samples
• Express trade-off through cost function
  - cost = $\mathrm{Var}(X') + z^2 \times$ (average number of samples)
    - parameter z: relative importance of variance vs. number of samples
• Which choice of p(x) minimizes cost?
• $p_z(x) = \min\{1,\ x/z\}$
  - flows with size ≥ z: always selected
  - flows with size < z: selected with probability proportional to their size
  [Figure: plot of $p_z(x)$ against x, with ticks at x = z and $p_z(x) = 1$]
• Trade-off
  - smaller z: more samples, lower variance
  - larger z: fewer samples, higher variance
• Will call sampling with $p_z(x)$ "optimal"
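
The form of $p_z$ can be recovered with a short calculation. The sketch below assumes flows are sampled independently, so the variance and the expected number of samples both split into per-flow terms that can be minimized one at a time; $p_i$ is shorthand for $p(x_i)$.

```latex
\begin{align*}
\operatorname{Var}(X') &= \sum_i x_i^2\,\frac{1-p_i}{p_i}, \qquad
\mathrm{E}[\#\text{samples}] = \sum_i p_i,\\
\text{cost} &= \sum_i \Big[\, x_i^2\Big(\frac{1}{p_i}-1\Big) + z^2 p_i \Big],\\
\frac{\partial\,\text{cost}}{\partial p_i} &= -\frac{x_i^2}{p_i^2} + z^2 = 0
\;\Longrightarrow\; p_i = \frac{x_i}{z} \quad (x_i < z),\\
p_z(x) &= \min\{1,\ x/z\}.
\end{align*}
```

Each term is convex in $p_i$ on (0, 1]. For $x_i \ge z$ the derivative is never positive on that interval, so the minimum sits at the boundary $p_i = 1$, which gives the cap in $p_z$.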

Implementation
• Nearly as simple as 1 in N sampling
  - use flow size variability as source of randomness
    - no random number generators

    sample(x) {
        static count = 0          // bytes carried over from unselected small flows
        if (x >= z) {             // size at or above threshold: always select
            select_flow
        } else {
            count += x
            if (count >= z) {     // carried-over bytes have reached z
                count = count - z
                select_flow
            }
        }
    }
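
A runnable rendering of the sample(x) pseudocode above, as a sketch: the closure `make_sampler` is a hypothetical stand-in for the static counter, returning True plays the role of select_flow, and the comparisons use >= so that a flow of size exactly z is always selected, consistent with p_z(x) = min{1, x/z}.

```python
def make_sampler(z):
    """Return sample(x) -> bool implementing the counter-based threshold sampler."""
    count = 0  # bytes carried over from small flows that were not selected

    def sample(x):
        nonlocal count
        if x >= z:
            return True        # large flow: always selected
        count += x
        if count >= z:
            count -= z         # keep the remainder for later flows
            return True
        return False

    return sample

sample = make_sampler(z=1000)
selected = [x for x in [300, 300, 300, 300, 5000, 300] if sample(x)]
# Each selected flow is reported with renormalized size max(x, z).
estimate = sum(max(x, 1000) for x in selected)
```

Here the fourth 300-byte flow and the 5000-byte flow are selected, giving an estimate of 6000 bytes against an actual 6500; the 500 bytes still held in the counter account for the difference.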

Optimal Resampling
[Figure: flow summaries pass from Router to Aggregation Server to Billing System, with sampling thresholds $z_1$, $z_2$, $z_3$]
• Resampling to progressively thin flow summaries
• Finer resampling ($z_1 \le z_2 \le z_3$) preserves statistics
  - final flow stream at billing system has the same statistical properties as the original stream would have if sampled once with $z_3$
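
One way to see why resampling composes is to multiply the stage probabilities. This sketch assumes, beyond what the slide states, that each stage forwards a sampled flow with its renormalized size x / p_z(x) = max(x, z) and that the next stage samples on that renormalized size.

```python
def p(z, x):
    """Optimal sampling probability p_z(x) = min(1, x / z)."""
    return min(1.0, x / z)

def two_stage_probability(x, z1, z2):
    """Probability that a flow of size x survives sampling at z1, then resampling at z2."""
    renormalized = max(x, z1)  # size reported after the first stage (assumption)
    return p(z1, x) * p(z2, renormalized)

# With z1 <= z2, two stages select a flow exactly as often as one stage at z2.
checks = [(x, two_stage_probability(x, 100, 400), p(400, x)) for x in [1, 50, 100, 150, 400, 1000]]
```

For a flow below z1 the product is (x/z1)(z1/z2) = x/z2, and for larger flows the first stage is a no-op, so every pair in `checks` agrees.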

Optimal vs. size-independent sampling
• NetFlow traces
  - 1000's of cable users, 1 week
• Color flows by customer-side IP address c
• Compare
  - 1 in N sampling
  - optimal sampling
    - same average sampling rate
• Measure of accuracy
  - weighted mean relative error: $\dfrac{\sum_c |X'(c) - X(c)|}{\sum_c X(c)}$
• Heavy-tailed flow size distribution is our friend!
  - allows more accurate encoding of usage information
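
The accuracy measure can be computed as below. This is a sketch assuming per-color usage is held in dictionaries keyed by color; a color with no sampled flows is treated as having estimate 0. The numbers are made up.

```python
def weighted_mean_relative_error(actual, estimated):
    """Sum over colors of |X'(c) - X(c)|, divided by the total actual usage."""
    abs_error = sum(abs(estimated.get(c, 0.0) - x) for c, x in actual.items())
    return abs_error / sum(actual.values())

# Hypothetical actual and estimated usage for two colors.
actual = {"a": 100, "b": 900}
estimated = {"a": 150, "b": 880}
error = weighted_mean_relative_error(actual, estimated)  # (50 + 20) / 1000
```

Dividing by total usage rather than averaging per-color ratios means heavy users dominate the measure.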

Charging and Sampling Error
• Optimal sampling
  - no sampling error for flows larger than z
• Exploit in charging scheme
  - fixed charge for small usage
  - usage-sensitive charge only for usage above insensitivity level L
• Charge according to estimated usage
  - $f(X'(c)) = a + b \max\{L,\ X'(c)\}$
    - coefficients a, b and level L could depend on color c
• Only usage above L needs reliable estimation
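
The charge function is simple enough to state directly in code. The coefficient values below are hypothetical and only illustrate that every estimate at or below L yields the same charge.

```python
def charge(estimated_usage, a, b, level):
    """f(X') = a + b * max(L, X'): flat below the insensitivity level L, linear above it."""
    return a + b * max(level, estimated_usage)

# Hypothetical tariff: a = 10, b = 1e-6 per byte, L = 1e7 bytes.
small = charge(5e6, a=10.0, b=1e-6, level=1e7)
tiny = charge(0.0, a=10.0, b=1e-6, level=1e7)
heavy = charge(3e7, a=10.0, b=1e-6, level=1e7)
```

`small` and `tiny` are identical, so sampling error in low usage never reaches the bill; only `heavy` depends on the estimate.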

Accuracy and Parameter Choice
• Given target accuracy
  - relate sampling threshold z to level L
• Theorem
  - $\mathrm{Var}(X') \le zX$ (tight bound)
  - now assume $z \le \varepsilon^2 L$
    - Std.Dev.(X') ≤ εX if X ≥ L
      - bounds sampling error of estimated usage > L
    - Std.Dev.(f(X')) ≤ ε f(X)
      - bounds error of charge based on estimated usage
• Bounds hold for any flow sizes $\{x_i\}$
  - no assumption on flow size distribution
    - just choose $z \le \varepsilon^2 L$
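
The first two bounds follow in a few lines. The sketch below assumes independent sampling with $p_z(x) = \min\{1, x/z\}$, so only flows smaller than z contribute to the variance.

```latex
\begin{align*}
\operatorname{Var}(X') &= \sum_{i:\,x_i<z} x_i^2\Big(\frac{z}{x_i}-1\Big)
 = \sum_{i:\,x_i<z} x_i\,(z-x_i) \;\le\; z\sum_i x_i = zX,\\
z \le \varepsilon^2 L \le \varepsilon^2 X
 &\;\Longrightarrow\; \operatorname{Var}(X') \le \varepsilon^2 X^2
 \;\Longrightarrow\; \sqrt{\operatorname{Var}(X')} \le \varepsilon X .
\end{align*}
```

The second line uses X ≥ L, which is why the guarantee is stated only for usage at or above the insensitivity level.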

Example
• Target parameters
  - $L = 10^7$, ε = 10% ⇒ $z = 10^5$
• Scatter plot
  - ratio estimated/actual usage vs. actual usage
    - each color c
  - observe better estimation of higher usage
• Want to avoid
  - ratio > 1 + ε = 1.1 and usage > $L = 10^7$
• Less than 1 in 1000 "bad" points

Compensating variance for mean
• Aim:
  - reduce chance of overestimating usage
• Method:
  - theorem gave bound: $\mathrm{Var}(X') \le zX$
  - anticipate upward variations in X' by subtracting off multiples of std. dev.
    - charge according to $X'_s = X' - s\sqrt{zX'}$
  - again: no assumptions on flow size distribution
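
A minimal sketch of the adjusted estimate, using the example values L = 10^7 and z = 10^5 from the earlier slide; the function name is illustrative.

```python
import math

def adjusted_usage(estimate, s, z):
    """X'_s = X' - s * sqrt(z * X'): subtract s estimated standard deviations."""
    return estimate - s * math.sqrt(z * estimate)

# At X' = 1e7 and z = 1e5 the estimated std. dev. is sqrt(1e12) = 1e6.
one_sigma = adjusted_usage(1e7, s=1, z=1e5)
two_sigma = adjusted_usage(1e7, s=2, z=1e5)
```

Note that the result is negative whenever X' < s²z, which is harmless if the charge is flat below L as in the charging scheme above.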

Example: s=1
• Scatter pushed down:
  - no points with ratio > 1.1 and usage > $10^7$
• Drawback
  - more unbillable usage
    - when $X'_s < X$
• Small unbillable usage for heavy users
  - ratio → 1
  - Std.Dev.(X')/X' vanishes as X grows

Example: s=2
• Scatter pushed down further:
  - no points with ratio > 1
• Trade-off
  - unbillable usage vs. overestimation

    s    unbillable bytes    X'_s > X?
    0    -0.1%               50%
    1     3.1%                3%
    2     6.2%                0%

How to reduce unbillable usage?
• Make sampling more accurate
  - reduce z!
• For unbillable fraction < η
  - choose $s^2 z \le \eta^2 L$
• Example:
  - s = 2, η = 10%
  - reduce z
    - from $10^5$ to $10^4$
• Alternative
  - increase coefficient a in charge f(X) to cover costs
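
The rule relating s, z, η and L can be recovered from the adjusted estimate $X'_s = X' - s\sqrt{zX'}$. The sketch below assumes the unbillable fraction is measured relative to X' and takes $L = 10^7$ from the earlier example.

```latex
\begin{align*}
\frac{X' - X'_s}{X'} &= \frac{s\sqrt{zX'}}{X'} = s\sqrt{\frac{z}{X'}}
 \;\le\; s\sqrt{\frac{z}{L}} \quad (X' \ge L),\\
s\sqrt{z/L} \le \eta &\iff s^2 z \le \eta^2 L,\\
s = 2,\ \eta = 0.1,\ L = 10^7 &\;\Longrightarrow\; z \le \frac{0.01 \times 10^7}{4} = 2.5\times 10^4 .
\end{align*}
```

The slide's choice of $z = 10^4$ sits inside this bound.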

Tension between accuracy and volume
• Want to reduce z
  - better accuracy, less unbillable usage
• Drawback
  - increased sample volume
• Solution
  - make billing period longer instead
    - usage roughly proportional to billing period
    - allows increased charge insensitivity level L
  - sample production rate controlled by threshold z
    - rate $r \sum_x f(x)\, p_z(x)$
      - flow arrival rate r, fraction f(x) of flows of size x
• Need only $z = \varepsilon^2 L$
  - larger L allows smaller error ε for given z
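
The sample production rate can be evaluated for a given flow size distribution. This sketch assumes the distribution is given as a dictionary mapping size x to fraction f(x); the arrival rate and the distribution below are hypothetical.

```python
def sample_rate(arrival_rate, size_fractions, z):
    """r * sum over x of f(x) * p_z(x), with p_z(x) = min(1, x / z)."""
    return arrival_rate * sum(f * min(1.0, x / z) for x, f in size_fractions.items())

# Hypothetical distribution: 90% of flows are 100 bytes, 9% are 10 kB, 1% are 1 MB.
fractions = {100: 0.90, 10_000: 0.09, 1_000_000: 0.01}
rate_coarse = sample_rate(1000, fractions, z=1e5)  # samples per second at z = 1e5
rate_fine = sample_rate(1000, fractions, z=1e4)    # samples per second at z = 1e4
```

For this distribution, cutting z by a factor of 10 raises the sample rate from about 19.9 to about 109 per second, which is the volume cost that a longer billing period avoids.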