Introduction Metrics and Review of Basic Statistics


CS 239: Experimental Methodologies for System Software. Peter Reiher. Topics: Metrics; Why are we talking about statistics?; Important statistics concepts; Indices of central tendency.


  1. Metrics and Review of Basic Statistics
     CS 239: Experimental Methodologies for System Software
     Peter Reiher
     April 5, 2007
     Lecture 2, Page 1, CS 239, Spring 2007

     Introduction
     • Metrics
     • Why are we talking about statistics?
     • Important statistics concepts
     • Indices of central tendency
     • Summarizing variability

     Metrics
     • A metric is a measurable quantity
     • For our purposes, one whose value describes an important phenomenon
     • Most of performance evaluation is about properly gathering metrics

     Common Types of Metrics
     • Duration/response time
       – How long did the simulation run?
     • Processing rate
       – How many transactions per second?
     • Resource consumption
       – How much disk is currently used?
     • Error rates
       – How often did the system crash?
     • What metrics can we use to describe security?

     Some Measures of Response Time
     • Response time: request-response interval
       – Measured from end of request
       – Ambiguous: beginning or end of response?
     • Reaction time: end of request to start of processing
     • Turnaround time: start of request to end of response

     Examples of Response Time
     • Time from keystroke to echo on screen
     • End-to-end packet delay in networks
     • OS bootstrap time
     • Leaving UCLA to getting on 405
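The measures listed under "Some Measures of Response Time" differ only in which pair of timestamps they subtract. A minimal Python sketch of that idea follows; the `RequestTimes` record and its field names are hypothetical, introduced here for illustration and not taken from the lecture.

```python
from dataclasses import dataclass


@dataclass
class RequestTimes:
    """Hypothetical timestamps (in seconds) for one request."""
    request_start: float
    request_end: float
    processing_start: float
    response_start: float
    response_end: float


def reaction_time(t: RequestTimes) -> float:
    # End of request to start of processing.
    return t.processing_start - t.request_end


def response_time(t: RequestTimes, to_end_of_response: bool = False) -> float:
    # Measured from end of request; the flag makes the
    # "beginning or end of response?" ambiguity explicit.
    end = t.response_end if to_end_of_response else t.response_start
    return end - t.request_end


def turnaround_time(t: RequestTimes) -> float:
    # Start of request to end of response.
    return t.response_end - t.request_start


# Made-up timestamps for a single request.
sample = RequestTimes(0.0, 1.0, 1.5, 4.0, 6.0)
print(reaction_time(sample), response_time(sample),
      response_time(sample, True), turnaround_time(sample))
```

Forcing the caller to choose between the start and the end of the response is one way to keep reported response times comparable across experiments.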

  2. Processing Rate
     • How much work is done per unit time?
     • Important for:
       – Provisioning systems
       – Comparing alternative configurations
       – Multimedia

     Examples of Processing Rate
     • Bank transactions per hour
     • Packets routed per second
     • Web pages crawled per night

     Common Measures of Processing Rate
     • Throughput: requests per unit time: MIPS, MFLOPS, Mb/s, TPS
     • Nominal capacity: theoretical maximum: bandwidth
     • Knee capacity: where things go bad
     • Usable capacity: where response time hits a specified limit
     • Efficiency: ratio of usable to nominal capacity

     Nominal, Knee, and Usable Capacities
     [Figure: delay versus load, marking the knee and knee capacity, the usable capacity at the response-time limit, and the nominal capacity]

     Resource Consumption
     • How much does the work cost?
     • Used in:
       – Capacity planning
       – Identifying bottlenecks
     • Also helps to identify “next” bottleneck

     Examples of Resource Consumption
     • CPU non-idle time
     • Memory usage
     • Fraction of network bandwidth needed
     • How much of your salary is paid for rent
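Usable capacity and efficiency, as defined under "Common Measures of Processing Rate", can be read off a measured load curve. The sketch below is one possible reading, assuming response time does not decrease as load grows; the function names, the measurements, and the nominal capacity of 800 are all made up for illustration.

```python
def usable_capacity(curve, response_time_limit):
    """Highest measured load whose response time stays within the limit.

    `curve` is a list of (load, response_time) pairs. Returns None if
    no measured load meets the limit.
    """
    ok = [load for load, rt in curve if rt <= response_time_limit]
    return max(ok) if ok else None


def efficiency(usable, nominal):
    # Ratio of usable to nominal capacity.
    return usable / nominal


# Made-up measurements: (requests per second, response time in ms).
curve = [(100, 10), (200, 12), (300, 15), (400, 25), (500, 60), (600, 200)]
nominal = 800  # assumed theoretical maximum, requests per second

usable = usable_capacity(curve, response_time_limit=50)
print(usable, efficiency(usable, nominal))
```

The sketch does not locate the knee, because "where things go bad" needs a more precise definition before it can be computed.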

  3. Measures of Resource Consumption
     • Utilization: $\int_0^t u(t)\,dt$, where $u(t)$ is instantaneous resource usage
       – Useful for memory, disk, etc.
     • If $u(t)$ is always either 1 or 0, reduces to busy time or its inverse, idle time
       – Useful for network, CPU, etc.

     Error Metrics
     • Failure rates
     • Probability of failures
     • Time to failure

     Examples of Error Metrics
     • Percentage of dropped Internet packets
     • ATM down time
     • Lifetime of a component
     • Wrong answers from IRS tax preparation hotline

     Measures of Errors
     • Reliability: P(error) or Mean Time Between Errors (MTBE)
     • Availability:
       – Downtime: Time when system is unavailable, may be measured as Mean Time to Repair (MTTR)
       – Uptime: Inverse of downtime, often given as Mean Time Between Failures (MTBF/MTTF)

     Security Metrics
     • A difficult problem
     • Often no good metrics to express security goals and achievements
       – Equally bad, some definable metrics are impossible to measure
     • Some failure metrics are applicable
       – Expected time to break a cipher

     Choosing What to Measure
     • Core question in any performance study
     • Pick metrics based on:
       – Completeness
       – (Non-)redundancy
       – Variability
       – Feasibility
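The utilization integral under "Measures of Resource Consumption" becomes a plain sum when the usage trace is piecewise constant. A minimal sketch under that assumption follows; the trace is made up, and dividing by elapsed time to get a busy fraction is an addition here, not something the slide states.

```python
def utilization_integral(segments):
    """Integrate a piecewise-constant u(t).

    `segments` is a list of (duration, usage) pairs, where usage is the
    instantaneous resource usage u(t) held for that duration.
    """
    return sum(duration * usage for duration, usage in segments)


def total_time(segments):
    # Total elapsed time covered by the trace.
    return sum(duration for duration, _ in segments)


# Made-up CPU trace: usage is 1 (busy) or 0 (idle), durations in seconds.
cpu = [(2.0, 1), (1.0, 0), (3.0, 1), (4.0, 0)]

busy = utilization_integral(cpu)        # with 0/1 usage this is busy time
idle = total_time(cpu) - busy           # the complement, idle time
fraction_busy = busy / total_time(cpu)  # normalized form (an assumption)
print(busy, idle, fraction_busy)
```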

  4. Completeness
     • Must cover everything relevant to problem
       – Don’t want awkward questions at conferences!
     • Difficult to guess everything a priori
       – Often have to add things later

     Redundancy
     • Some factors are functions of others
     • Measurements are expensive
     • Look for minimal set
     • Again, often an interactive process

     Variability
     • Large variance in a measurement makes decisions impossible
     • Repeated experiments can reduce variance
       – Very expensive
       – Can only reduce it by a certain amount
     • Better to choose low-variance measures to start with

     Feasibility
     • Some things are easy to measure
     • Others are hard
     • A few are impossible
     • Choose metrics you can actually measure
     • But beware of the “drunk under the streetlamp” phenomenon

     Variability and Performance Measurements
     • Performance of a system is often complex
       – Perhaps not fully explainable
     • One result is variability in most metric readings
     • Good performance measurement takes this into account

     An Example
     • 10 pings from UCLA to MIT Tuesday night
     • Each took a different amount of time (expressed in msec):
       84.0  84.9  84.5  84.3  84.5
       84.5  84.8  86.8  84.1  84.5
     • How do we understand what this says about how long a packet takes to get from LA to Boston?
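One way to start answering the question posed by the ping example is to summarize the ten readings with a few standard statistics. The sketch below uses Python's `statistics` module on the values from the slide. Dropping the slowest ping to show its influence on the mean is an illustration added here, not a step the lecture prescribes.

```python
import statistics

# The ten round-trip times from the ping example, in msec.
pings = [84.0, 84.9, 84.5, 84.3, 84.5, 84.5, 84.8, 86.8, 84.1, 84.5]

mean = statistics.mean(pings)
median = statistics.median(pings)
stdev = statistics.stdev(pings)  # sample standard deviation

# Effect of the single slow ping (86.8) on the mean.
without_slowest = sorted(pings)[:-1]
mean_trimmed = statistics.mean(without_slowest)

print(f"mean={mean:.2f} median={median:.2f} stdev={stdev:.2f}")
print(f"mean without slowest ping={mean_trimmed:.2f}")
```

The single 86.8 msec reading raises the mean but leaves the median unchanged, so the two statistics summarize this data set differently.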

  5. How to Get a Handle on Variability?
     • If something we’re trying to measure varies from run to run, how do we express its behavior?
     • That’s what statistics is all about
     • Which is why a good performance analyst needs to understand them

     Some Basic Statistics Concepts
     • Independence of events
     • Random variables
     • Cumulative distribution functions (CDFs)

     Independent Events
     • Events are independent if:
       – Occurrence of one event doesn’t affect probability of other
     • Examples:
       – Coin flips
       – Inputs from separate users
       – “Unrelated” traffic accidents

     Non-Independent Events
     • Not all events are independent
     • Second person accessing a web page might get it faster than the first
       – Or than someone asking for it the next day
     • Kids requesting money from their parents
       – Sooner or later the wallet is empty

     Random Variables
     • Variable that takes values probabilistically
       – Not necessarily just any value, though
     • Variable usually denoted by capital letters, particular values by lowercase
     • Examples:
       – Number shown on dice
       – Network delay
       – CS239 attendance

     Cumulative Distribution Function (CDF)
     • Maps a value $a$ of random variable $x$ to probability that the outcome is less than or equal to $a$:
       $F_x(a) = P(x \le a)$
     • Valid for discrete and continuous variables
     • Monotonically increasing
     • Easy to specify, calculate, measure
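The CDF definition above, $F_x(a) = P(x \le a)$, can be estimated directly from measurements by counting the fraction of samples at or below $a$. A minimal sketch of such an empirical CDF follows; the helper name `empirical_cdf` and the die rolls are made up for illustration.

```python
from bisect import bisect_right


def empirical_cdf(samples):
    """Return a function F where F(a) is the fraction of samples <= a."""
    ordered = sorted(samples)
    n = len(ordered)

    def F(a):
        # bisect_right counts how many sorted samples are <= a.
        return bisect_right(ordered, a) / n

    return F


# Outcomes of six made-up die rolls (a discrete random variable).
rolls = [1, 2, 2, 3, 5, 6]
F = empirical_cdf(rolls)
print([F(a) for a in range(0, 8)])
```

The printed values never decrease as `a` grows, which is the monotonicity property listed on the slide.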

  6. CDF Examples
     • Coin flip (T = 1, H = 2):
       [Figure: CDF of a coin flip, vertical axis 0 to 1, horizontal axis 0 to 3]
     • Exponential packet interarrival times:
       [Figure: CDF of exponential interarrival times, vertical axis 0 to 1, horizontal axis 0 to 4]

     Probability Density Function (pdf)
     • A “relative” of CDF
     • Derivative of (continuous) CDF:
       $f(x) = \frac{dF(x)}{dx}$
     • Useful to find probability of a range:
       $P(x_1 < x \le x_2) = F(x_2) - F(x_1) = \int_{x_1}^{x_2} f(x)\,dx$

     Examples of pdf
     • Exponential interarrival times:
       [Figure: pdf of exponential interarrival times, vertical axis 0 to 1, horizontal axis 0 to 3]
     • Gaussian (normal) distribution:
       [Figure: pdf of a Gaussian distribution, vertical axis 0 to 1, horizontal axis 0 to 3]

     Probability Mass Function (pmf)
     • PDF doesn’t exist for discrete random variables
       – Because their CDF is not differentiable
     • pmf instead: $f(x_i) = p_i$, where $p_i$ is the probability that $x$ will take on value $x_i$
       $P(x_1 < x \le x_2) = F(x_2) - F(x_1) = \sum_{x_1 < x_i \le x_2} p_i$

     Examples of pmf
     • Coin flip:
       [Figure: pmf of a coin flip, vertical axis 0 to 1, horizontal axis 0 to 3]
     • Typical CS grad class size:
       [Figure: pmf of class size, vertical axis 0 to 0.5, horizontal axis 4 to 11]

     Summarizing Data With a Single Number
     • Most condensed form of presentation of set of data
     • Usually called the average
       – Average isn’t necessarily the mean
     • More formal term is index of central tendency
     • Must be representative of a major part of the data set
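The relation between pdf and CDF for a range, $P(x_1 < x \le x_2) = F(x_2) - F(x_1) = \int_{x_1}^{x_2} f(x)\,dx$, can be checked numerically for the exponential interarrival example. The sketch below assumes a rate of 1.0 and an arbitrary interval; both are illustrative choices, and the midpoint rule is just one simple way to approximate the integral.

```python
import math


def exp_cdf(x, rate):
    # CDF of an exponential distribution: F(x) = 1 - e^(-rate * x) for x >= 0.
    return 1.0 - math.exp(-rate * x) if x >= 0 else 0.0


def exp_pdf(x, rate):
    # pdf, the derivative of the CDF: f(x) = rate * e^(-rate * x) for x >= 0.
    return rate * math.exp(-rate * x) if x >= 0 else 0.0


def integrate(f, a, b, steps=10_000):
    # Midpoint-rule numerical integration of f over [a, b].
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))


rate = 1.0          # assumed arrival rate, packets per time unit
x1, x2 = 0.5, 2.0   # arbitrary interval for the check

by_cdf = exp_cdf(x2, rate) - exp_cdf(x1, rate)
by_pdf = integrate(lambda x: exp_pdf(x, rate), x1, x2)
print(by_cdf, by_pdf)
```

The two printed numbers should agree to several decimal places, with the small difference coming from the numerical integration.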
