CS 147: Computer Systems Performance Analysis
Summarizing Data
2015-06-15

Overview
“Standard” Indices of Central Tendency
  Definitions
  Characteristics
  Selecting an Index
Other Indices
  Geometric Mean
  Harmonic Mean
Dealing with Ratios
  Case 1: Two Physical Meanings
  Case 1a: Constant Denominator
  Case 1b: Constant Numerator
  Case 2: Multiplicative Relationship

Summarizing Data With a Single Number
◮ Most condensed form of presentation of set of data
◮ Usually called the average
◮ Average isn’t necessarily the mean
◮ Must be representative of a major part of the data set

Indices of Central Tendency
◮ Mean
◮ Median
◮ Mode
◮ All specify center of location of distribution of observations in sample

Sample Mean
◮ Take sum of all observations
◮ Divide by number of observations
◮ More affected by outliers than median or mode
◮ Mean is a linear property
  ◮ Mean of sum is sum of means
  ◮ Not true for median and mode

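The definition and two of the properties above can be checked with a minimal Python sketch. The function name `sample_mean` and all measurement values are made up for illustration; they are not from the slides.

```python
def sample_mean(xs):
    """Sum of all observations divided by the number of observations."""
    if not xs:
        raise ValueError("sample_mean requires at least one observation")
    return sum(xs) / len(xs)


# Hypothetical per-request times in milliseconds (illustrative values only).
cpu_ms = [10.0, 12.0, 11.0, 13.0]
io_ms = [5.0, 7.0, 6.0, 8.0]

# Linearity: the mean of the per-request sums equals the sum of the means.
total_ms = [c + i for c, i in zip(cpu_ms, io_ms)]
mean_of_sum = sample_mean(total_ms)
sum_of_means = sample_mean(cpu_ms) + sample_mean(io_ms)

# Sensitivity to outliers: one extreme observation shifts the mean a lot.
with_outlier = cpu_ms + [100.0]
mean_with_outlier = sample_mean(with_outlier)
```

With these values the mean of the sums and the sum of the means are both 18.0, while adding the single 100 ms observation moves the CPU mean from 11.5 to about 29.2.
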
Sample Median
◮ Sort observations
◮ Take observation in middle of series
  ◮ If even number, split the difference
◮ More resistant to outliers
  ◮ But not all points given “equal weight”

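The steps above can be sketched in Python as follows. The name `sample_median` and the latency values are hypothetical, chosen only to show the even-count case and the resistance to a single outlier.

```python
def sample_median(xs):
    """Sort the observations and take the middle one; when the count is
    even, average the two middle observations ("split the difference")."""
    if not xs:
        raise ValueError("sample_median requires at least one observation")
    ordered = sorted(xs)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


# Hypothetical latencies in milliseconds (illustrative values only).
latencies_ms = [10.0, 12.0, 11.0, 13.0]
median_even = sample_median(latencies_ms)

# The same sample with one extreme observation appended.
median_with_outlier = sample_median(latencies_ms + [100.0])
```

Here the median moves only from 11.5 to 12.0 when the 100 ms observation is added, because only the rank of the outlier matters, not its magnitude.
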
Sample Mode
◮ Plot histogram of observations
  ◮ Using existing categories
  ◮ Or dividing ranges into buckets
  ◮ Or using kernel density estimation
◮ Choose midpoint of bucket where histogram peaks
◮ For categorical variables, the most frequently occurring
◮ Effectively ignores much of the sample

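A rough Python sketch of the bucket-based and categorical variants follows. It assumes fixed-width buckets and breaks ties by taking the lowest bucket; both choices, the function names, and the data are illustrative assumptions rather than anything the slides prescribe. Kernel density estimation is not shown.

```python
from collections import Counter


def histogram_mode(xs, bucket_width):
    """Midpoint of the fixed-width bucket holding the most observations.

    Ties are broken in favor of the lowest bucket (an arbitrary choice).
    """
    if not xs:
        raise ValueError("histogram_mode requires at least one observation")
    counts = Counter(int(x // bucket_width) for x in xs)
    peak_bucket, _ = max(counts.items(), key=lambda kv: (kv[1], -kv[0]))
    return (peak_bucket + 0.5) * bucket_width


def categorical_mode(values):
    """Most frequently occurring category."""
    return Counter(values).most_common(1)[0][0]


# Hypothetical observations (illustrative values only).
service_times = [1.2, 2.5, 2.7, 2.9, 3.1, 7.8]
peak = histogram_mode(service_times, bucket_width=1.0)

busiest = categorical_mode(["disk", "cpu", "disk", "net"])
```

Three of the six service times fall in the bucket from 2.0 to 3.0, so the reported mode is its midpoint, 2.5; the other observations have no influence on the result.
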
Characteristics of Mean, Median, and Mode
◮ Mean and median always exist and are unique
◮ Mode may or may not exist
  ◮ If there is a mode, may be more than one
◮ Mean, median and mode may be identical
  ◮ Or may all be different
  ◮ Or some may be the same

Mean, Median, and Mode Identical
[Figure: pdf f(x) plotted against x, with the mean, median, and mode marked at the same point]

Median, Mean, and Mode All Different
[Figure: pdf f(x) plotted against x, with the mean, median, and mode marked at three different points]

So, Which Should I Use?
◮ Depends on characteristics of the metric
◮ If data is categorical, use mode
◮ If a total of all observations makes sense, use mean
◮ If not (e.g., ratios), and distribution is skewed, use median
◮ Otherwise, use mean
... but think about what you’re choosing

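The decision rules above can be written as a small Python function. The name `choose_index` and its three boolean parameters are hypothetical; deciding whether a metric is categorical, totalable, or skewed is still up to the analyst.

```python
def choose_index(is_categorical, total_makes_sense, is_skewed):
    """Apply the selection rules in order and return the suggested index."""
    if is_categorical:
        return "mode"
    if total_makes_sense:
        return "mean"
    if is_skewed:
        return "median"
    return "mean"


# A categorical metric always maps to the mode.
for_categories = choose_index(True, False, False)

# A ratio-like metric with a skewed distribution maps to the median.
for_skewed_ratios = choose_index(False, False, True)
```

The function only encodes the ordering of the rules; it is a starting point, not a substitute for thinking about what the chosen number will be used for.
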
Some Examples
◮ Most-used resource in system
  ◮ Mode
◮ Interarrival times
  ◮ Mean
◮ Load
  ◮ Median

Don’t Always Use the Mean
◮ Means are often overused and misused
  ◮ Means of significantly different values
  ◮ Means of highly skewed distributions
  ◮ Multiplying means to get mean of a product
    ◮ Only works for independent variables
  ◮ Errors in taking ratios of means
  ◮ Means of categorical variables

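Two of the pitfalls above are easy to see numerically. The Python sketch below uses made-up data in which `y` depends completely on `x`, and a second made-up pair of numerator and denominator samples; all names and values are illustrative assumptions.

```python
def mean(xs):
    """Arithmetic mean of a non-empty list."""
    return sum(xs) / len(xs)


# Hypothetical paired observations where y = 2 * x (fully dependent).
x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 4.0, 6.0, 8.0]

mean_of_product = mean([a * b for a, b in zip(x, y)])
product_of_means = mean(x) * mean(y)

# Hypothetical numerators and denominators of two measured ratios.
numerators = [1.0, 6.0]
denominators = [2.0, 4.0]

mean_of_ratios = mean([n / d for n, d in zip(numerators, denominators)])
ratio_of_means = mean(numerators) / mean(denominators)
```

For the dependent pair, the mean of the product is 15.0 but the product of the means is 12.5. For the ratios, the mean of the ratios is 1.0 while the ratio of the means is about 1.17, so the two summaries are not interchangeable.
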
Geometric Means
◮ An alternative to the arithmetic mean
\[ \dot{x} = \left( \prod_{i=1}^{n} x_i \right)^{1/n} \]
◮ Use geometric mean if product of observations makes sense

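A minimal Python sketch of the geometric mean follows. It computes the n-th root of the product through logarithms, which is mathematically equivalent and is an implementation choice made here to avoid overflow on long products. The name `geometric_mean` and the restriction to strictly positive inputs are assumptions of this sketch.

```python
import math


def geometric_mean(xs):
    """n-th root of the product of n strictly positive observations,
    computed as exp(mean(log(x))) for numerical stability."""
    if not xs:
        raise ValueError("geometric_mean requires at least one observation")
    if any(x <= 0 for x in xs):
        raise ValueError("geometric_mean requires strictly positive values")
    return math.exp(sum(math.log(x) for x in xs) / len(xs))


# Hypothetical multiplicative factors (illustrative values only).
factors = [2.0, 8.0]
gm = geometric_mean(factors)
```

For the factors 2 and 8 the product is 16, whose square root is 4, so `gm` is 4 up to floating-point rounding; the arithmetic mean of the same values would be 5.
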
Good Places To Use Geometric Mean
◮ Layered architectures
◮ Performance improvements over successive versions
◮ Average error rate on multihop network path
◮ Year-to-year interest rates

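As an illustration of the successive-versions case, the Python sketch below assumes three hypothetical releases with made-up speedup factors. It checks that compounding the geometric mean reproduces the overall speedup, while compounding the arithmetic mean overstates it.

```python
import math

# Hypothetical speedup factors of three successive releases.
speedups = [1.10, 1.20, 1.50]
n = len(speedups)

# Overall speedup is the product of the per-release factors.
total_speedup = math.prod(speedups)

# Average per-release factor, geometric and arithmetic.
per_release_geometric = total_speedup ** (1.0 / n)
per_release_arithmetic = sum(speedups) / n

# Compound each average over all releases.
compounded_geometric = per_release_geometric ** n
compounded_arithmetic = per_release_arithmetic ** n
```

Because the factors multiply, only the geometric mean gives a per-release figure that is consistent with the total of about 1.98; `math.prod` requires Python 3.8 or later.
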
Harmonic Mean
◮ Harmonic mean of sample $\{x_1, x_2, \ldots, x_n\}$ is
\[ \ddot{x} = \frac{n}{1/x_1 + 1/x_2 + \cdots + 1/x_n} \]
◮ Use when arithmetic mean of $1/x_i$ is sensible

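A minimal Python sketch of the harmonic mean follows, with a hypothetical example where the arithmetic mean of the reciprocals is the sensible quantity: two equal-size files transferred at different rates. The function name, file size, and rates are made up for illustration.

```python
def harmonic_mean(xs):
    """n divided by the sum of reciprocals of n strictly positive values."""
    if not xs:
        raise ValueError("harmonic_mean requires at least one observation")
    if any(x <= 0 for x in xs):
        raise ValueError("harmonic_mean requires strictly positive values")
    return len(xs) / sum(1.0 / x for x in xs)


# Hypothetical transfer rates (MB/s) for two files of equal size.
rates = [30.0, 60.0]
hm = harmonic_mean(rates)

# Cross-check: total data moved divided by total time taken.
file_mb = 120.0
total_time_s = sum(file_mb / r for r in rates)
overall_rate = (len(rates) * file_mb) / total_time_s
```

The harmonic mean of 30 and 60 is 40, which matches the overall rate of 240 MB moved in 6 seconds; the arithmetic mean, 45, would overstate it.
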