SLIDE 1

CS 147: Computer Systems Performance Analysis

Summarizing Variability and Determining Distributions


2015-06-15

SLIDE 2

Overview

Introduction
Indices of Dispersion
    Range
    Variance, Standard Deviation, C.V.
    Quantiles
    Miscellaneous Measures
    Choosing a Measure
Identifying Distributions
    Histograms
    Kernel Density Estimation
    Quantile-Quantile Plots
Statistics of Samples
    Meaning of a Sample
    Guessing the True Value

SLIDE 3

Introduction

Summarizing Variability

◮ A single number rarely tells the entire story of a data set
◮ Usually, you need to know how much the rest of the data set varies from that index of central tendency

SLIDE 4

Introduction

Why Is Variability Important?

◮ Consider two Web servers:
  ◮ Server A services all requests in 1 second
  ◮ Server B services 90% of all requests in 0.5 seconds
    ◮ But 10% in 5.5 seconds
  ◮ Both have mean service times of 1 second (for B: 0.9 × 0.5 + 0.1 × 5.5 = 1)
◮ But which would you prefer to use?
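A minimal sketch of the two-server comparison, assuming 100 requests per server and assuming the slow requests on server B take 5.5 seconds (the value that makes B's mean come out to 1 second). The list names and the sample size are illustrative, not part of the slides.

```python
import statistics

# Hypothetical samples of 100 requests each, matching the stated percentages.
server_a = [1.0] * 100                 # every request takes 1 second
server_b = [0.5] * 90 + [5.5] * 10     # 90% fast, 10% slow

mean_a = statistics.fmean(server_a)
mean_b = statistics.fmean(server_b)

# Population standard deviation: the lists are treated as the whole workload.
sd_a = statistics.pstdev(server_a)
sd_b = statistics.pstdev(server_b)

print(f"A: mean={mean_a:.2f} s, std dev={sd_a:.2f} s")
print(f"B: mean={mean_b:.2f} s, std dev={sd_b:.2f} s")
```

The means are identical, so only a measure of dispersion distinguishes the two servers.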

SLIDE 5

Introduction

Indices of Dispersion

◮ Measures of how much a data set varies
  ◮ Range
  ◮ Variance and standard deviation
  ◮ Percentiles
  ◮ Semi-interquartile range
  ◮ Mean absolute deviation

SLIDE 6

Indices of Dispersion Range

Range

◮ Minimum & maximum values in data set
◮ Can be tracked as data values arrive
◮ Variability characterized by difference between minimum and maximum
◮ Often not useful, due to outliers
  ◮ Minimum tends to go to zero
  ◮ Maximum tends to increase over time
◮ Not useful for unbounded variables

SLIDE 7

Indices of Dispersion Range

Example of Range

◮ For data set 2, 5.4, -17, 2056, 445, -4.8, 84.3, 92, 27, -10
  ◮ Maximum is 2056
  ◮ Minimum is -17
  ◮ Range is 2073
  ◮ While arithmetic mean is 268
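Because the range needs only the running minimum and maximum, it can be maintained as values arrive without storing them. The sketch below illustrates this on the example data set; the `RangeTracker` class is a hypothetical helper, not something defined in the course material.

```python
import math


class RangeTracker:
    """Tracks minimum and maximum as values arrive, without storing them."""

    def __init__(self):
        self.minimum = math.inf
        self.maximum = -math.inf

    def add(self, value):
        # Each new observation can only widen the range, never narrow it.
        self.minimum = min(self.minimum, value)
        self.maximum = max(self.maximum, value)

    @property
    def range(self):
        return self.maximum - self.minimum


data = [2, 5.4, -17, 2056, 445, -4.8, 84.3, 92, 27, -10]
tracker = RangeTracker()
for x in data:
    tracker.add(x)

print(tracker.minimum, tracker.maximum, tracker.range)
```

The fact that each observation can only widen the range is also why the maximum tends to grow the longer a system is measured.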

SLIDE 8

Indices of Dispersion Variance, Standard Deviation, C.V.

Variance (and Its Cousins)

◮ Sample variance is

\[ s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2 \]

◮ Expressed in units of the measured quantity, squared
  ◮ Which isn’t always easy to understand
◮ Standard deviation and coefficient of variation are derived from variance

SLIDE 9

Indices of Dispersion Variance, Standard Deviation, C.V.

Variance Example

◮ For data set 2, 5.4, -17, 2056, 445, -4.8, 84.3, 92, 27, -10
◮ Variance is 413746.6
◮ You can see the problem with variance:
  ◮ Given a mean of 268, what does that variance indicate?
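The variance quoted for this data set can be reproduced directly from the sample-variance formula, with its n − 1 denominator. This is a small sketch; the variable names are illustrative.

```python
data = [2, 5.4, -17, 2056, 445, -4.8, 84.3, 92, 27, -10]

n = len(data)
mean = sum(data) / n

# Sample variance: squared deviations from the mean, divided by n - 1.
s2 = sum((x - mean) ** 2 for x in data) / (n - 1)

print(f"mean = {mean:.2f}, sample variance = {s2:.1f}")
```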

SLIDE 10

Indices of Dispersion Variance, Standard Deviation, C.V.

Standard Deviation

◮ Square root of the variance
◮ In same units as units of metric
◮ So easier to compare to metric

SLIDE 11

Indices of Dispersion Variance, Standard Deviation, C.V.

Standard Deviation Example

◮ For sample set we’ve been using, standard deviation is 643
◮ Given mean of 268, standard deviation clearly shows lots of variability from mean

SLIDE 12

Indices of Dispersion Variance, Standard Deviation, C.V.

Coefficient of Variation

◮ Ratio of standard deviation to mean
◮ Normalizes units of these quantities into ratio or percentage
◮ Often abbreviated C.O.V. or C.V.

SLIDE 13

Indices of Dispersion Variance, Standard Deviation, C.V.

Coefficient of Variation Example

◮ For sample set we’ve been using, standard deviation is 643
◮ Mean is 268
◮ So C.O.V. is 643/268 ≈ 2.4
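As a quick check of these figures, the standard deviation and C.O.V. can be computed with Python's standard `statistics` module. This is a sketch under the assumption that the sample (n − 1) form of the standard deviation is intended, which matches the variance formula used earlier.

```python
import statistics

data = [2, 5.4, -17, 2056, 445, -4.8, 84.3, 92, 27, -10]

mean = statistics.fmean(data)
sd = statistics.stdev(data)   # sample standard deviation (n - 1 denominator)
cov = sd / mean               # coefficient of variation, a unitless ratio

print(f"mean = {mean:.0f}, std dev = {sd:.0f}, C.O.V. = {cov:.1f}")
```

Note that the C.O.V. is only meaningful when the mean is not close to zero.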

SLIDE 14

Indices of Dispersion Quantiles

Percentiles

◮ Specification of how observations fall into buckets
  ◮ E.g., 5-percentile is observation that is at the lower 5% of the set
  ◮ While 95-percentile is observation at the 95% boundary
◮ Useful even for unbounded variables

SLIDE 15

Indices of Dispersion Quantiles

Relatives of Percentiles

◮ Quantiles: fraction between 0 and 1
  ◮ Instead of percentage
  ◮ Also called fractiles
◮ Deciles: percentiles at 10% boundaries
  ◮ First is 10-percentile, second is 20-percentile, etc.
◮ Quartiles: divide data set into four parts
  ◮ 25% of sample below first quartile, etc.
  ◮ Second quartile is also median

SLIDE 16

Indices of Dispersion Quantiles

Calculating Quantiles

To estimate α-quantile:

◮ First sort the set
◮ Then take [(n − 1)α + 1]th element
  ◮ 1-indexed
  ◮ Round to nearest integer index
  ◮ Exception: for small sets, may be better to choose “intermediate” value as is done for median
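The procedure above can be sketched as a short function. The function name is illustrative, and rounding exact halves upward is an assumption, since the slide only says to round to the nearest integer index. The "intermediate value" exception for small sets is not implemented here.

```python
def quantile(values, alpha):
    """Estimate the alpha-quantile using the [(n - 1) * alpha + 1]th element rule."""
    if not values:
        raise ValueError("values must not be empty")
    if not 0 <= alpha <= 1:
        raise ValueError("alpha must be between 0 and 1")
    ordered = sorted(values)
    n = len(ordered)
    position = (n - 1) * alpha + 1   # 1-indexed position, possibly fractional
    index = int(position + 0.5)      # nearest integer; halves round up (assumption)
    return ordered[index - 1]        # convert to Python's 0-indexed lists


data = [2, 5.4, -17, 2056, 445, -4.8, 84.3, 92, 27, -10]
q1 = quantile(data, 0.25)
q3 = quantile(data, 0.75)
print(q1, q3)
```

For this ten-element set the positions are 3.25 and 7.75, which round to the 3rd and 8th sorted elements.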

SLIDE 17

Indices of Dispersion Quantiles

Quartile Example

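◮ For data set 2, 5.4, -17, 2056, 445, -4.8, 84.3, 92, 27, -10 (10 observations)
◮ Sort it: -17, -10, -4.8, 2, 5.4, 27, 84.3, 92, 445, 2056
◮ First quartile, Q1, is -4.8 (position 9 × 0.25 + 1 = 3.25, rounded to the 3rd element)
◮ Third quartile, Q3, is 92 (position 9 × 0.75 + 1 = 7.75, rounded to the 8th element)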

SLIDE 18

Indices of Dispersion Quantiles

Interquartile Range

◮ Yet another measure of dispersion
◮ The difference between Q3 and Q1
◮ Semi-interquartile range is half that:

\[ \mathrm{SIQR} = \frac{Q_3 - Q_1}{2} \]

◮ Often interesting measure of what’s going on in middle of range
◮ Basically indicates distance of quartiles from median

SLIDE 19

Indices of Dispersion Quantiles

Semi-Interquartile Range Example

For data set -17, -10, -4.8, 2, 5.4, 27, 84.3, 92, 445, 2056

◮ Q3 is 92
◮ Q1 is -4.8

\[ \mathrm{SIQR} = \frac{Q_3 - Q_1}{2} = \frac{92 - (-4.8)}{2} = 48.4 \approx 48 \]

◮ Compare to standard deviation of 643
  ◮ Suggests that much of variability is caused by outliers
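One way to see the effect of the outlier is to recompute both measures with the largest value removed. The sketch below does that; the `siqr` helper and the choice to drop only 2056 are illustrative assumptions, and quartiles are picked with the same [(n − 1)α + 1] rule, rounding halves up.

```python
import statistics


def siqr(values):
    """Semi-interquartile range using the [(n - 1) * alpha + 1]th element rule."""
    ordered = sorted(values)
    n = len(ordered)

    def pick(alpha):
        # 1-indexed position rounded to the nearest integer (halves round up).
        return ordered[int((n - 1) * alpha + 1 + 0.5) - 1]

    return (pick(0.75) - pick(0.25)) / 2


data = [2, 5.4, -17, 2056, 445, -4.8, 84.3, 92, 27, -10]
trimmed = [x for x in data if x != 2056]   # drop the largest observation

print(f"all data:     SIQR = {siqr(data):.2f}, std dev = {statistics.stdev(data):.1f}")
print(f"without 2056: SIQR = {siqr(trimmed):.2f}, std dev = {statistics.stdev(trimmed):.1f}")
```

The SIQR barely moves when the outlier is removed, while the standard deviation drops sharply.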

SLIDE 20

Indices of Dispersion Miscellaneous Measures

Mean Absolute Deviation

◮ Yet another measure of variability
◮ Mean absolute deviation is

\[ \frac{1}{n} \sum_{i=1}^{n} |x_i - \bar{x}| \]

◮ Good for hand calculation (doesn’t require multiplication or square roots)

SLIDE 21

Indices of Dispersion Miscellaneous Measures

Mean Absolute Deviation Example

For data set -17, -10, -4.8, 2, 5.4, 27, 84.3, 92, 445, 2056

◮ Mean absolute deviation is

\[ \frac{1}{10} \sum_{i=1}^{10} |x_i - 268| = 393 \]
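The same figure can be checked in a few lines. This is a sketch that uses the unrounded mean (267.99) rather than 268, which does not change the result at this precision.

```python
data = [-17, -10, -4.8, 2, 5.4, 27, 84.3, 92, 445, 2056]

n = len(data)
mean = sum(data) / n

# Mean absolute deviation: average distance from the mean, with no squaring.
mad = sum(abs(x - mean) for x in data) / n

print(f"mean absolute deviation = {mad:.0f}")
```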

SLIDE 22

Indices of Dispersion Choosing a Measure

Sensitivity To Outliers

◮ From most to least:
  ◮ Range
  ◮ Variance
  ◮ Mean absolute deviation
  ◮ Semi-interquartile range

SLIDE 23

Indices of Dispersion Choosing a Measure

So, Which Index of Dispersion Should I Use?

Bounded?
  Yes: Range
  No: Unimodal symmetrical?
    Yes: C.O.V.
    No: Percentiles or SIQR

But always remember what you’re looking for
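Read as a decision procedure, the flowchart could be sketched as follows. This assumes the usual reading of the chart (bounded data leads to range; otherwise unimodal symmetrical data leads to C.O.V.; anything else leads to percentiles or SIQR), and the function name is hypothetical.

```python
def choose_dispersion_index(bounded: bool, unimodal_symmetric: bool) -> str:
    """Suggest an index of dispersion from two properties of the data.

    Assumed reading of the flowchart: the 'bounded' test comes first,
    and symmetry is only consulted for unbounded data.
    """
    if bounded:
        return "range"
    if unimodal_symmetric:
        return "C.O.V."
    return "percentiles or SIQR"


print(choose_dispersion_index(bounded=False, unimodal_symmetric=False))
```

The chart is a starting point only; the right index still depends on what question the analysis is meant to answer.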

SLIDE 24

Identifying Distributions

Finding a Distribution for Datasets

◮ If a data set has a common distribution, that’s the best way to summarize it
◮ Saying a data set is uniformly distributed is more informative than just giving mean and standard deviation
◮ So how do you determine if your data set fits a distribution?

SLIDE 25

Identifying Distributions

Methods of Determining a Distribution

◮ Plot a histogram
◮ Kernel density estimation
◮ Quantile-quantile plot
◮ Statistical methods (not covered in this class)

SLIDE 26

Identifying Distributions Histograms

Plotting a Histogram

Suitable if you have relatively large number of data points

Procedure:

  1. Determine range of observations
  2. Divide range into buckets
  3. Count number of observations in each bucket
  4. Divide by total number of observations and plot as column chart
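The four steps can be sketched as a small function that returns the fraction of observations per bucket. Equal-width buckets, the function name, and the tiny sample are assumptions made for illustration; a real data set would be much larger.

```python
def histogram(values, num_buckets):
    """Return (bucket_start, fraction) pairs for equal-width buckets."""
    if num_buckets < 1:
        raise ValueError("need at least one bucket")
    lo, hi = min(values), max(values)        # step 1: range of observations
    if lo == hi:
        raise ValueError("all values are equal; range has zero width")
    width = (hi - lo) / num_buckets          # step 2: divide range into buckets
    counts = [0] * num_buckets
    for v in values:                         # step 3: count per bucket
        # The maximum value goes into the last bucket, not a new one.
        i = min(int((v - lo) / width), num_buckets - 1)
        counts[i] += 1
    total = len(values)                      # step 4: normalize by total count
    return [(lo + i * width, c / total) for i, c in enumerate(counts)]


sample = [1, 2, 2, 3, 3, 3, 4, 4, 5, 9]     # illustrative data only
result = histogram(sample, 4)
for start, fraction in result:
    # Crude text column chart: one '#' per 5% of observations.
    print(f"{start:5.1f} | {'#' * round(fraction * 20)} {fraction:.2f}")
```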

SLIDE 27

Identifying Distributions Histograms

Problems With Histogram Approach

◮ Determining cell size
  ◮ If too small, too few observations per cell
  ◮ If too large, no useful details in plot
◮ If fewer than five observations in a cell, cell size is too small

SLIDE 28

Identifying Distributions Kernel Density Estimation

Kernel Density Estimation

◮ Basic idea: each observation suggests that the probability density is high near that observation
◮ Example:
  ◮ Seeing 7 means pdf is high all around 7
  ◮ Seeing 6.5 also means pdf is high near 7
◮ “Average out” observations to get smooth histogram

SLIDE 29

Identifying Distributions Kernel Density Estimation

KDE Equations

◮ Want to estimate continuous p(x):

\[ \hat{p}(x) = \frac{1}{nh} \sum_{i=1}^{n} K\!\left(\frac{x - x_i}{h}\right) \]

◮ Where K(x) is kernel function
  ◮ Must integrate to unity: \( \int_{-\infty}^{\infty} K(x)\,dx = 1 \)
  ◮ Purpose is to select nearby samples
◮ h is bandwidth parameter
  ◮ Controls how many nearby samples selected
  ◮ Large bandwidth ⇒ more smoothing, less detail
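A direct transcription of the estimator into code might look like the sketch below. The three kernel definitions are common textbook forms chosen here as assumptions (each integrates to 1); the function names and the two-point sample are illustrative.

```python
import math


def rectangular(u):
    """Uniform kernel: constant on [-1, 1]."""
    return 0.5 if abs(u) <= 1 else 0.0


def triangular(u):
    """Triangular kernel: peaks at 0, falls linearly to 0 at +/-1."""
    return max(0.0, 1.0 - abs(u))


def gaussian(u):
    """Standard normal density."""
    return math.exp(-0.5 * u * u) / math.sqrt(2 * math.pi)


def kde(x, samples, h, kernel=gaussian):
    """Estimate p(x) as (1 / (n h)) * sum of K((x - x_i) / h)."""
    if h <= 0:
        raise ValueError("bandwidth h must be positive")
    n = len(samples)
    return sum(kernel((x - xi) / h) for xi in samples) / (n * h)


samples = [6.5, 7.0]
for x in (5.0, 6.5, 6.75, 7.0, 9.0):
    print(x, round(kde(x, samples, h=1.0, kernel=triangular), 3))
```

Increasing `h` spreads each sample's contribution over a wider interval, which is the smoothing effect described above.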

SLIDE 30

Identifying Distributions Kernel Density Estimation

KDE Intuition (Rectangular)

[Figure: two-panel plot illustrating KDE with a rectangular kernel, repeated across overlay steps; only the axis tick labels survived extraction]

SLIDE 51

Identifying Distributions Kernel Density Estimation

KDE Intuition (Triangular)

[Figure: two-panel plot illustrating KDE with a triangular kernel, repeated across overlay steps; only the axis tick labels survived extraction]

SLIDE 72

Identifying Distributions Kernel Density Estimation

KDE Example

◮ Sample data set: -17, -10, -4.8, 2, 5.4, 27, 84.3, 92, 445, 2056
◮ One observation per sample
◮ KDE with Gaussian window (RHS dropped):

[Figure: plot of the estimated p(x), with y-axis ticks from 0.00 to 0.03 and x-axis ticks up to 150]

SLIDE 73

Identifying Distributions Kernel Density Estimation

KDE Example #2

◮ Same data set
◮ Narrower Gaussian window
◮ (Again, RHS dropped):

[Figure: Gaussian-window KDE of the sample with a narrower window; x axis from -50 to 150, y axis p(x) from 0.00 to 0.03]
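The effect of the window width can be sketched in a few lines of Python. This is a minimal illustration, not the code behind the plots: the helper name `gaussian_kde` and both bandwidths (`h=30.0` and `h=5.0`) are assumptions, because the slides do not state the widths they used.

```python
import math

# Sample data set from the slides.
SAMPLE = [-17, -10, -4.8, 2, 5.4, 27, 84.3, 92, 445, 2056]


def gaussian_kde(data, h):
    """Return p(x): the average of Gaussian bumps of width h, one per observation."""
    scale = 1.0 / (len(data) * h * math.sqrt(2.0 * math.pi))

    def p(x):
        return scale * sum(math.exp(-0.5 * ((x - d) / h) ** 2) for d in data)

    return p


# Both bandwidths are hypothetical; the slides do not give the ones they used.
wide = gaussian_kde(SAMPLE, h=30.0)
narrow = gaussian_kde(SAMPLE, h=5.0)

# Tabulate both estimates over the plotted range (-50 to 150).
for x in range(-50, 151, 25):
    print(f"x={x:4d}  wide={wide(x):.4f}  narrow={narrow(x):.4f}")
```

With the narrower window the estimate is taller near the cluster of small observations and drops to almost zero in the gap before the next ones, which is the kind of bumpier curve the second example shows.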


slide-74
SLIDE 74

Identifying Distributions Quantile-Quantile Plots

Quantile-Quantile Plots

◮ More suitable than KDE for small data sets
◮ Basically, guess a distribution
◮ Plot where quantiles of data should fall in that distribution
  ◮ Against where they actually fall
◮ If plot is close to linear, data closely matches guessed distribution


slide-75
SLIDE 75

Identifying Distributions Quantile-Quantile Plots

Obtaining Theoretical Quantiles

◮ Need to determine where quantiles should fall for a particular distribution
  ◮ Requires inverting CDF for that distribution
  ◮ Then determining quantiles for observed points
  ◮ Then plugging quantiles into inverted CDF
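The three steps can be sketched as follows. The function names are hypothetical, and the logistic distribution is used only because its CDF inverts in closed form; it is an illustrative guess, not one the slides make. The quantile rule $q_i = (i - 0.5)/n$ is the one that yields the values 0.05, 0.15, ..., 0.95 in the deck's ten-point example.

```python
import math


def qq_points(observations, inverse_cdf):
    """Pair each theoretical quantile x_i with the i-th smallest observation y_i."""
    ys = sorted(observations)
    n = len(ys)
    points = []
    for i, y in enumerate(ys, start=1):
        q = (i - 0.5) / n  # quantile assigned to the i-th smallest point
        points.append((inverse_cdf(q), y))
    return points


def logistic_inverse_cdf(q, mu=0.0, s=1.0):
    """Inverse of F(x) = 1 / (1 + exp(-(x - mu) / s)); an illustrative guess only."""
    return mu + s * math.log(q / (1.0 - q))


data = [-17, -10, -4.8, 2, 5.4, 27, 84.3, 92, 445, 2056]
points = qq_points(data, logistic_inverse_cdf)
for x, y in points:
    print(f"{x:8.4f}  {y:8.1f}")
```

Plotting the printed pairs, with x on one axis and y on the other, gives the quantile-quantile plot for the guessed distribution.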


slide-76
SLIDE 76

Identifying Distributions Quantile-Quantile Plots

Inverting a Distribution

◮ Many common distributions have already been inverted (how convenient...)
◮ For others that are hard to invert, tables and approximations often available (nearly as convenient)
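As one worked case of a distribution that inverts easily, consider the exponential distribution. The choice of example and the symbol $a$ (the mean) are assumptions made for this sketch, not taken from the slides.

```latex
F(x) = 1 - e^{-x/a} = q
\;\Longrightarrow\;
e^{-x/a} = 1 - q
\;\Longrightarrow\;
x = F^{-1}(q) = -a \ln(1 - q)
```

Plugging a quantile $q$ into the last expression gives the point where that quantile should fall if the data were exponential with mean $a$.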


slide-77
SLIDE 77

Identifying Distributions Quantile-Quantile Plots

Is Our Sample Data Set Normally Distributed?

◮ Our data set was -17, -10, -4.8, 2, 5.4, 27, 84.3, 92, 445, 2056
◮ Does this match normal distribution?
◮ Normal distribution doesn’t invert nicely
  ◮ But there is an approximation:
    $x_i = 4.91\left[q_i^{0.14} - (1 - q_i)^{0.14}\right]$
  ◮ Or invert numerically
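Both options can be compared directly. The sketch below implements the approximation from the slide and, as one possible way to "invert numerically", a bisection search on the standard normal CDF written with `math.erf`. The function names, the search bracket of -10 to 10, and the iteration count are assumptions.

```python
import math


def approx_normal_quantile(q):
    """Approximation from the slide: x = 4.91 * (q**0.14 - (1 - q)**0.14)."""
    return 4.91 * (q ** 0.14 - (1.0 - q) ** 0.14)


def normal_cdf(x):
    """Standard normal CDF, written with the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bisect_normal_quantile(q, lo=-10.0, hi=10.0, iterations=80):
    """Invert the CDF numerically: halve [lo, hi] until it pins down CDF(x) = q."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if normal_cdf(mid) < q:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


for q in (0.05, 0.25, 0.45, 0.75, 0.95):
    a = approx_normal_quantile(q)
    b = bisect_normal_quantile(q)
    print(f"q={q:.2f}  approx={a:+.5f}  bisection={b:+.5f}  diff={a - b:+.5f}")
```

For these quantiles the two methods agree to within about 0.01, which is far finer than anything visible on a quantile-quantile plot.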


slide-78
SLIDE 78

Identifying Distributions Quantile-Quantile Plots

Data For Example Normal Quantile-Quantile Plot

 i    q_i      y_i        x_i
 1   0.05    -17.0   -1.64684
 2   0.15    -10.0   -1.03481
 3   0.25     -4.8   -0.67234
 4   0.35      2.0   -0.38375
 5   0.45      5.4   -0.12510
 6   0.55     27.0    0.12510
 7   0.65     84.3    0.38375
 8   0.75     92.0    0.67234
 9   0.85    445.0    1.03481
10   0.95   2056.0    1.64684


slide-79
SLIDE 79

Identifying Distributions Quantile-Quantile Plots

Example Normal Quantile-Quantile Plot

[Figure: normal quantile-quantile plot of the sample data; axis ticks from -1 to 1 and from -500 to 2500]


slide-80
SLIDE 80

Identifying Distributions Quantile-Quantile Plots

Analysis

◮ Definitely not normal
  ◮ Because it isn’t linear
  ◮ Tail at high end is too long for normal
◮ But perhaps the lower part of graph is normal?
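One rough way to put a number on "isn't linear" is the correlation coefficient of the plotted points. This check is not from the slides, and a high value does not prove normality. The sketch uses the (x, y) pairs from the example table and assumes that "the lower part" means dropping the two largest observations.

```python
import math

# (x_i, y_i) pairs copied from the example table: normal quantiles and observations.
X = [-1.64684, -1.03481, -0.67234, -0.38375, -0.12510,
     0.12510, 0.38375, 0.67234, 1.03481, 1.64684]
Y = [-17.0, -10.0, -4.8, 2.0, 5.4, 27.0, 84.3, 92.0, 445.0, 2056.0]


def correlation(xs, ys):
    """Pearson correlation; 1.0 means the points lie on a rising straight line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


r_full = correlation(X, Y)
# Assumption: the "lower part" is the first eight rows of the table.
r_lower = correlation(X[:8], Y[:8])
print(f"all 10 points:  r = {r_full:.3f}")
print(f"lower 8 points: r = {r_lower:.3f}")
```

The lower eight points give a noticeably higher correlation than the full set, in line with the suspicion that the long right tail is what bends the plot.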


slide-81
SLIDE 81

Identifying Distributions Quantile-Quantile Plots

Quantile-Quantile Plot of Partial Data

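[Figure: normal quantile-quantile plot of the lower part of the sample data; axis ticks at -1 and at 50 and 100]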



slide-84
SLIDE 84

Identifying Distributions Quantile-Quantile Plots

Analysis of Partial Data Plot

◮ Again, at highest points it doesn’t fit normal distribution
◮ But at lower points it fits somewhat well
◮ So, again, this distribution looks like normal with longer tail to right
◮ (Really need more data points)
◮ You can keep this up for a good, long time


slide-85
SLIDE 85

Identifying Distributions Quantile-Quantile Plots

Interpreting Quantile-Quantile Plots

Mnemonic: Q-Q plot shaped like “S” has Short tails; opposite has long ones.


slide-86
SLIDE 86

Statistics of Samples Meaning of a Sample

What is a Sample?

◮ How tall is a human?
  ◮ Could measure every person in the world
  ◮ Or could measure everyone in this room
◮ Population has parameters
  ◮ Real and meaningful
◮ Sample has statistics
  ◮ Drawn from population
  ◮ Inherently erroneous


slide-87
SLIDE 87

Statistics of Samples Meaning of a Sample

Sample Statistics

◮ How tall is a human?
  ◮ People in B126 have a mean height
  ◮ People in Edwards have a different mean
◮ Sample mean is itself a random variable
  ◮ Has own distribution
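A small simulation makes the point concrete. Everything in it is invented for illustration: the population of heights (normal, mean 170 cm, standard deviation 10 cm), the room size of 30, and the seed.

```python
import random
import statistics

rng = random.Random(147)  # fixed seed so the run is repeatable

# Hypothetical population: 10,000 heights in cm, invented for illustration.
population = [rng.gauss(170.0, 10.0) for _ in range(10_000)]

# Two "rooms" of 30 people each, drawn from the same population.
room_a = rng.sample(population, 30)
room_b = rng.sample(population, 30)

mean_a = statistics.mean(room_a)
mean_b = statistics.mean(room_b)
print(f"population mean: {statistics.mean(population):.2f} cm")
print(f"room A mean:     {mean_a:.2f} cm")
print(f"room B mean:     {mean_b:.2f} cm")
```

The two rooms report different means even though both were drawn from one population, so the sample mean behaves like a random variable rather than a fixed number.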


slide-88
SLIDE 88

Statistics of Samples Meaning of a Sample

Estimating Population from Samples

◮ How tall is a human?
  ◮ Measure everybody in this room
  ◮ Calculate sample mean x̄
  ◮ Assume population mean µ equals x̄
◮ What is the error in our estimate?


slide-89
SLIDE 89

Statistics of Samples Meaning of a Sample

Estimating Error

◮ Sample mean is a random variable
  ⇒ Mean has some distribution
  ∴ Multiple sample means have “mean of means”
◮ Knowing distribution of means, we can estimate error
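The distribution of sample means can be observed by simulation. The sketch below assumes a made-up population of heights and compares the spread of many sample means with $\sigma/\sqrt{n}$, the standard error of the mean. That formula is a standard result brought in here as an assumption; the slides have not stated it at this point.

```python
import math
import random
import statistics

rng = random.Random(49)  # fixed seed so the run is repeatable

# Hypothetical population of heights in cm, invented for illustration.
population = [rng.gauss(170.0, 10.0) for _ in range(10_000)]
sigma = statistics.pstdev(population)

n = 25          # observations per sample
trials = 2_000  # number of independent samples
sample_means = [statistics.mean(rng.sample(population, n)) for _ in range(trials)]

mean_of_means = statistics.mean(sample_means)
spread_of_means = statistics.stdev(sample_means)
predicted_spread = sigma / math.sqrt(n)

print(f"population mean:         {statistics.mean(population):.2f}")
print(f"mean of sample means:    {mean_of_means:.2f}")
print(f"std dev of sample means: {spread_of_means:.2f}")
print(f"sigma / sqrt(n):         {predicted_spread:.2f}")
```

The mean of means lands close to the population mean, and the spread of the means is what tells us how far any single sample mean is likely to be from it.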



slide-94
SLIDE 94

Statistics of Samples Guessing the True Value

Estimating the Value of a Random Variable

◮ How tall is Fred?
◮ Suppose average human height is 170 cm
  ∴ Fred is 170 cm tall
  ◮ Yeah, right
◮ Safer to assume a range



slide-98
SLIDE 98

Statistics of Samples Guessing the True Value

Confidence Intervals

◮ How tall is Fred?

◮ Suppose 90% of humans are between 155 and 190 cm

∴ Fred is between 155 and 190 cm

◮ We are 90% confident that Fred is between 155 and 190 cm
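The same reasoning can be sketched with a simulated population. The heights below are invented (normal, mean 170 cm, standard deviation 10 cm), so the resulting range will not match the 155 to 190 cm supposed on the slide; the point is only how a central 90% range is read off the 5% and 95% quantiles.

```python
import random
import statistics

rng = random.Random(2015)  # fixed seed so the run is repeatable

# Hypothetical heights in cm, invented for illustration.
heights = [rng.gauss(170.0, 10.0) for _ in range(5_000)]

# statistics.quantiles with n=20 returns 19 cut points at 5%, 10%, ..., 95%.
cuts = statistics.quantiles(heights, n=20)
low, high = cuts[0], cuts[-1]

inside = sum(low <= h <= high for h in heights) / len(heights)
print(f"central 90% range: {low:.1f} to {high:.1f} cm")
print(f"fraction of people inside the range: {inside:.3f}")
```

About nine people in ten fall inside the range, so a guess that a randomly chosen person such as Fred lies within it is right about 90% of the time.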
