CSE 312 Foundations of Computing II
Lecture 24: Biased Estimation

Stefano Tessaro
tessaro@cs.washington.edu
Parameter Estimation – Workflow

Distribution $\mathbb{P}(x \mid \theta)$ + independent samples $x_1, \dots, x_n$ from $\mathbb{P}(x \mid \theta)$ → Algorithm → parameter estimate $\hat\theta$
($\theta$ = unknown parameter)

Maximum Likelihood Estimation (MLE). Given data $x_1, \dots, x_n$, find $\hat\theta = \hat\theta(x_1, \dots, x_n)$ (“the MLE”) such that $\mathcal{L}(x_1, \dots, x_n \mid \hat\theta)$ is maximized!
Likelihood – Continuous Case

Definition. The likelihood of independent observations $x_1, \dots, x_n$ is

$$\mathcal{L}(x_1, \dots, x_n \mid \theta) = \prod_{i=1}^{n} f(x_i \mid \theta)$$
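In practice the likelihood is evaluated on a log scale, since a product of many densities underflows floating point. Below is a minimal sketch (not from the lecture) evaluating the log-likelihood under an assumed $\mathcal{N}(\theta, 1)$ model; the function name, data, and use of scipy are illustrative assumptions.

```python
# Illustrative sketch (not from the lecture): log-likelihood of i.i.d.
# observations under a hypothetical N(theta, 1) model. We sum log-densities
# rather than multiplying densities, to avoid floating-point underflow.
import numpy as np
from scipy.stats import norm

def log_likelihood(xs, theta):
    # ln L(x_1, ..., x_n | theta) = sum_i ln f(x_i | theta)
    return np.sum(norm.logpdf(xs, loc=theta, scale=1.0))

xs = np.array([0.8, 1.3, 0.2, 1.1])
print(log_likelihood(xs, theta=1.0))  # larger (less negative) than below
print(log_likelihood(xs, theta=5.0))  # a poor parameter guess
```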
Example – Gaussian Parameters

Normal outcomes $x_1, \dots, x_n$, known variance $\sigma^2 = 1$
Goal: MLE for $\mu$ = expectation

$$\mathcal{L}(x_1, \dots, x_n \mid \mu) = \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi}} e^{-\frac{(x_i - \mu)^2}{2}}$$

$$\ln \mathcal{L}(x_1, \dots, x_n \mid \mu) = -\frac{n}{2} \ln 2\pi - \sum_{i=1}^{n} \frac{(x_i - \mu)^2}{2}$$
Example – Gaussian Parameters

Goal: estimate $\mu$ = expectation

$$\ln \mathcal{L}(x_1, \dots, x_n \mid \mu) = -\frac{n}{2} \ln 2\pi - \sum_{i=1}^{n} \frac{(x_i - \mu)^2}{2}$$

Note: $\frac{d}{d\mu} \frac{(x_i - \mu)^2}{2} = \frac{1}{2} \cdot 2 \cdot (x_i - \mu) \cdot (-1) = \mu - x_i$

$$\frac{\partial}{\partial \mu} \ln \mathcal{L}(x_1, \dots, x_n \mid \mu) = \sum_{i=1}^{n} (x_i - \mu) = \sum_{i=1}^{n} x_i - n\mu = 0$$

In other words, the MLE $\hat\mu = \frac{\sum_{i=1}^{n} x_i}{n}$ is the population mean of the data.
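As a numerical sanity check (an illustration, not part of the slides): maximizing the log-likelihood over $\mu$ with a generic optimizer recovers the sample mean, matching the closed form above. The seed, data, and scipy optimizer are assumptions of this sketch.

```python
# Sketch: numerically maximize the N(mu, 1) log-likelihood over mu and
# compare against the closed-form MLE (the sample mean).
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.stats import norm

rng = np.random.default_rng(0)
xs = rng.normal(loc=2.5, scale=1.0, size=1000)  # true mu = 2.5

neg_log_lik = lambda mu: -np.sum(norm.logpdf(xs, loc=mu, scale=1.0))
mu_hat = minimize_scalar(neg_log_lik, bounds=(-10, 10), method="bounded").x

print(mu_hat, xs.mean())  # agree up to optimizer tolerance
```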
$n$ samples $x_1, \dots, x_n \in \mathbb{R}$ from Gaussian $\mathcal{N}(\mu, \sigma^2)$. Most likely $\mu$ and $\sigma^2$?

[Figure: Gaussian density curve fit to the samples; vertical axis 0 to 0.5, horizontal axis −4 to 6.]
Two-parameter optimization

Normal outcomes $x_1, \dots, x_n$
Goal: estimate $\theta_1 = \mu$ = expectation and $\theta_2 = \sigma^2$ = variance

$$\mathcal{L}(x_1, \dots, x_n \mid \theta_1, \theta_2) = \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi\theta_2}} e^{-\frac{(x_i - \theta_1)^2}{2\theta_2}}$$

$$\ln \mathcal{L}(x_1, \dots, x_n \mid \theta_1, \theta_2) = -\frac{n}{2} \ln(2\pi\theta_2) - \sum_{i=1}^{n} \frac{(x_i - \theta_1)^2}{2\theta_2}$$
Two-parameter estimation

$$\ln \mathcal{L}(x_1, \dots, x_n \mid \theta_1, \theta_2) = -\frac{n}{2} \ln(2\pi\theta_2) - \sum_{i=1}^{n} \frac{(x_i - \theta_1)^2}{2\theta_2}$$

We need to find a solution $(\hat\theta_1, \hat\theta_2)$ to

$$\frac{\partial}{\partial \theta_1} \ln \mathcal{L}(x_1, \dots, x_n \mid \theta_1, \theta_2) = 0$$

$$\frac{\partial}{\partial \theta_2} \ln \mathcal{L}(x_1, \dots, x_n \mid \theta_1, \theta_2) = 0$$
MLE for Expectation

$$\ln \mathcal{L}(x_1, \dots, x_n \mid \theta_1, \theta_2) = -\frac{n}{2} \ln(2\pi\theta_2) - \sum_{i=1}^{n} \frac{(x_i - \theta_1)^2}{2\theta_2}$$

$$\frac{\partial}{\partial \theta_1} \ln \mathcal{L}(x_1, \dots, x_n \mid \theta_1, \theta_2) = \frac{1}{\theta_2} \sum_{i=1}^{n} (x_i - \theta_1) = 0$$

In other words, the MLE of the expectation is $\hat\theta_1 = \frac{\sum_{i=1}^{n} x_i}{n}$, (again) the population mean of the data, regardless of $\theta_2$.

What about the variance?
MLE for Variance

$$\ln \mathcal{L}(x_1, \dots, x_n \mid \hat\theta_1, \theta_2) = -\frac{n}{2} \ln(2\pi\theta_2) - \sum_{i=1}^{n} \frac{(x_i - \hat\theta_1)^2}{2\theta_2}$$

$$= -\frac{n}{2} \ln 2\pi - \frac{n}{2} \ln \theta_2 - \frac{1}{2\theta_2} \sum_{i=1}^{n} (x_i - \hat\theta_1)^2$$

$$\frac{\partial}{\partial \theta_2} \ln \mathcal{L}(x_1, \dots, x_n \mid \hat\theta_1, \theta_2) = -\frac{n}{2\theta_2} + \frac{1}{2\theta_2^2} \sum_{i=1}^{n} (x_i - \hat\theta_1)^2 = 0$$

In other words, the MLE of the variance,
$$\hat\theta_2 = \frac{1}{n} \sum_{i=1}^{n} (x_i - \hat\theta_1)^2,$$
is the population variance of the data.
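The two closed forms are easy to check numerically. A sketch (assuming numpy; the data and seed are illustrative): note that numpy's np.var defaults to ddof=0, i.e., exactly the $1/n$ population variance derived here.

```python
# Sketch: the closed-form Gaussian MLEs derived above, checked against numpy.
import numpy as np

rng = np.random.default_rng(1)
xs = rng.normal(loc=2.0, scale=3.0, size=10_000)  # true mu = 2, sigma^2 = 9

theta1_hat = xs.sum() / len(xs)                        # MLE of the mean
theta2_hat = np.sum((xs - theta1_hat) ** 2) / len(xs)  # MLE of the variance

print(theta1_hat, np.mean(xs))  # identical: MLE of mu is the sample mean
print(theta2_hat, np.var(xs))   # identical: np.var uses ddof=0 (the 1/n form)
```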
So far

• We have implicitly assumed that MLE estimators are always good.
• But why is that really the case?
  – Next: a natural property that is not always satisfied by the MLE
  – And why the MLE is nonetheless “good”
When is an estimator good?

Distribution $\mathbb{P}(x \mid \theta)$ + samples $X_1, \dots, X_n$ from $\mathbb{P}(x \mid \theta)$ → Algorithm → parameter estimate $\hat\Theta_n$
($\theta$ = unknown parameter)

Definition. An estimator is unbiased if for all $n \ge 1$, $\mathbb{E}(\hat\Theta_n) = \theta$.
Example – Coin Flips

Coin-flip outcomes $x_1, \dots, x_n$, with $n_H$ heads and $n_T$ tails. Recall: $\hat\theta = \frac{n_H}{n}$

Fact. $\hat\theta$ is unbiased.

Let $Y_1, \dots, Y_n$ be s.t. $Y_i = 1$ iff $x_i = H$ (and 0 otherwise). In particular, $\mathbb{P}(Y_i = 1) = \theta$.

$$\hat\Theta_n = \frac{1}{n} \sum_{i=1}^{n} Y_i \qquad\qquad \mathbb{E}(\hat\Theta_n) = \frac{1}{n} \sum_{i=1}^{n} \mathbb{E}(Y_i) = \frac{1}{n} \cdot n \cdot \theta = \theta$$
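Unbiasedness can also be seen empirically. A small simulation sketch (numpy; the values of $\theta$, $n$, and the trial count are illustrative assumptions): averaging $n_H/n$ over many independent repetitions lands on $\theta$.

```python
# Sketch: empirical check that the coin-flip MLE n_H / n is unbiased.
import numpy as np

rng = np.random.default_rng(2)
theta, n, trials = 0.3, 20, 100_000

flips = rng.random((trials, n)) < theta  # each row is one run of n flips
estimates = flips.mean(axis=1)           # n_H / n for each run

print(estimates.mean())  # ~0.3 = theta, as unbiasedness predicts
```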
Notes

• Unbiasedness is not the ultimate goal either
  – Consider the estimator which sets $\hat\Theta_n = 1$ if the first coin toss is heads, and $\hat\Theta_n = 0$ otherwise – regardless of the number of samples.
  – $\mathbb{P}(\hat\Theta_n = 1) = \theta$, hence $\mathbb{E}(\hat\Theta_n) = \theta$: unbiased, yet it never gets any closer to $\theta$ (see the sketch below).
• Generally, we would like instead $\hat\Theta_n \approx \theta$ with high probability as $n \to \infty$.
  – Will discuss this on Monday.
  – Unbiasedness is a step towards this.
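A simulation sketch of this point (parameters are illustrative assumptions): the “first toss” estimator and the sample mean are both unbiased, but only the sample mean concentrates around $\theta$ as $n$ grows.

```python
# Sketch: two unbiased estimators with very different spread.
import numpy as np

rng = np.random.default_rng(3)
theta, n, trials = 0.3, 1000, 50_000

flips = rng.random((trials, n)) < theta
first_flip = flips[:, 0].astype(float)  # 1 iff the first toss is heads
sample_mean = flips.mean(axis=1)        # the MLE n_H / n

print(first_flip.mean(), sample_mean.mean())  # both ~0.3: unbiased
print(first_flip.std(), sample_mean.std())    # ~0.46 vs ~0.014: only one concentrates
```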
Example – Gaussian

Normal outcomes $X_1, \dots, X_n$ iid according to $\mathcal{N}(\mu, \sigma^2)$

$$\hat\Theta_1 = \frac{\sum_{i=1}^{n} X_i}{n} \qquad\qquad \hat\Theta_2 = \frac{1}{n} \sum_{i=1}^{n} (X_i - \hat\Theta_1)^2$$
Example – Gaussian

Normal outcomes $X_1, \dots, X_n$ iid according to $\mathcal{N}(\mu, \sigma^2)$

$$\hat\Theta_1 = \frac{\sum_{i=1}^{n} X_i}{n} \qquad\qquad \mathbb{E}(\hat\Theta_1) = \frac{\sum_{i=1}^{n} \mathbb{E}(X_i)}{n} = \frac{n \cdot \mu}{n} = \mu$$

Therefore: Unbiased!
Example – Gaussian

Assume: $\sigma^2 > 0$
Normal outcomes $X_1, \dots, X_n$ iid according to $\mathcal{N}(\mu, \sigma^2)$

$$\hat\Theta_2 = \frac{1}{n} \sum_{i=1}^{n} (X_i - \hat\Theta_1)^2 \qquad\qquad \hat\Theta_2 = \frac{1}{n-1} \sum_{i=1}^{n} (X_i - \hat\Theta_1)^2 \;\;\text{(Unbiased!)}$$

Example: $n = 1$. Then $\hat\Theta_1 = X_1$ and $\hat\Theta_2 = \frac{1}{1}(X_1 - X_1)^2 = 0$, so $\mathbb{E}(\hat\Theta_2) = 0 \ne \sigma^2$.

Therefore: Biased!

Next time: unbiased estimator proof + more intuition + confidence intervals
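The bias is visible numerically as well. A sketch (numpy; the specific $\sigma^2$, $n$, and trial count are assumptions): averaged over many repetitions, the $1/n$ estimator comes out near $\frac{n-1}{n}\sigma^2$, while the $1/(n-1)$ version comes out near $\sigma^2$.

```python
# Sketch: the 1/n variance estimator is biased low; the 1/(n-1) one is not.
import numpy as np

rng = np.random.default_rng(4)
mu, sigma2, n, trials = 0.0, 4.0, 5, 200_000

xs = rng.normal(mu, np.sqrt(sigma2), size=(trials, n))
pop_var = xs.var(axis=1, ddof=0)   # 1/n: population variance (the MLE)
samp_var = xs.var(axis=1, ddof=1)  # 1/(n-1): sample variance

print(pop_var.mean())   # ~(n-1)/n * sigma^2 = 3.2  -> biased
print(samp_var.mean())  # ~sigma^2 = 4.0            -> unbiased
```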
Example – Consistency

Assume: $\sigma^2 > 0$
Normal outcomes $X_1, \dots, X_n$ iid according to $\mathcal{N}(\mu, \sigma^2)$

$$\hat\Theta_2 = \frac{1}{n} \sum_{i=1}^{n} (X_i - \hat\Theta_1)^2 \qquad\qquad \hat\Theta_2 = \frac{1}{n-1} \sum_{i=1}^{n} (X_i - \hat\Theta_1)^2$$
Population variance – Biased!                      Sample variance – Unbiased!

The left $\hat\Theta_2$ converges to the same value as the right $\hat\Theta_2$, i.e., $\sigma^2$, as $n \to \infty$. The left $\hat\Theta_2$ is “consistent”.
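Consistency can likewise be seen numerically: in the sketch below (numpy; parameter values are illustrative), both estimators approach $\sigma^2$ as $n$ grows, even though the $1/n$ one is biased at every fixed $n$.

```python
# Sketch: both variance estimators converge to sigma^2 as n grows.
import numpy as np

rng = np.random.default_rng(5)
sigma2 = 4.0

for n in (10, 100, 10_000, 1_000_000):
    xs = rng.normal(0.0, np.sqrt(sigma2), size=n)
    print(n, xs.var(ddof=0), xs.var(ddof=1))  # both -> 4.0
```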
Consistent Estimators & MLE

Distribution $\mathbb{P}(x \mid \theta)$ + samples $X_1, \dots, X_n$ from $\mathbb{P}(x \mid \theta)$ → Algorithm → parameter estimate $\hat\Theta_n$
($\theta$ = unknown parameter)

Definition. An estimator is unbiased if $\mathbb{E}(\hat\Theta_n) = \theta$ for all $n \ge 1$.

Definition. An estimator is consistent if $\lim_{n \to \infty} \mathbb{E}(\hat\Theta_n) = \theta$.

Theorem. MLE estimators are consistent. (But not necessarily unbiased)