Statistical Geometry Processing, Winter Semester 2011/2012

Least-Squares Fitting
Approximation

Common situation:
• We have many data points, and they might be noisy.
• Example: scanned data.
• We want to approximate the data with a smooth curve / surface.

What we need:
• A criterion: what is a good approximation?
• Methods to compute this approximation.
Approximation Techniques

Agenda:
• Least-squares approximation (and why/when this makes sense)
• Total least-squares linear approximation (get rid of the coordinate system)
• Iteratively reweighted least-squares (for nasty noise distributions)
Least-Squares

We assume the following scenario:
• Given: function values $y_i$ at positions $x_i$ ($1\text{D} \to 1\text{D}$ for now).
• The independent variables $x_i$ are known exactly.
• The dependent variables $y_i$ carry some error.
• The error is Gaussian and i.i.d.: independent, with the same normal distribution at every point.
• We know the class of functions.
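The assumptions above can be sketched as a synthetic data generator. The ground-truth function, sample range, noise level, and all names here are illustrative assumptions, not part of the slides:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical ground-truth function f (illustrative assumption).
def f_true(x):
    return np.sin(2.0 * x) + 0.5 * x

n, sigma = 50, 0.1                  # sample count and noise std (assumed)
x = np.linspace(0.0, 3.0, n)        # independent variables x_i, known exactly
noise = rng.normal(0.0, sigma, n)   # Gaussian i.i.d. error, same sigma at every point
y = f_true(x) + noise               # dependent variables y_i with error
```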
Situation

[Figure: noisy samples $y_1, \dots, y_n$ taken at positions $x_1, \dots, x_n$]

• Original sample points taken at the $x_i$ from the original $f$.
• Unknown Gaussian i.i.d. noise added to each $y_i$.
• We want to estimate the reconstruction $\tilde f$.
Summary

The statistical model yields the least-squares criterion:
$$\operatorname*{arg\,min}_{\tilde f} \sum_{i=1}^{n} \big(\tilde f(x_i) - y_i\big)^2$$

A linear function space leads to a quadratic objective:
$$\tilde f(x) := \sum_{j=1}^{k} \lambda_j b_j(x), \qquad \operatorname*{arg\,min}_{\boldsymbol\lambda} \sum_{i=1}^{n} \Big(\sum_{j=1}^{k} \lambda_j b_j(x_i) - y_i\Big)^2$$

The critical point gives a linear system:
$$\begin{pmatrix} \langle b_1, b_1\rangle & \cdots & \langle b_1, b_k\rangle \\ \vdots & \ddots & \vdots \\ \langle b_k, b_1\rangle & \cdots & \langle b_k, b_k\rangle \end{pmatrix} \begin{pmatrix}\lambda_1\\ \vdots \\ \lambda_k\end{pmatrix} = \begin{pmatrix}\langle y, b_1\rangle \\ \vdots \\ \langle y, b_k\rangle\end{pmatrix}$$
with
$$\langle b_i, b_j\rangle := \sum_{t=1}^{n} b_i(x_t)\, b_j(x_t), \qquad \langle y, b_i\rangle := \sum_{t=1}^{n} b_i(x_t)\, y_t$$
Maximum Likelihood Estimation

Goal:
• Maximize the probability that the data originated from the reconstructed curve $\tilde f$.
• This is "maximum likelihood estimation".

Gaussian normal distribution:
$$p(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\exp\!\left(-\frac{x^2}{2\sigma^2}\right)$$
Maximum Likelihood Estimation

$$\begin{aligned}
\operatorname*{arg\,max}_{\tilde f} \prod_{i=1}^{n} N_{0,\sigma}\big(\tilde f(x_i) - y_i\big)
&= \operatorname*{arg\,max}_{\tilde f} \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi}\,\sigma}\exp\!\left(-\frac{(\tilde f(x_i)-y_i)^2}{2\sigma^2}\right) \\
&= \operatorname*{arg\,max}_{\tilde f} \sum_{i=1}^{n} \ln\!\left[\frac{1}{\sqrt{2\pi}\,\sigma}\exp\!\left(-\frac{(\tilde f(x_i)-y_i)^2}{2\sigma^2}\right)\right] \\
&= \operatorname*{arg\,max}_{\tilde f} \sum_{i=1}^{n} \left[\ln\frac{1}{\sqrt{2\pi}\,\sigma} - \frac{(\tilde f(x_i)-y_i)^2}{2\sigma^2}\right] \\
&= \operatorname*{arg\,min}_{\tilde f} \sum_{i=1}^{n} \frac{(\tilde f(x_i)-y_i)^2}{2\sigma^2} \\
&= \operatorname*{arg\,min}_{\tilde f} \sum_{i=1}^{n} \big(\tilde f(x_i)-y_i\big)^2
\end{aligned}$$
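The key step of the derivation can be checked numerically: the Gaussian log-likelihood equals the negative scaled sum of squared residuals plus a constant that does not depend on $\tilde f$, so both have the same optimizer. The residual values below are made up for illustration:

```python
import numpy as np

sigma = 0.3
r = np.array([0.1, -0.4, 0.25, 0.0, -0.15])   # toy residuals f~(x_i) - y_i (assumed)

# Log-likelihood of the residuals under N(0, sigma).
log_lik = np.sum(np.log(1.0 / (np.sqrt(2 * np.pi) * sigma))
                 - r**2 / (2 * sigma**2))

# Negative scaled sum of squares, plus the data-independent constant.
sse_term = -np.sum(r**2) / (2 * sigma**2)
const = len(r) * np.log(1.0 / (np.sqrt(2 * np.pi) * sigma))
```

Since `log_lik == sse_term + const` and `const` is fixed, maximizing the likelihood is the same as minimizing the sum of squared residuals.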
Least-Squares Approximation

This shows:
• The maximum likelihood estimate minimizes the sum of squared errors.

Next: compute the optimal coefficients.
• Linear ansatz: $\tilde f(x) := \sum_{j=1}^{k} \lambda_j b_j(x)$
• Determine the optimal $\lambda_j$.
Maximum Likelihood Estimation

Notation:
$$\boldsymbol\lambda := \begin{pmatrix}\lambda_1\\ \vdots \\ \lambda_k\end{pmatrix}\!,\quad \mathbf b(x_i) := \begin{pmatrix}b_1(x_i)\\ \vdots \\ b_k(x_i)\end{pmatrix} \text{ ($k$ entries)},\quad \mathbf y := \begin{pmatrix}y_1\\ \vdots \\ y_n\end{pmatrix} \text{ ($n$ entries)}$$

$$\begin{aligned}
\operatorname*{arg\,min}_{\boldsymbol\lambda} \sum_{i=1}^{n} \Big(\sum_{j=1}^{k}\lambda_j b_j(x_i) - y_i\Big)^2
&= \operatorname*{arg\,min}_{\boldsymbol\lambda} \sum_{i=1}^{n} \big(\boldsymbol\lambda^{\mathsf T}\mathbf b(x_i) - y_i\big)^2 \\
&= \operatorname*{arg\,min}_{\boldsymbol\lambda} \left[\boldsymbol\lambda^{\mathsf T}\Big(\sum_{i=1}^{n}\mathbf b(x_i)\,\mathbf b(x_i)^{\mathsf T}\Big)\boldsymbol\lambda - 2\,\boldsymbol\lambda^{\mathsf T}\sum_{i=1}^{n} y_i\,\mathbf b(x_i) + \sum_{i=1}^{n} y_i^2\right]
\end{aligned}$$

This has the form $\mathbf x^{\mathsf T}\mathbf A\mathbf x + \mathbf b^{\mathsf T}\mathbf x + c$: a quadratic optimization problem.
Critical Point

Setting the gradient with respect to $\boldsymbol\lambda$ to zero:
$$\nabla_{\boldsymbol\lambda}\left[\boldsymbol\lambda^{\mathsf T}\Big(\sum_{i=1}^{n}\mathbf b(x_i)\,\mathbf b(x_i)^{\mathsf T}\Big)\boldsymbol\lambda - 2\,\boldsymbol\lambda^{\mathsf T}\sum_{i=1}^{n} y_i\,\mathbf b(x_i) + \sum_{i=1}^{n} y_i^2\right] = 2\Big(\sum_{i=1}^{n}\mathbf b(x_i)\,\mathbf b(x_i)^{\mathsf T}\Big)\boldsymbol\lambda - 2\begin{pmatrix}\mathbf y^{\mathsf T}\mathbf b_1\\ \vdots \\ \mathbf y^{\mathsf T}\mathbf b_k\end{pmatrix}$$

We obtain a linear system of equations:
$$\Big(\sum_{i=1}^{n}\mathbf b(x_i)\,\mathbf b(x_i)^{\mathsf T}\Big)\boldsymbol\lambda = \begin{pmatrix}\mathbf y^{\mathsf T}\mathbf b_1\\ \vdots \\ \mathbf y^{\mathsf T}\mathbf b_k\end{pmatrix}$$

Here $\mathbf b_j := \big(b_j(x_1), \dots, b_j(x_n)\big)^{\mathsf T}$.
Critical Point

This can also be written as:
$$\begin{pmatrix}\langle b_1,b_1\rangle & \cdots & \langle b_1,b_k\rangle \\ \vdots & \ddots & \vdots \\ \langle b_k,b_1\rangle & \cdots & \langle b_k,b_k\rangle\end{pmatrix}\begin{pmatrix}\lambda_1\\ \vdots\\ \lambda_k\end{pmatrix} = \begin{pmatrix}\langle y,b_1\rangle\\ \vdots\\ \langle y,b_k\rangle\end{pmatrix}$$
with
$$\langle b_i,b_j\rangle := \sum_{t=1}^{n} b_i(x_t)\,b_j(x_t), \qquad \langle y,b_i\rangle := \sum_{t=1}^{n} b_i(x_t)\,y_t$$
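As a sketch, this linear system can be assembled and solved directly with NumPy for the monomial basis $b_j(x) = x^j$; the data points below are illustrative only:

```python
import numpy as np

# Illustrative data, roughly y = x^2 with small noise.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([0.05, 1.02, 3.98, 9.03, 15.97])

k = 3                                   # basis: b_0 = 1, b_1 = x, b_2 = x^2
B = np.vander(x, k, increasing=True)    # B[i, j] = b_j(x_i)
G = B.T @ B                             # Gram matrix of the <b_i, b_j>
rhs = B.T @ y                           # right-hand side <y, b_i>
lam = np.linalg.solve(G, rhs)           # coefficients lambda_1 .. lambda_k
```

In practice `np.linalg.lstsq(B, y)` is preferred numerically, since forming $B^{\mathsf T}B$ explicitly squares the condition number of the problem.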
Summary (again)

The statistical model yields the least-squares criterion:
$$\operatorname*{arg\,min}_{\tilde f} \sum_{i=1}^{n} \big(\tilde f(x_i) - y_i\big)^2$$

A linear function space leads to a quadratic objective:
$$\tilde f(x) := \sum_{j=1}^{k} \lambda_j b_j(x), \qquad \operatorname*{arg\,min}_{\boldsymbol\lambda} \sum_{i=1}^{n} \Big(\sum_{j=1}^{k} \lambda_j b_j(x_i) - y_i\Big)^2$$

The critical point gives a linear system:
$$\begin{pmatrix} \langle b_1, b_1\rangle & \cdots & \langle b_1, b_k\rangle \\ \vdots & \ddots & \vdots \\ \langle b_k, b_1\rangle & \cdots & \langle b_k, b_k\rangle \end{pmatrix} \begin{pmatrix}\lambda_1\\ \vdots \\ \lambda_k\end{pmatrix} = \begin{pmatrix}\langle y, b_1\rangle \\ \vdots \\ \langle y, b_k\rangle\end{pmatrix}$$
with
$$\langle b_i, b_j\rangle := \sum_{t=1}^{n} b_i(x_t)\, b_j(x_t), \qquad \langle y, b_i\rangle := \sum_{t=1}^{n} b_i(x_t)\, y_t$$
Variants

Weighted least squares:
• Used when the noise has a different standard deviation $\sigma_i$ at each data point.
• This gives a weighted least-squares problem.
• Noisier points have smaller influence.
Same procedure as on the previous slides...

$$\begin{aligned}
\operatorname*{arg\,max}_{\tilde f}\prod_{i=1}^{n} N_{0,\sigma_i}\big(\tilde f(x_i)-y_i\big)
&= \operatorname*{arg\,max}_{\tilde f}\prod_{i=1}^{n}\frac{1}{\sqrt{2\pi}\,\sigma_i}\exp\!\left(-\frac{(\tilde f(x_i)-y_i)^2}{2\sigma_i^2}\right) \\
&= \operatorname*{arg\,max}_{\tilde f}\sum_{i=1}^{n}\ln\!\left[\frac{1}{\sqrt{2\pi}\,\sigma_i}\exp\!\left(-\frac{(\tilde f(x_i)-y_i)^2}{2\sigma_i^2}\right)\right] \\
&= \operatorname*{arg\,max}_{\tilde f}\sum_{i=1}^{n}\left[\ln\frac{1}{\sqrt{2\pi}\,\sigma_i} - \frac{(\tilde f(x_i)-y_i)^2}{2\sigma_i^2}\right] \\
&= \operatorname*{arg\,min}_{\tilde f}\sum_{i=1}^{n}\frac{(\tilde f(x_i)-y_i)^2}{2\sigma_i^2} \\
&= \operatorname*{arg\,min}_{\tilde f}\sum_{i=1}^{n}\underbrace{\frac{1}{\sigma_i^2}}_{\text{weights}}\big(\tilde f(x_i)-y_i\big)^2
\end{aligned}$$
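The weighted criterion can be solved just like the unweighted one, with a diagonal weight matrix $W = \operatorname{diag}(1/\sigma_i^2)$ inserted into the normal equations. The data points and per-point noise levels below are illustrative assumptions:

```python
import numpy as np

x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([0.1, 1.1, 1.9, 3.2])          # roughly y = x (made-up data)
sigma_i = np.array([0.1, 0.1, 0.5, 0.5])    # per-point noise levels (assumed)

B = np.vander(x, 2, increasing=True)        # linear model: b_0 = 1, b_1 = x
W = np.diag(1.0 / sigma_i**2)               # noisier points get smaller weight

# Weighted normal equations: (B^T W B) lambda = B^T W y
lam = np.linalg.solve(B.T @ W @ B, B.T @ W @ y)
```

Setting all $\sigma_i$ equal recovers the ordinary least-squares solution, since a constant factor in $W$ cancels on both sides.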