Estimating Mean and Variance under Interval Uncertainty: Dynamic Case - PowerPoint PPT Presentation


  1. Estimating Mean and Variance under Interval Uncertainty: Dynamic Case
     Rafik Aliev (1) and Vladik Kreinovich (2)
     (1) Dept. of Computer Aided Control Systems, Azerbaijan State Oil Academy, Azadlig Ave. 20, AZ1010 Baki, Azerbaijan, raliev@asoa.edu.az
     (2) Department of Computer Science, University of Texas at El Paso, 500 W. University, El Paso, TX 79968, USA, vladik@utep.edu

  2. 1. Statistical Analysis in Gaussian Case: Reminder
     • Standard methods for estimating the mean $E$ and the variance $V$ assume a normal distribution:
       $$\rho_N(x) = \frac{1}{\sqrt{2\pi \cdot V}} \cdot \exp\left(-\frac{(x - E)^2}{2V}\right).$$
     • Normal distributions are ubiquitous, due to the Central Limit Theorem: a sum of many small factors is distributed approximately as $\rho_N(x)$.
     • It is usually assumed that different sample values are independent, so
       $$L = \prod_{i=1}^{n} \rho_N(x_i) = \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi \cdot V}} \cdot \exp\left(-\frac{(x_i - E)^2}{2V}\right).$$
     • It is reasonable to select the Maximum Likelihood (most probable) values $E$ and $V$ for which $L \to \max$; then:
       $$E = \frac{1}{n} \cdot \sum_{i=1}^{n} x_i; \qquad V = \frac{1}{n} \cdot \sum_{i=1}^{n} (x_i - E)^2.$$
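
The two maximum-likelihood formulas on this slide can be illustrated with a minimal Python sketch. The function name `ml_estimates` and the sample values are illustrative assumptions, not part of the presentation.

```python
def ml_estimates(xs):
    """Return (E, V): the sample mean and the 1/n (maximum-likelihood) variance."""
    n = len(xs)
    E = sum(xs) / n                          # E = (1/n) * sum of x_i
    V = sum((x - E) ** 2 for x in xs) / n    # V = (1/n) * sum of (x_i - E)^2
    return E, V


# Hypothetical sample values, chosen only for illustration.
E, V = ml_estimates([1.0, 2.0, 3.0, 4.0])
```

Note that this is the $1/n$ estimate from the slide, not the unbiased $1/(n-1)$ variant.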

  3. 2. Statistical Analysis: General Case
     • Often, distributions are non-Gaussian; the Gaussian-generated estimates are used in the general case as well:
       $$E = \frac{1}{n} \cdot \sum_{i=1}^{n} x_i; \qquad V = \frac{1}{n} \cdot \sum_{i=1}^{n} (x_i - E)^2.$$
     • Justification: the mean $E[x]$ is the limit of the expression $\frac{1}{n} \cdot \sum_{i=1}^{n} x_i$ when $n \to \infty$.
     • So, for large $n$, this expression is a good approximation for $E[x]$; the larger $n$, the better the approximation.
     • Similarly, the Gaussian expression for $V$ tends to the actual variance $V[x]$.
     • Caution: for non-Gaussian distributions, the above estimates are not necessarily optimal.

  4. 3. Need to Take Interval Uncertainty into Account
     • In practice, the values $x_i$ come from measurements, and measurements are never 100% accurate: $\widetilde{x}_i \neq x_i$.
     • Sometimes, we know the probabilities of different values of the measurement errors $\Delta x_i \stackrel{\text{def}}{=} \widetilde{x}_i - x_i$.
     • However, in many cases, we only know the upper bound $\Delta_i$ on the measurement error: $|\Delta x_i| \le \Delta_i$.
     • In this case, we know that $x_i \in \mathbf{x}_i \stackrel{\text{def}}{=} [\widetilde{x}_i - \Delta_i, \widetilde{x}_i + \Delta_i]$.
     • Different values $x_i$ from these intervals lead, in general, to different estimates $E(x_1, \ldots, x_n)$ and $V(x_1, \ldots, x_n)$.
     • It is therefore desirable to find the ranges
       $$\mathbf{E} = [\underline{E}, \overline{E}] = \{E(x_1, \ldots, x_n) \mid x_1 \in \mathbf{x}_1, \ldots, x_n \in \mathbf{x}_n\}$$
       and
       $$\mathbf{V} = [\underline{V}, \overline{V}] = \{V(x_1, \ldots, x_n) \mid x_1 \in \mathbf{x}_1, \ldots, x_n \in \mathbf{x}_n\}.$$
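
As a small illustration of how measured values and error bounds turn into intervals, here is a hedged Python sketch. The helper name `to_intervals` and the numbers are assumptions made for this example only.

```python
def to_intervals(measured, bounds):
    """Turn each measured value and its error bound Delta_i into (lower, upper).

    measured: measurement results (the x-tilde values on the slide)
    bounds:   upper bounds Delta_i on the absolute measurement error
    """
    return [(x - d, x + d) for x, d in zip(measured, bounds)]


# Hypothetical measurement results and error bounds.
intervals = to_intervals([1.0, 2.0], [0.5, 0.25])
```

Each pair in `intervals` is one interval that is guaranteed to contain the actual (unknown) value, provided the bound on the measurement error holds.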

  5. 4. Case of Interval Uncertainty: What Is Known
     • Estimating the range of a function under interval uncertainty is known as interval computations.
     • The mean $E(x_1, \ldots, x_n) = \frac{1}{n} \cdot \sum_{i=1}^{n} x_i$ is an increasing function of each of its variables $x_1, \ldots, x_n$, hence:
       $$[\underline{E}, \overline{E}] = \left[\frac{1}{n} \cdot \sum_{i=1}^{n} \underline{x}_i, \; \frac{1}{n} \cdot \sum_{i=1}^{n} \overline{x}_i\right].$$
     • For the variance $V$, the situation is more complex:
       – the lower endpoint $\underline{V}$ can be computed in feasible time;
       – in general, computing $\overline{V}$ is NP-hard;
       – for some practically useful situations, there exist efficient algorithms for computing $\overline{V}$.
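
The following Python sketch illustrates both statements under stated assumptions. `mean_range` applies the endpoint formula directly. `upper_variance_bruteforce` is not one of the efficient algorithms the slide refers to: it is a naive enumeration of all $2^n$ endpoint combinations, which works because the variance is a convex function of $(x_1, \ldots, x_n)$ and so attains its maximum over a box at a vertex. All function names are illustrative.

```python
from itertools import product


def mean_range(intervals):
    """Range of the mean: use all lower endpoints, then all upper endpoints."""
    n = len(intervals)
    lo = sum(a for a, _ in intervals) / n
    hi = sum(b for _, b in intervals) / n
    return lo, hi


def variance(xs):
    """The 1/n variance of a tuple of numbers."""
    n = len(xs)
    E = sum(xs) / n
    return sum((x - E) ** 2 for x in xs) / n


def upper_variance_bruteforce(intervals):
    """Largest variance, found by trying every combination of endpoints.

    Exponential in n: usable only for a handful of intervals.
    """
    return max(variance(vertex) for vertex in product(*intervals))


# Hypothetical intervals, for illustration only.
example = [(0.0, 1.0), (0.0, 1.0)]
mean_lo, mean_hi = mean_range(example)
v_upper = upper_variance_bruteforce(example)
```

The exponential cost of the enumeration is consistent with the NP-hardness result quoted on the slide, though it does not by itself prove anything about hardness.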

  6. 5. Need to Consider Dynamic Estimates
     • In practice, processes are dynamic: means and variances change with time.
     • Reasonable estimates should assign more weight to more recent measurements $x_1, \ldots$ and less to the past ones.
     • For each function $y(x)$, we thus take the weighted mean
       $$E[y] \approx \sum_{i=1}^{n} w_i \cdot y(x_i); \qquad w_i \ge 0; \qquad \sum_{i=1}^{n} w_i = 1.$$
     • In particular, for $E[x]$ and $V = E[(x - E)^2]$, we take
       $$E = \sum_{i=1}^{n} w_i \cdot x_i; \qquad V = \sum_{i=1}^{n} w_i \cdot (x_i - E)^2.$$
     • What we do: we extend known algorithms for computing the ranges $\mathbf{E}$ and $\mathbf{V}$ to such dynamic estimates.
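
A minimal Python sketch of the weighted estimates follows. The slides do not say how the weights are chosen; the exponential-forgetting scheme in `exponential_weights` (with a hypothetical decay factor `q`) is only one possible assumption, used here so that the most recent value, listed first, gets the largest weight.

```python
def exponential_weights(n, q=0.9):
    """Hypothetical weight choice: w_i proportional to q**i, normalized to sum to 1.

    Index 0 is treated as the most recent measurement.
    """
    raw = [q ** i for i in range(n)]
    total = sum(raw)
    return [r / total for r in raw]


def weighted_estimates(xs, ws):
    """Return (E, V) with E = sum w_i x_i and V = sum w_i (x_i - E)^2."""
    E = sum(w * x for w, x in zip(ws, xs))
    V = sum(w * (x - E) ** 2 for w, x in zip(ws, xs))
    return E, V


# Hypothetical data, most recent value first.
xs = [4.0, 3.0, 2.0, 1.0]
ws = exponential_weights(len(xs), q=0.5)
E, V = weighted_estimates(xs, ws)
```

With equal weights $w_i = 1/n$, `weighted_estimates` reduces to the static formulas for $E$ and $V$.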

  7. 6. Simplest Case: Estimates for the Mean
     • Since all the weights are non-negative, the function $E = \sum_{i=1}^{n} w_i \cdot x_i$ is an increasing function of all $x_i$.
     • Thus:
       – the smallest possible value $\underline{E}$ is attained when we take the smallest possible values $x_i = \underline{x}_i$, and
       – the largest possible value $\overline{E}$ is attained when we take the largest possible values $x_i = \overline{x}_i$.
     • So, the desired range of $E$ has the form
       $$[\underline{E}, \overline{E}] = \left[\sum_{i=1}^{n} w_i \cdot \underline{x}_i, \; \sum_{i=1}^{n} w_i \cdot \overline{x}_i\right].$$
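
The range formula for the weighted mean translates directly into code. The sketch below assumes intervals are given as `(lower, upper)` pairs and that the weights are non-negative and sum to 1; the function name is illustrative.

```python
def weighted_mean_range(intervals, ws):
    """Range of E = sum w_i x_i when each x_i lies in its (lower, upper) interval.

    Relies on all weights being non-negative, so E is increasing in every x_i.
    """
    lo = sum(w * a for w, (a, _) in zip(ws, intervals))
    hi = sum(w * b for w, (_, b) in zip(ws, intervals))
    return lo, hi


# Hypothetical intervals and weights.
e_lo, e_hi = weighted_mean_range([(0.0, 1.0), (2.0, 4.0)], [0.5, 0.5])
```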

  8. 7. Efficient Algorithm for Computing $\underline{V}$
     • We sort all endpoints $\underline{x}_i$ and $\overline{x}_i$:
       $$r_1 \le r_2 \le \ldots \le r_{2n-1} \le r_{2n}.$$
     • Thus, the real line is divided into $2n + 1$ zones $[r_k, r_{k+1}]$, with $k = 0, 1, \ldots, 2n$ ($r_0 = -\infty$ and $r_{2n+1} = +\infty$).
     • For each zone, we compute $E_k = \frac{N_k}{D_k}$, where
       $$N_k \stackrel{\text{def}}{=} \sum_{i:\, \overline{x}_i \le r_k} w_i \cdot \overline{x}_i + \sum_{j:\, r_{k+1} \le \underline{x}_j} w_j \cdot \underline{x}_j; \qquad D_k = \sum_{i:\, \overline{x}_i \le r_k} w_i + \sum_{j:\, r_{k+1} \le \underline{x}_j} w_j.$$
     • If $E_k \notin [r_k, r_{k+1}]$, we move to the next zone.
     • If $E_k \in [r_k, r_{k+1}]$, we compute $V_k = M_k - D_k \cdot E_k^2$, where
       $$M_k = \sum_{i:\, \overline{x}_i \le r_k} w_i \cdot (\overline{x}_i)^2 + \sum_{j:\, r_{k+1} \le \underline{x}_j} w_j \cdot (\underline{x}_j)^2.$$
     • The smallest of the corresponding values $V_k$ is the desired smallest value $\underline{V}$.
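
The zone-by-zone procedure can be sketched in Python as follows. This is a deliberately simple version that recomputes the sums for every zone, so it runs in quadratic time rather than the faster time discussed in the presentation. Two details are assumptions of this sketch and are not stated on the slide: duplicate endpoints are merged before forming zones, and a zone in which every interval contains the whole zone (so the sum of the selected weights is zero) is treated as giving variance 0, since all values can then coincide.

```python
import math


def lower_variance(intervals, ws):
    """Smallest weighted variance over all x_i in their (lower, upper) intervals.

    Simple O(n^2) sketch: the sums N, D, M are recomputed for each zone.
    Assumes positive weights that sum to 1.
    """
    # Sorted distinct endpoints, padded with -inf and +inf.
    r = [-math.inf] + sorted({e for iv in intervals for e in iv}) + [math.inf]
    best = math.inf
    for k in range(len(r) - 1):
        left, right = r[k], r[k + 1]
        N = D = M = 0.0
        for (lo, hi), w in zip(intervals, ws):
            if hi <= left:        # interval lies to the left of the zone: use upper endpoint
                x = hi
            elif right <= lo:     # interval lies to the right of the zone: use lower endpoint
                x = lo
            else:                 # interval contains the zone: x_i can be set equal to E
                continue
            N += w * x
            D += w
            M += w * x * x
        if D == 0.0:
            return 0.0            # assumption: every interval covers this zone, so V = 0
        E = N / D
        if left <= E <= right:
            best = min(best, M - D * E * E)
    return max(best, 0.0)         # clamp tiny negative rounding errors


# Hypothetical intervals with equal weights.
v_lower = lower_variance([(0.0, 1.0), (2.0, 3.0)], [0.5, 0.5])
```

In the example, the minimum is reached by moving the two values as close together as the intervals allow, that is, at $x_1 = 1$ and $x_2 = 2$.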

  9. 8. Computation Time of This Algorithm
     • Sorting takes time $O(n \cdot \log(n))$.
     • Computing the sums $D_0$, $N_0$, $M_0$ corresponding to the first zone takes linear time $O(n)$.
     • Each new sum is obtained from the previous one by changing a few terms which go from $\underline{x}_i$ to $\overline{x}_i$.
     • Each value $x_i$ changes only once, so we only need linear time in total to compute all these sums.
     • We also need linear time to perform all the auxiliary computations.
     • Thus, the total computation time is
       $$O(n \cdot \log(n)) + O(n) + O(n) = O(n \cdot \log(n)).$$
     • This time can be reduced to $O(n)$ if, instead of sorting, we use the $O(n)$ algorithm for computing the median.
