
Functional Median Polish, with Climate Applications. Marc G. Genton



  1. Functional Median Polish, with Climate Applications. Marc G. Genton, Department of Statistics, Texas A&M University; Program in Spatial Statistics (stat.tamu.edu/pss); Institute for Applied Mathematics and Computational Sciences (iamcs.tamu.edu). Based on joint work with Ying Sun. May 11, 2012.

  2. Functional Median Polish: outline. 1. Motivation; 2. Univariate ANOVA; 3. Functional ANOVA; 4. Simulation Studies; 5. Applications; 6. Discussion.

  3. Observations and Climate Models. Observations: provide a corroborating source of information about the physical processes being modeled; have methodological and practical issues due to uncertainties. Climate models: numerically solve systems of differential equations representing physical relationships in the climate system; have huge uncertainties and biases. Scientific question: how do we compare sources of variability in observations or climate model outputs, that is, how do we quantify the uncertainties?

  4. Spatio-Temporal Precipitation Data. Annual total precipitation data for the U.S. from 1895 to 1997 at 11,918 weather stations. Nine climatic regions for precipitation are defined by the National Climatic Data Center. Several areas of outliers were detected by Sun and Genton (2011, 2012). [Figure: U.S. map with points marked "+" and the nine regions labeled NW, WNC, ENC, NE, C, W, SW, S, SE.]

  5. Analysis of Variance (ANOVA). An important technique for analyzing the effect of categorical factors on a response. It decomposes the variability in the response variable among the different factors. A two-way additive model: for $i = 1, \ldots, r$ and $j = 1, \ldots, c$, $y_{ij} = \mu + \alpha_i + \beta_j + \epsilon_{ij}$. The ANOVA model can be fitted by arithmetic means (suitable when there are no outliers) or by medians (robust).

  6. ANOVA Model Fitting. Fitted by means: $\hat{\mu} = \bar{y}$ (grand effect), $\hat{\alpha}_i = \bar{y}_{i\cdot} - \bar{y}$ (row effect), $\hat{\beta}_j = \bar{y}_{\cdot j} - \bar{y}$ (column effect). Fitted by medians: median polish (Tukey, 1970, 1977), an iterative technique for extracting row and column effects in a two-way table using medians rather than means. It stops when no more changes occur in the row and column effects, or when the changes are sufficiently small.
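The median polish iteration on this slide can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not code from the presentation: the function name `median_polish`, the `max_iter` and `tol` parameters, and the choice to sweep rows before columns are all hypothetical.

```python
from statistics import median

def median_polish(table, max_iter=10, tol=1e-8):
    """Fit y_ij = grand + row_i + col_j + resid_ij by repeated median sweeps.

    Assumed interface: `table` is a list of equal-length rows of numbers.
    """
    resid = [[float(v) for v in row] for row in table]
    r, c = len(resid), len(resid[0])
    grand, row_eff, col_eff = 0.0, [0.0] * r, [0.0] * c
    for _ in range(max_iter):
        # Row sweep: move each row median out of the residuals.
        row_med = [median(row) for row in resid]
        for i in range(r):
            row_eff[i] += row_med[i]
            resid[i] = [v - row_med[i] for v in resid[i]]
        # Keep the column effects centred at median zero.
        shift = median(col_eff)
        grand += shift
        col_eff = [b - shift for b in col_eff]
        # Column sweep: move each column median out of the residuals.
        col_med = [median(resid[i][j] for i in range(r)) for j in range(c)]
        for j in range(c):
            col_eff[j] += col_med[j]
            for i in range(r):
                resid[i][j] -= col_med[j]
        # Keep the row effects centred at median zero.
        shift = median(row_eff)
        grand += shift
        row_eff = [a - shift for a in row_eff]
        # Stop once a full sweep no longer changes anything.
        if max(abs(m) for m in row_med + col_med) <= tol:
            break
    return grand, row_eff, col_eff, resid

# Purely additive table: 10 + row effect (-1, 0, 2) + column effect (-3, 0, 1).
table = [[6, 9, 10], [7, 10, 11], [9, 12, 13]]
grand, row_eff, col_eff, resid = median_polish(table)
```

For any input, the four returned pieces add back up to the original table, so the residuals show what the additive model leaves unexplained.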

  7. Median Polish Example. Step 1: original table; find row medians. Step 2 (1st iteration): subtract row medians, find column medians. The grand median is shown in red, row effects in blue, column effects in green. Step 3 (2nd iteration): subtract column medians, find row medians. Step 4: subtract the new row medians, add their median to the grand median, find column medians. Step 5: polished table; the new row and column medians are zero after two iterations. [Tables: the worked numeric tables for each step are not recoverable from this transcript.]

  8. Functional Median Polish. Observe functional data at each combination of two categorical factors and examine their effects: functional row or column effects. Model: $y_{ijk}(x) = \mu(x) + \alpha_i(x) + \beta_j(x) + \epsilon_{ijk}(x)$, where $i = 1, \ldots, r$, $j = 1, \ldots, c$, $k = 1, \ldots, m_{ij}$. Constraints: $\mathrm{median}_i\{\alpha_i(x)\} = 0$, $\mathrm{median}_j\{\beta_j(x)\} = 0$, and $\mathrm{median}_i\{\epsilon_{ijk}(x)\} = \mathrm{median}_j\{\epsilon_{ijk}(x)\} = 0$ for all $k$. Here $x$ can be time for curves or a spatial index for surfaces/images. The fit is an iterative procedure sweeping out column and row medians. One-way functional ANOVA can be done in a similar way. All of this requires a way to order functional data.
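The sweep of functional row and column medians might look like the sketch below. It rests on several assumptions that the slide does not state: one curve per cell ($m_{ij} = 1$), curves stored as lists on a common grid, at least two rows and two columns, and the functional median taken as the sample curve with the largest modified band depth for pairs of curves. The names `mbd`, `fmedian` and `functional_median_polish` are hypothetical.

```python
from itertools import combinations

def mbd(y, curves):
    """Modified band depth of y with respect to all pairs of curves (assumed j = 2)."""
    pairs = list(combinations(curves, 2))
    score = sum(
        sum(min(u, v) <= yt <= max(u, v) for yt, u, v in zip(y, a, b)) / len(y)
        for a, b in pairs
    )
    return score / len(pairs)

def fmedian(curves):
    """Deepest sample curve; ties go to the first curve (an arbitrary choice)."""
    return list(max(curves, key=lambda y: mbd(y, curves)))

def add(a, b):
    return [x + y for x, y in zip(a, b)]

def sub(a, b):
    return [x - y for x, y in zip(a, b)]

def functional_median_polish(cells, max_iter=10, tol=1e-8):
    """cells[i][j] is one curve; returns mu, alpha (rows), beta (columns), residuals."""
    r, c, p = len(cells), len(cells[0]), len(cells[0][0])
    resid = [[[float(v) for v in cells[i][j]] for j in range(c)] for i in range(r)]
    mu = [0.0] * p
    alpha = [[0.0] * p for _ in range(r)]
    beta = [[0.0] * p for _ in range(c)]
    for _ in range(max_iter):
        # Sweep out the functional median of each row.
        row_med = [fmedian(resid[i]) for i in range(r)]
        for i in range(r):
            alpha[i] = add(alpha[i], row_med[i])
            resid[i] = [sub(curve, row_med[i]) for curve in resid[i]]
        shift = fmedian(beta)
        mu, beta = add(mu, shift), [sub(b, shift) for b in beta]
        # Sweep out the functional median of each column.
        col_med = [fmedian([resid[i][j] for i in range(r)]) for j in range(c)]
        for j in range(c):
            beta[j] = add(beta[j], col_med[j])
            for i in range(r):
                resid[i][j] = sub(resid[i][j], col_med[j])
        shift = fmedian(alpha)
        mu, alpha = add(mu, shift), [sub(a, shift) for a in alpha]
        # Stop once the swept medians are numerically zero curves.
        if max(abs(v) for curve in row_med + col_med for v in curve) <= tol:
            break
    return mu, alpha, beta, resid
```

Because the assumed median is always one of the observed curves, each sweep subtracts an actual curve rather than a pointwise summary; whether that matches the authors' procedure in every detail is not confirmed by the slides.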

  9. Multivariate Ordering. The basic idea of depth in a functional context provides a method to order sample curves according to decreasing depth values: $y_{[1]}$ is the deepest (most central or median) curve, $y_{[n]}$ is the most outlying (least representative) curve, and $y_{[1]}, \ldots, y_{[n]}$ are ordered from the center outwards. Usual order statistics, by contrast, are ordered from the smallest sample value to the largest.

  10. Band Depth for Functional Data. López-Pintado and Romo (2009) introduced the band depth (BD) concept through a graph-based approach. In the figure, the grey area is the band determined by two curves, $y_1$ and $y_3$; it contains the curve $y_2$ but does not contain $y_4$. [Figure: four curves $y_1, \ldots, y_4$ plotted as $y(t)$ against $t \in [0, 1]$.]

  11. Band Depth for Functional Data. Population version of $BD^{(2)}$: $BD^{(2)}(y, P) = P\{G(y) \subset B(Y_1, Y_2)\}$, where $G(y)$ is the graph of the curve $y$ and $B(Y_1, Y_2)$ is the band delimited by 2 random curves. The band could be delimited by more than 2 random curves: $BD_J(y, P) = \sum_{j=2}^{J} BD^{(j)}(y, P)$.

  12. Sample Band Depth. At the population level, $BD^{(j)}(y, P)$ is a probability. Sample version of $BD^{(j)}(y, P)$: $BD_n^{(j)}(y) = \binom{n}{j}^{-1} \sum_{1 \le i_1 < i_2 < \cdots < i_j \le n} I\{G(y) \subseteq B(y_{i_1}, \ldots, y_{i_j})\}$, where $I\{\cdot\}$ is the indicator function; this is the fraction of the bands completely containing the curve $y$. Sample BD: $BD_{n,J}(y) = \sum_{j=2}^{J} BD_n^{(j)}(y)$.
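The sample band depth can be computed directly from its definition when the curves are observed on a common grid. The sketch below assumes each curve is a list of values on that grid and checks containment only at the grid points; the names `band_depth` and `band_depth_sum` are hypothetical.

```python
from itertools import combinations

def band_depth(y, curves, j=2):
    """Fraction of the j-curve bands that contain y at every grid point."""
    inside = total = 0
    for group in combinations(curves, j):
        total += 1
        lower = [min(vals) for vals in zip(*group)]
        upper = [max(vals) for vals in zip(*group)]
        # The band is closed: touching its boundary counts as contained.
        if all(lo <= v <= hi for lo, v, hi in zip(lower, y, upper)):
            inside += 1
    return inside / total

def band_depth_sum(y, curves, J=2):
    """Sample BD: sum of the j-curve band depths for j = 2, ..., J."""
    return sum(band_depth(y, curves, j) for j in range(2, J + 1))

curves = [[0, 0, 0], [1, 1, 1], [2, 2, 2], [0, 3, 0]]
depth_flat = band_depth([1, 1, 1], curves)   # 4 of the 6 pairs contain it
depth_peak = band_depth([0, 3, 0], curves)   # 3 of the 6 pairs contain it
```

Enumerating all subsets costs on the order of $\binom{n}{j}$ band checks, so this brute-force form is only practical for small samples.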

  13. Modified Band Depth. López-Pintado and Romo (2009) also proposed a more flexible definition, the modified band depth (MBD). Compare $BD_n^{(j)}(y) = \binom{n}{j}^{-1} \sum_{1 \le i_1 < i_2 < \cdots < i_j \le n} I\{G(y) \subseteq B(y_{i_1}, \ldots, y_{i_j})\}$ with $MBD_n^{(j)}(y) = \binom{n}{j}^{-1} \sum_{1 \le i_1 < i_2 < \cdots < i_j \le n} \lambda_r\{A(y; y_{i_1}, \ldots, y_{i_j})\}$. Here $\lambda_r\{A(y; y_{i_1}, \ldots, y_{i_j})\}$ measures the proportion of time that the curve $y$ is in the band.
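A grid-based version of the modified band depth, and the center-outward ordering it induces, can be sketched as follows. The sketch assumes $j = 2$, curves stored as lists on a common equally spaced grid, and the proportion of time in the band approximated by the share of grid points inside it; `modified_band_depth` and `order_by_depth` are hypothetical names.

```python
from itertools import combinations

def modified_band_depth(y, curves):
    """Average share of grid points at which y lies inside a pair's band."""
    pairs = list(combinations(curves, 2))
    total = 0.0
    for a, b in pairs:
        inside = sum(min(u, v) <= yt <= max(u, v) for yt, u, v in zip(y, a, b))
        total += inside / len(y)
    return total / len(pairs)

def order_by_depth(curves):
    """Sort curves from the deepest (median) to the most outlying."""
    return sorted(curves, key=lambda y: modified_band_depth(y, curves), reverse=True)

curves = [[0, 0, 0], [1, 1, 1], [2, 2, 2], [0, 3, 0]]
ordered = order_by_depth(curves)
median_curve = ordered[0]   # deepest curve in the sample
```

Unlike the band depth, a curve that leaves a band for only part of the domain still earns partial credit, which tends to produce fewer ties among curves that cross each other.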

  14. Functional Boxplots (Sun and Genton, 2011, 2012). [Figure: two panels of sea surface temperature against month (1 to 12), one titled "Functional Boxplot"; curves labeled 1982, 1983, 1997 and 1998.]

  15. Surface Boxplot

  16. True Model. Generate data from a true model with $r = 2$, $c = 3$, and $m = 100$ curves in each cell at $p = 50$ time points. [Figure: three panels showing the true grand effect, true row effect and true column effect against $x \in [0, 1]$.] Introduce outliers through a Gaussian process $\epsilon_{ijk}(t)$. Replications: 1,000.

  17. Outlier Models. Model 1: $\epsilon_{ijk}(t) = e_{ijk}(t)$, where $e_{ijk}(t) \sim GP(0, \gamma)$ with $\gamma(t_1, t_2) = \exp\{-|t_2 - t_1|\}$. Model 2: $\epsilon_{ijk}(t) = e_{ijk}(t) + c_{ijk} K$, where $c_{ijk}$ is 1 with probability $q_{ij}$ and 0 with probability $1 - q_{ij}$, and $q_{ij}$ is different for each cell. Model 3: $\epsilon_{ijk}(t) = e_{ijk}(t) + c_{ijk} K$ if $t \ge T_{ijk}$ and $\epsilon_{ijk}(t) = e_{ijk}(t)$ if $t < T_{ijk}$, where $T_{ijk} \sim U(0, 1)$. [Figure: sample curves $y$ against $t \in [0, 1]$ for Models 1, 2 and 3.]
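The three error models can be simulated with the standard library alone. The sketch relies on the fact that a zero-mean Gaussian process with covariance $\exp\{-|t_2 - t_1|\}$ is Markov, so it can be drawn exactly on an increasing grid by a first-order recursion. The slides do not give values for $K$ or $q_{ij}$, so both are plain arguments here; the function names and the evenly spaced 50-point grid on $[0, 1]$ are assumptions.

```python
import math
import random

def gp_exponential(grid, rng):
    """Draw from GP(0, exp(-|t2 - t1|)) on an increasing grid via an AR(1) recursion."""
    path = [rng.gauss(0.0, 1.0)]
    for t_prev, t_next in zip(grid, grid[1:]):
        rho = math.exp(-(t_next - t_prev))
        noise = rng.gauss(0.0, 1.0)
        path.append(rho * path[-1] + math.sqrt(1.0 - rho * rho) * noise)
    return path

def simulate_error(model, grid, q, K, rng):
    """One error curve under Model 1, 2 or 3; q and K are user-supplied (not from the slides)."""
    e = gp_exponential(grid, rng)
    if model == 1:
        return e
    c = 1 if rng.random() < q else 0   # contamination indicator c_ijk
    if model == 2:
        return [v + c * K for v in e]  # whole curve shifted by K
    if model == 3:
        T = rng.uniform(0.0, 1.0)      # shift starts at a random time T_ijk
        return [v + c * K if t >= T else v for t, v in zip(grid, e)]
    raise ValueError("model must be 1, 2 or 3")

grid = [k / 49 for k in range(50)]     # assumed grid: 50 equally spaced points on [0, 1]
rng = random.Random(2012)
curve = simulate_error(3, grid, q=0.1, K=8.0, rng=rng)
```

Adding such error curves to the grand, row and column effects for each cell would give one simulated data set; repeating that over many seeds mirrors the replication design described on the previous slide.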

  18. Simulations: Model 1. [Figure: grand effect, row effect 1 and row effect 2 against $t \in [0, 1]$ for Model 1, with one row of panels labeled "Median" and one labeled "Mean".]
