Two new approaches to smoothing over complex regions David Lawrence Miller Mathematical Sciences University of Bath useR! 2009, Rennes
Outline Smoothing over complex regions Intro Solutions Schwarz-Christoffel transform Multidimensional Scaling Details Simulation Results Conclusions
Smoothing in 2 dimensions ◮ We have some geographical region and wish to find out something about the biological population in it. ◮ The response is e.g. animal distribution; we wish to predict it based on (x, y) and other covariates, e.g. habitat, size, sex. ◮ This problem is relatively easy if the domain is simple.
Smoothing over complex domains ◮ Smoothing over complex domains makes this a lot more difficult. ◮ Problem of leakage. ◮ Euclidean distance doesn’t always make sense. ◮ Models need to incorporate information about the intrinsic structure of the domain. [Figure: two contour plots, the (modified) Ramsay test function and a thin plate spline fit]
Smoothing with penalties ◮ Objective function takes the form: $\sum_{i=1}^{n} \left( z_i - f(x_i, y_i; \theta) \right)^2 + \lambda \int_{\Omega} P f(x, y; \theta) \, d\Omega$ ◮ $f$ is the function you want to estimate, made up of some combination of basis functions. ◮ $P$ is some squared derivative penalty operator, usually $P = \left( \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} \right)^2$. ◮ This can be generalized to an additive model or GAM.
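As a minimal sketch of the penalized objective above, the following one-dimensional analogue replaces the integral of the squared derivative penalty with a sum of squared second differences on a grid. This discretisation is an assumption made for illustration and is not the thin plate spline penalty itself; the names `fit_penalized` and `lam` are hypothetical.

```python
import numpy as np

def fit_penalized(z, lam):
    """Minimise sum (z_i - f_i)^2 + lam * sum (second differences of f)^2."""
    n = len(z)
    # D2 is the (n-2) x n second-difference matrix, a discrete stand-in for d^2/dx^2.
    D2 = np.diff(np.eye(n), n=2, axis=0)
    # Normal equations of the penalised least-squares problem.
    return np.linalg.solve(np.eye(n) + lam * D2.T @ D2, z)

rng = np.random.default_rng(0)
x = np.linspace(0, 1, 50)
z = np.sin(2 * np.pi * x) + rng.normal(scale=0.3, size=x.size)

f_rough = fit_penalized(z, lam=0.0)   # no penalty: reproduces the data exactly
f_smooth = fit_penalized(z, lam=1e6)  # heavy penalty: close to a straight line
```

The smoothing parameter plays the same role as $\lambda$ in the objective: at zero the fit interpolates the data, and as it grows the penalty term dominates and the fit flattens.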
Possible solutions to leakage problems ◮ FELSPLINE (Ramsay, 2002) ◮ Domain morphing (Eilers, 2006) ◮ Within-area distance (Wang and Ranalli, 2007) ◮ Soap film smoothers (Wood et al., 2008)
Why morph the domain? ◮ Takes into account within-area distance. ◮ Gives a known domain that is easier to smooth over. ◮ Potentially less computationally intensive. However: ◮ Does not maintain isotropy: the distribution of points is odd. ◮ Not clear what this does to the smoothness penalty.
The Schwarz-Christoffel transform ◮ Take a polygon in some domain W and morph it to a new domain W∗. ◮ We then have a function for the mapping, ϕ(x, y). ◮ ϕ(x, y) is a conformal mapping. ◮ Do this by starting at the new domain and working back to the polygon. ◮ Can draw a polygonal bounding box around some arbitrary shape. [Figure: the domains W and W∗ linked by the mapping ϕ and its inverse ϕ⁻¹]
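For reference, the standard Schwarz-Christoffel formula (as given in Driscoll and Trefethen) maps the upper half-plane onto a polygon with $n$ vertices and interior angles $\alpha_k \pi$. The symbols $A$, $C$, $z_k$ and $\alpha_k$ are notation introduced here rather than taken from the slides, and the slides do not state which canonical domain was used, so this is only a sketch of the general form:

```latex
f(z) = A + C \int^{z} \prod_{k=1}^{n-1} (\zeta - z_k)^{\alpha_k - 1} \, d\zeta
```

Here the $z_k$ are the prevertices on the real axis (with the last one placed at infinity), and $A$ and $C$ are complex constants fixing the position, scale and rotation of the polygon. The formula runs from the canonical domain to the polygon, which matches the "start at the new domain and work back" description above.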
The mapping ◮ Use a bounding box around the horseshoe. [Figure: bounding polygon with vertices numbered 1 to 8] ◮ Morphing the horseshoe shape still gives a slightly odd domain; however, we are still doing better than before.
Horseshoe plots [Figure: contour plots in four panels: Truth, Soap film, SC+PS, SC+TPRS]
Problems ◮ Small: ◮ Implementation is Matlab+R. (YUCK!) ◮ BIG : ◮ Weird artifacts. ◮ Morphing of domain appears to cause features to be smoothed over. ◮ Arbitrary selection of vertices.
A more realistic domain [Figure: four panels on x and y axes: truth, sc+tprs prediction, tprs prediction, soap prediction]
A more realistic domain - what’s happening? ◮ Weird “crowding” effect. ◮ Different with each vertex choice. All bad.
Multidimensional scaling and within-area distances ◮ Idea: use MDS to arrange points in the domain according to their “within-domain distance”. Scheme: ◮ First need to find the within-area distances. ◮ Perform MDS on the matrix of within-area distances. ◮ Smooth over the new points.
Multidimensional scaling refresher ◮ Double centre the matrix of between-point distances, $D$ (subtract row and column means), then find $DD^{T}$. ◮ Finds a configuration of points such that the Euclidean distance between points in the new arrangement is approximately the same as the distance in the domain. ◮ Already implemented in R by cmdscale. [Figure: points in the original coordinates (x, y) and in the MDS coordinates (newcoords[,1], newcoords[,2])]
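The slide points to R's cmdscale; as a rough NumPy sketch of the same idea, the block below assumes the usual classical (Torgerson) formulation, in which the squared distances are double-centred and the result is eigendecomposed. The function name `classical_mds` is hypothetical and this is not a port of the R code.

```python
import numpy as np

def classical_mds(D, k=2):
    """Classical (Torgerson) scaling of an n x n distance matrix D into k dimensions."""
    n = D.shape[0]
    # Centring matrix: J @ M @ J subtracts row and column means and adds back the grand mean.
    J = np.eye(n) - np.ones((n, n)) / n
    # Double-centred squared distances give the inner-product (Gram) matrix.
    B = -0.5 * J @ (D ** 2) @ J
    # eigh returns eigenvalues in ascending order; keep the k largest.
    vals, vecs = np.linalg.eigh(B)
    idx = np.argsort(vals)[::-1][:k]
    # Clip negative eigenvalues, which can arise when D is not Euclidean.
    return vecs[:, idx] * np.sqrt(np.clip(vals[idx], 0.0, None))

# Check on points whose distances are exactly Euclidean in 2D.
rng = np.random.default_rng(1)
pts = rng.uniform(size=(20, 2))
D = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1)
new_coords = classical_mds(D, k=2)
D_new = np.linalg.norm(new_coords[:, None, :] - new_coords[None, :, :], axis=-1)
```

When the input distances are Euclidean, as in this check, the new configuration reproduces them up to rotation and reflection. With within-area distances the match is only approximate, which is the sense of "approximately the same" on the slide.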
Finding within-area distances ◮ Use a new algorithm to find the within-area distances. [Figure: four panels on x and y axes]
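The slides do not describe the new algorithm, so the block below is not it. It swaps in a simple stand-in, shortest paths along a grid graph restricted to the domain, computed with Floyd-Warshall, purely to illustrate what a within-area distance is. The function name `within_area_distances` and the toy U-shaped grid are assumptions.

```python
import numpy as np

def within_area_distances(inside):
    """Approximate within-area distances between the True cells of a boolean grid.

    Cells are joined to their 4 neighbours; paths may not leave the domain.
    """
    cells = np.argwhere(inside)                 # (m, 2) array of (row, col) indices
    m = len(cells)
    dist = np.full((m, m), np.inf)
    np.fill_diagonal(dist, 0.0)
    # Adjacent cells (Manhattan distance 1) are one grid step apart.
    manhattan = np.abs(cells[:, None, :] - cells[None, :, :]).sum(axis=-1)
    dist[manhattan == 1] = 1.0
    # Floyd-Warshall: allow each cell in turn as an intermediate stop.
    for k in range(m):
        dist = np.minimum(dist, dist[:, [k]] + dist[[k], :])
    return cells, dist

# A small U-shaped domain: two arms joined along the bottom row.
inside = np.array([
    [1, 0, 1],
    [1, 0, 1],
    [1, 1, 1],
], dtype=bool)
cells, dist = within_area_distances(inside)

top_left = np.flatnonzero((cells == [0, 0]).all(axis=1))[0]
top_right = np.flatnonzero((cells == [0, 2]).all(axis=1))[0]
euclidean = np.linalg.norm(cells[top_left] - cells[top_right])
within = dist[top_left, top_right]
```

In this toy domain the tips of the two arms are 2 grid units apart in a straight line but 6 steps apart when the path must stay inside the domain. Grid paths overestimate the true shortest path, so a finer grid or diagonal moves would be needed for anything beyond illustration.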
Ramsay simulations [Figure: four panels on x and y axes: truth, MDS, tprs, soap]
A different domain [Figure: four panels on x and y axes: truth, mds, tprs, soap]
Conclusions ◮ The S-C transform does not seem to have much utility. ◮ MDS shows more promise and is easier to transfer to higher dimensions. ◮ MDS does not impose strict boundary conditions, so leakage is still possible. ◮ Pushing the data into more dimensions might be useful to separate points. ◮ After the initial “transform” calculation, both methods use only the same computational time as a thin plate regression spline. (Soap is expensive.)
References ◮ S.N. Wood, M.V. Bravington, and S.L. Hedley. Soap film smoothing. JRSSB, 2008. ◮ H. Wang and M.G. Ranalli. Low-rank smoothing splines on complicated domains. Biometrics, 2007. ◮ T.A. Driscoll and L.N. Trefethen. Schwarz-Christoffel Mapping. Cambridge, 2002. ◮ T. Ramsay. Spline smoothing over difficult regions. JRSSB, 2002. ◮ P.H.C. Eilers. P-spline smoothing on difficult domains. University of Munich seminar, 2006. ◮ J.C. Gower. Adding a point to vector diagrams in multivariate analysis. Biometrika, 1968. Slides available at http://people.bath.ac.uk/dlm27