Introduction One density Several densities Conclusions References

Optimal representation of bivariate density functions by level sets

Pedro Delicado, Universitat Politècnica de Catalunya, Barcelona, Spain
Philippe Vieu, Université Paul Sabatier, Toulouse, France

7th Journées Statistiques du Sud, Barcelona, June 2014

Optimal level sets for bivariate densities 1/52 Pedro Delicado and Philippe Vieu
Aim of the talk
• Bivariate density functions are usually represented by a few level sets, for instance those with probability content equal to .25, .5 and .75.
• In this work we deal with choosing which level sets provide the best graphical representation of a single bivariate density, according to certain optimality criteria.
[Figure: four panels. (a), (b): surfaces f(x,y) over x and y. (c), (d): contour plots on [−3, 3] × [−3, 3]; contour labels include 0.1, 0.25, 0.5, 0.75 and 0.95.]
Outline
1 Introduction
2 Optimal level sets for a single density
   Optimality based on distances between density level sets
   Optimality based on distances between bivariate densities
   Difficulties with managing several densities
3 Optimal representation of several densities by level sets
   Representation with a single level set
   Representation with more than one level set
   Case of estimated densities
   Some Monte Carlo experiments
4 Conclusions
Introduction
• Let $f$ be a bivariate probability density function.
• For $\alpha \in\, ]0,1[$, the density level set with probability content $\alpha$ is
  $C_\alpha = \{x \in \mathbb{R}^2 : f(x) \ge \gamma_\alpha\}$,
  where $\gamma_\alpha$ is such that $\int_{C_\alpha} f(x)\,dx = \alpha$.
• A standard way to represent the bivariate density $f$ graphically is to draw, in the same graphic, the density level sets corresponding to several values $\alpha_1, \ldots, \alpha_J$, or just their boundaries.
• Problem: Given a bivariate density function $f$ (respectively, $N$ densities $f_1, \ldots, f_N$), choose $\alpha_1, \ldots, \alpha_J$ defining the best (in a sense to be specified) graphical representation of $f$ (resp., $f_1, \ldots, f_N$).
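The definition of the level set with probability content α can be checked numerically. The sketch below is only an illustration and not part of the talk: it assumes the density is available on a regular grid, and the helper name `level_threshold` and the choice of a standard bivariate normal are hypothetical. For that particular density the level sets are disks and the threshold has the closed form γ_α = (1 − α)/(2π), which gives a reference value to compare against.

```python
import numpy as np

def level_threshold(density, cell_area, alpha):
    """Approximate gamma_alpha: the grid cells with density >= gamma hold mass ~ alpha."""
    values = np.sort(density.ravel())[::-1]       # highest density first
    mass = np.cumsum(values) * cell_area           # probability content of the top cells
    index = min(np.searchsorted(mass, alpha), values.size - 1)
    return values[index]                           # density of the last cell included

# Illustrative density (an assumption of this sketch): standard bivariate normal.
grid = np.linspace(-6.0, 6.0, 601)
x, y = np.meshgrid(grid, grid)
f = np.exp(-0.5 * (x**2 + y**2)) / (2.0 * np.pi)
cell_area = (grid[1] - grid[0]) ** 2

alphas = (0.25, 0.5, 0.75)
gammas = [level_threshold(f, cell_area, a) for a in alphas]

# The level set C_alpha is then approximated by the boolean mask f >= gamma_alpha.
masks = [f >= g for g in gammas]
```

Sorting the grid values once and accumulating mass from the highest density downwards avoids a root search for each α; the approximation error depends on the grid resolution.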
Why is it worth choosing $\alpha_1, \ldots, \alpha_J$ carefully?
• A usual way to represent a bivariate density function is to plot $J = 3$ of its density level sets, those corresponding to $\alpha = 1/4$, $1/2$ and $3/4$ (by analogy with univariate boxplots).
• Bowman and Azzalini (1997) call these plots 'sliceplots'.
• A relevant question is whether the choice $\alpha = 1/4$, $1/2$ and $3/4$ is sensible, or whether a better alternative exists.
Why is it worth choosing $\alpha_1, \ldots, \alpha_J$ carefully? (Cont.)
• An additional important reason: representing each of $N$ bivariate density functions by a single density level set $C_\alpha$ ($J = 1$) allows us to draw more than one bivariate density function in the same graphic.
• So it is worthwhile to make a good choice of $\alpha$.
• This kind of graphic is helpful in different situations, as the following examples illustrate.
Example: Aircraft data (Bowman and Azzalini 1997)
• Bowman and Azzalini (1997) study six characteristics of 709 aircraft designs from the periods 1914-1935, 1936-1955 and 1956-1984.
• They obtain the first two principal components (identified as “size” and “speed adjusted by size”, respectively) and represent their joint density using a single level set ($\alpha = 0.75$) for each period.
• A single graphic summarizes the way aircraft designs evolved over the last century.
[Figure: one level set, labeled 75, for each of the periods 1914-1935, 1936-1955 and 1956-1984; axes Comp.1 and Comp.2.]
Dynamic representation of many density functions
• Consider the case where the number of bivariate density functions to be represented is large.
• Assume that they are sorted by the time at which they were observed and that the elapsed time between two consecutive densities is short.
• A convenient way to represent them is an animated graphic, where each frame shows one bivariate density.
• In this case it is appropriate to represent each density by a few (3, for instance) density level sets.
• The animated graphic shows how the level sets evolve over time.
Example: Aircraft data. Animated graphic.
[Figure: animation frame for the period 1914-1935 with level sets labeled 25, 50 and 75; axes Comp.1 and Comp.2.]
Showing results of FPCA for bivariate densities
• Assume that a functional principal component analysis (FPCA) is performed on a sample of bivariate densities $f_1, \ldots, f_N$.
• In FPCA for one-dimensional functions it is standard to represent the principal functions graphically by superimposing three functions in the same plot: the mean function, and the mean function plus and minus the principal function (multiplied by a constant).
• To produce a similar graphic for bivariate density functions we need a way to represent three such functions in the same graphic.
• Using one level set to represent each of them is a simple and effective way to do it.
Optimal level sets for a single density
• We consider first the problem of representing only one density by some of its density level sets.
• We assume that $J$ has been fixed in advance and we want to make the best choice of $\alpha_1, \ldots, \alpha_J$.
• There is no unique way to specify what "best" means.
• We examine two possibilities:
  1 Choosing the $J$ density level sets that best represent the whole family of level sets $\{C_\alpha : \alpha \in\, ]0,1[\}$, in the sense that each non-plotted $C_\alpha$ is close to the nearest level set among those that are plotted: $C_{\alpha_1}, \ldots, C_{\alpha_J}$.
  2 Each collection of level sets $C_{\alpha_1}, \ldots, C_{\alpha_J}$ defines in a natural way a piecewise uniform bivariate density function. We propose to minimize in $\alpha_1, \ldots, \alpha_J$ the distance between this piecewise uniform density and the density that we want to represent by $C_{\alpha_1}, \ldots, C_{\alpha_J}$.
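The second possibility can be made concrete with a small numerical sketch. It is not taken from the talk and rests on several assumptions that are ours: the density is given on a regular grid, the piecewise uniform function takes on each ring between consecutive level sets the value (probability content of the ring) / (area of the ring) and is zero outside the largest level set, and the distance is the squared L2 distance. The helper names `level_threshold` and `squared_l2_to_piecewise_uniform` are hypothetical.

```python
import numpy as np

def level_threshold(density, cell_area, alpha):
    """Approximate gamma_alpha: the grid cells with density >= gamma hold mass ~ alpha."""
    values = np.sort(density.ravel())[::-1]
    mass = np.cumsum(values) * cell_area
    index = min(np.searchsorted(mass, alpha), values.size - 1)
    return values[index]

def squared_l2_to_piecewise_uniform(density, cell_area, alphas):
    """Squared L2 distance between the density and its ring-wise constant approximation."""
    gammas = [level_threshold(density, cell_area, a) for a in sorted(alphas)]
    approx = np.zeros_like(density)                # assumption: zero outside the largest set
    upper = np.inf
    for gamma in gammas:                           # thresholds decrease as alpha grows
        ring = (density >= gamma) & (density < upper)
        if ring.any():
            # Mean of the density over the ring equals ring mass / ring area on the grid.
            approx[ring] = density[ring].mean()
        upper = gamma
    return float(np.sum((density - approx) ** 2) * cell_area)

# Illustrative density (an assumption of this sketch): standard bivariate normal.
grid = np.linspace(-5.0, 5.0, 201)
x, y = np.meshgrid(grid, grid)
f = np.exp(-0.5 * (x**2 + y**2)) / (2.0 * np.pi)
cell_area = (grid[1] - grid[0]) ** 2

d_one = squared_l2_to_piecewise_uniform(f, cell_area, (0.5,))
d_three = squared_l2_to_piecewise_uniform(f, cell_area, (0.25, 0.5, 0.75))
```

Under these assumptions, a choice of $\alpha_1, \ldots, \alpha_J$ could be compared with another by evaluating this function for each candidate, for instance over a coarse grid of values in ]0,1[. A different distance or a different treatment of the region outside the largest level set would change the resulting optimum.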
Distances between level sets