Imprecise Compositional Data Analysis: Alternative Statistical Methods
Michael Smithson
The Australian National University
2-6 July 2019 / SIPTA 2019
Smithson (The Australian National University) Imprecise Compositional Data Analysis SIPTA19 1 / 11

Introduction: Statistical methods for analyzing imprecise compositional data
Compositional data must sum to a constant value, e.g., probabilities that must sum to 1.
Statistical methods for analyzing imprecise compositional data are relatively under-developed.
The following approaches are considered here:
- Log-ratio transforms (well-established, starting with Aitchison, 1982)
- Dirichlet regression (also well-established, including the IDM)
- Probability-ratio transforms (under development by the author)

Introduction: Compositional data
Given a composition consisting of $K$ parts, suppose that we have $N$ collections of points in the $K$-simplex, $0 \le \pi_{ki}^{(j_i)} \le 1$, for $k = 1, \dots, K$ and $i = 1, \dots, N$, such that for each $i$ they sum to 1 across the $k$.
For the $i$th collection, there are $J_i$ points, indexed by the bracketed $j_i$ superscript.
Our main topic is how to connect these collections with regression or generalized linear models (GLMs) that treat them as dependent variables.

Log-Ratio Transform Method: Basics
The log-ratio transform method maps data from the simplex to an unrestricted vector space, via the logit transform of odds.
Suppose the $K$th composition part is the part of the composition against which we would like to compare the other parts. Then Aitchison's "additive log-ratio" transform would yield

$$\eta_{ki}^{(j_i)} = \log\left[ \left( \frac{\pi_{ki}^{(j_i)}}{1 - \pi_{ki}^{(j_i)}} \right) \Bigg/ \left( \frac{\pi_{Ki}^{(j_i)}}{1 - \pi_{Ki}^{(j_i)}} \right) \right], \tag{1}$$

for $k = 1, \dots, K - 1$. The $\eta_{ki}^{(j_i)}$ are considered as continuous random variables on the real line, and therefore may be analysed with appropriate statistical methods for such variables.
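As a rough illustration of Eq. (1), the sketch below computes the log of each part's odds relative to the odds of the last part, for a single point on the simplex. The function name `log_relative_odds` and the example composition are assumptions made for this illustration and do not come from the slides. The function rejects parts equal to 0 or 1, since the logarithm is undefined there.

```python
import math


def log_relative_odds(composition):
    """Log of each part's odds relative to the odds of the last (base) part.

    `composition` is one point on the simplex: K values in (0, 1) summing to 1.
    Returns K - 1 real numbers, one per non-base part.
    """
    if abs(sum(composition) - 1.0) > 1e-9:
        raise ValueError("parts must sum to 1")
    if any(p <= 0.0 or p >= 1.0 for p in composition):
        # log(0) and division by zero make the transform undefined here
        raise ValueError("every part must lie strictly between 0 and 1")

    def logit(p):
        return math.log(p / (1.0 - p))

    base_logit = logit(composition[-1])
    return [logit(p) - base_logit for p in composition[:-1]]


# Hypothetical three-part composition, with the third part as the base.
eta = log_relative_odds([0.2, 0.3, 0.5])
print(eta)
```

The returned values are unbounded reals, so under these assumptions they could be passed to an ordinary linear model as responses.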

Log-Ratio Transform Method: Advantages
The log-ratio framework enjoys several attractive properties that account for its popularity.
- Subcompositional coherence means that the inferential outcomes of an analysis of any subcomposition should remain the same for that analysis in the entire composition.
- Permutation invariance guarantees that outcomes remain the same regardless of the ordering of the components in a composition.
- It is straightforward to use because the log-ratios can be analyzed with conventional methods such as linear regression with Gaussian errors.

Log-Ratio Transform Method: Disadvantages
The log-ratio framework also has some limitations:
- It cannot deal with zeros in the data.
- It is unable to extend to non-Gaussian distributions without adding more parameters.
- Dispersion is routinely ignored in the log-ratio framework.

Dirichlet Method: Basics
Dirichlet regression models are a natural and popular choice for modeling compositional data. These models have two main limitations.
- The marginal distributions are beta distributions sharing the same precision parameter, so all parts of the composition must have the same submodel for their precisions. This limits the model's ability to capture multivariate heteroskedasticity.
- A single Dirichlet distribution can model only negative associations among the variables, although this restriction may be relaxed when covariates are modeled or other kinds of mixture models are employed.
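Both limitations can be read off the standard closed-form moments of a Dirichlet distribution. The sketch below is an illustration only: the function name `dirichlet_summary` and the parameter vector are assumptions, not part of the slides. It uses the textbook facts that part k has a Beta(alpha_k, alpha_0 - alpha_k) marginal, where alpha_0 is the sum of the parameters, and that the covariance between parts i and j is -m_i m_j / (alpha_0 + 1), where m_i is the mean of part i.

```python
def dirichlet_summary(alpha):
    """Closed-form means, beta marginals and covariances of Dirichlet(alpha)."""
    a0 = sum(alpha)  # total concentration, which acts as the shared precision
    k = len(alpha)
    means = [a / a0 for a in alpha]
    # Marginal of part i is Beta(alpha_i, a0 - alpha_i): its two parameters
    # always add up to a0, whatever the part.
    marginals = [(a, a0 - a) for a in alpha]
    # Cov(X_i, X_j) = m_i * (delta_ij - m_j) / (a0 + 1)
    cov = [
        [means[i] * ((1.0 if i == j else 0.0) - means[j]) / (a0 + 1.0)
         for j in range(k)]
        for i in range(k)
    ]
    return means, marginals, cov


# Hypothetical parameter vector for a three-part composition.
means, marginals, cov = dirichlet_summary([2.0, 3.0, 5.0])
print("beta precisions:", [a + b for a, b in marginals])
print("off-diagonal covariances:", cov[0][1], cov[0][2], cov[1][2])
```

Every marginal precision equals alpha_0 and every off-diagonal covariance is negative, whatever positive parameter vector is supplied.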

Probability-Ratio Method: Basics
Rather than taking logs of relative odds, we take the corresponding relative probabilities and model them. Turning once again to our example with the $K$th category as the base, the relevant probability ratios are

$$\nu_{ki}^{(j_i)} = \frac{\pi_{ki}^{(j_i)}}{\pi_{ki}^{(j_i)} + \pi_{Ki}^{(j_i)}}, \tag{2}$$

for $k = 1, \dots, K - 1$. The $\nu_{ki}^{(j_i)}$ are random variables in the unit hypercube, and the marginals may be modeled by any distribution whose support is the unit interval (0,1). The dependency structure may be modeled using copulas.
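A minimal sketch of Eq. (2) for a single point on the simplex, with the last part as the base, is given below. The inverse mapping is not stated on the slide. It is derived here from the identity pi_k / pi_K = nu_k / (1 - nu_k) together with the sum-to-one constraint, and is included only to show that the composition can be recovered. The function names and the example composition are assumptions made for this illustration.

```python
def probability_ratios(composition):
    """nu_k = pi_k / (pi_k + pi_K) for every non-base part k (base = last part)."""
    if abs(sum(composition) - 1.0) > 1e-9:
        raise ValueError("parts must sum to 1")
    base = composition[-1]
    ratios = []
    for part in composition[:-1]:
        if part + base == 0.0:
            raise ValueError("ratio undefined when part and base are both zero")
        ratios.append(part / (part + base))
    return ratios


def composition_from_ratios(ratios):
    """Recover the composition from its probability ratios (base = last part)."""
    if any(v < 0.0 or v >= 1.0 for v in ratios):
        raise ValueError("each ratio must lie in [0, 1)")
    relative = [v / (1.0 - v) for v in ratios]  # equals pi_k / pi_K
    base = 1.0 / (1.0 + sum(relative))          # from the sum-to-one constraint
    return [r * base for r in relative] + [base]


# Hypothetical three-part composition.
nu = probability_ratios([0.2, 0.3, 0.5])
print(nu)
print(composition_from_ratios(nu))
```

Each ratio lies in the unit interval, which is why the slide can propose beta or other unit-interval distributions for the marginals.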

Probability-Ratio Method: Advantages
The advantages of the probability-ratio method are:
- The probability-ratio approach includes the logistic-normal distribution but also other more flexible two-parameter distributions such as the beta and cdf-quantile family.
- Unlike the Dirichlet model, each marginal distribution can have a unique precision parameter, which makes it possible to model multivariate heteroskedasticity.
- Modeling dispersion is naturally done in the probability-ratio framework.
- It possesses both permutation invariance and subcompositional coherence.
- Zeros can be dealt with via hurdle models.
- The use of copulas enables flexible modeling of dependency structures, separately from the marginal structures.
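The subcompositional coherence claim can be checked numerically at the level of the transform: dropping a part and renormalizing the rest leaves the probability ratios of the remaining parts unchanged, because the normalizing constant cancels in pi_k / (pi_k + pi_K). The sketch below checks this for one hypothetical four-part composition. It is a single numeric example under assumed names and values, not a proof and not the author's code.

```python
def probability_ratios(composition):
    """nu_k = pi_k / (pi_k + pi_K), taking the last part as the base."""
    base = composition[-1]
    return [part / (part + base) for part in composition[:-1]]


def subcomposition(composition, keep):
    """Keep the parts at the given indices and renormalize them to sum to 1."""
    kept = [composition[i] for i in keep]
    total = sum(kept)
    return [p / total for p in kept]


# Hypothetical four-part composition; the subcomposition drops the third part.
full = [0.1, 0.2, 0.3, 0.4]
sub = subcomposition(full, [0, 1, 3])

full_ratios = probability_ratios(full)  # ratios for parts 1, 2, 3 against part 4
sub_ratios = probability_ratios(sub)    # ratios for parts 1, 2 against part 4

# The ratios for the two retained non-base parts agree up to rounding error.
print(full_ratios[0], sub_ratios[0])
print(full_ratios[1], sub_ratios[1])
```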

Conclusions
- A new "probability ratio" approach to modeling compositional data has been proposed that can complement the well-established log-ratio approach.
- Both of these provide an alternative to Dirichlet models for imprecise compositional data.
- Much remains to be done in evaluating their merits, for instance their relative sensitivities to noise or other sources of imprecision.
- The probability ratio approach shows promise in overcoming some of the limitations of the other two approaches.

The End
Thanks!
