The Power STATIS-ACT method

Correspondence Analysis and Related Methods - CARME 2011
The Power STATIS-ACT method
Jacques Bénasséni, Mohammed Bennani Dosse
Université Rennes 2, UMR 6620
February 2011, Rennes, France

Contents
1. Data
2. STATIS-ACT method
3. An s-power based criterion
4. The special case s = 1
5. The general case s > 1
6. Low rank compromises
7. Comparisons

Data
◮ $K$ data matrices $X_1, \dots, X_K$.
◮ Each $X_k$ is an $n \times p_k$ matrix: measurements of the same $n$ units on $p_k$ different variables.
◮ $D = \mathrm{diag}(\pi_1, \dots, \pi_n)$: diagonal matrix of weights.
◮ $Q_k$: positive definite matrices (metrics).
◮ We assume that $D = I_n$, $Q_k = I_{p_k}$ and that each $X_k$ is centered.

STATIS-ACT method
◮ The STATIS-ACT method is a generalization of principal component analysis used to study several data tables measured on the same observation units (or variables).
◮ The goal of the method is to analyze the relationships between these data tables (interstructure step) and to combine them into a common structure, called a compromise.
◮ The principal components derived from the compromise are analyzed together with the original variables (intrastructure step).

STATIS-ACT method
◮ In STATIS, the individual association matrices $W_k = X_k X_k'$, $k = 1, \dots, K$, play a central role.
◮ $W_k$ contains all the information about the multidimensional structure in the data matrix $X_k$.
◮ Using the association matrices $W_k$ instead of the $X_k$ simplifies computations, since it obviates the determination of rotations required in generalised Procrustes analysis (GPA, Gower 1975).
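As an illustration of the association matrices, here is a minimal NumPy sketch under the assumptions stated on the Data slide ($D = I_n$, $Q_k = I_{p_k}$, centered tables). The function name `association_matrices` and the random example tables are hypothetical, not part of the talk. The last lines illustrate why no rotation has to be determined: rotating the variables of a table leaves its $W_k$ unchanged.

```python
import numpy as np

def association_matrices(tables):
    """Return W_k = X_k X_k' for each table, after centering its columns.

    Assumes D = I_n and Q_k = I_{p_k}, as on the Data slide.
    """
    mats = []
    for X in tables:
        X = np.asarray(X, dtype=float)
        Xc = X - X.mean(axis=0)      # center each variable
        mats.append(Xc @ Xc.T)       # n x n, whatever the value of p_k
    return mats

# Hypothetical example: K = 3 tables on the same n = 6 units.
rng = np.random.default_rng(0)
tables = [rng.normal(size=(6, p)) for p in (3, 5, 2)]
W_list = association_matrices(tables)

# Rotating the variables of the first table does not change W_1.
Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))   # random orthogonal matrix
W_rotated = association_matrices([tables[0] @ Q])[0]
```

Each $W_k$ is $n \times n$, so tables with different numbers of variables become directly comparable.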

STATIS-ACT method
◮ Basic idea of STATIS: derive an optimal set of weights $\alpha_k$ for computing a compromise solution
$$\widetilde{W} = \sum_{k=1}^{K} \alpha_k W_k$$
◮ where the $\alpha_k$, $k = 1, \dots, K$, maximize the criterion
$$\sum_{k=1}^{K} \left( \mathrm{trace}(\widetilde{W} W_k) \right)^2$$
◮ subject to the constraints $\alpha_k \geq 0$ and
$$\sum_{k=1}^{K} \alpha_k^2 = 1.$$

STATIS-ACT method
◮ Define the matrix $C = (C_{k\ell})$ where
$$C_{k\ell} = \mathrm{trace}(W_k W_\ell) = \sum_{i=1}^{p_k} \sum_{j=1}^{p_\ell} \left( \mathrm{cov}(x_{ki}, x_{\ell j}) \right)^2$$
◮ Solution of STATIS: the vector $a = (\alpha_1, \dots, \alpha_K)'$ is the eigenvector of $C$ corresponding to its largest eigenvalue.
◮ Since $C \geq 0$ (all its entries are nonnegative), the vector $a$ can be chosen with all its elements nonnegative (Perron-Frobenius theorem).
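The classical solution can be sketched in a few lines of NumPy. This is only an illustration: the function name `statis_weights` and the random tables are hypothetical, and `numpy.linalg.eigh` is used here simply because $C$ is symmetric.

```python
import numpy as np

def statis_weights(W_list):
    """Return (C, a): C[k, l] = trace(W_k W_l), a = dominant unit eigenvector of C."""
    C = np.array([[np.trace(Wk @ Wl) for Wl in W_list] for Wk in W_list])
    eigvals, eigvecs = np.linalg.eigh(C)   # eigenvalues in ascending order
    a = eigvecs[:, -1]                     # eigenvector of the largest eigenvalue
    if a.sum() < 0:                        # fix the sign so that a >= 0
        a = -a
    return C, a

# Hypothetical example: K = 3 centered tables on n = 6 units.
rng = np.random.default_rng(0)
tables = [rng.normal(size=(6, p)) for p in (3, 5, 2)]
W_list = []
for X in tables:
    Xc = X - X.mean(axis=0)
    W_list.append(Xc @ Xc.T)

C, a = statis_weights(W_list)
# Compromise: weighted sum of the association matrices.
W_compromise = sum(alpha * Wk for alpha, Wk in zip(a, W_list))
```

The sign flip is needed because an eigenvector is only defined up to sign; the Perron-Frobenius argument on the slide guarantees that one of the two signs gives nonnegative weights.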

An s-power criterion
◮ Since $\mathrm{trace}(\widetilde{W} W_k) \geq 0$, there is no specific reason for considering in STATIS-ACT the criterion
$$\sum_{k=1}^{K} \left( \mathrm{trace}(\widetilde{W} W_k) \right)^2$$
rather than any other criterion based on a power $s$ ($s \geq 1$):
$$\sum_{k=1}^{K} \left( \mathrm{trace}(\widetilde{W} W_k) \right)^s$$
◮ We investigate the effect of varying the power $s$ on the optimal weights in the compromise solution.

The special case s = 1
◮ The simplest choice for $s$.
◮ Simple solution and straightforward interpretation.
◮ It is possible to find a solution to "the problem of the low rank compromise" when considering a criterion based on $s = 1$.
◮ Close analogy between the compromise obtained from the $s = 1$ criterion and the first principal component derived from a PCA.

The special case s = 1
◮ Maximize the criterion
$$\sum_{k=1}^{K} \mathrm{trace}(\widetilde{W} W_k)$$
subject to the constraints $\alpha_k \geq 0$ and
$$\sum_{k=1}^{K} \alpha_k^2 = 1.$$
◮ The solution is given by
$$a = \frac{Ce}{\|Ce\|}, \quad \text{where } e = (1, \dots, 1)'.$$
◮ Clearly $a \geq 0$.
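A short numerical check of the $s = 1$ solution. For $s = 1$ the criterion equals $e'Ca = (Ce)'a$ because $C$ is symmetric, so by the Cauchy-Schwarz inequality it is maximized over unit vectors at $a = Ce / \|Ce\|$, with maximum value $\|Ce\|$. The matrix `C` below is a hypothetical example, not data from the talk.

```python
import numpy as np

# Hypothetical symmetric matrix C with nonnegative entries (illustration only).
C = np.array([[4.0, 2.0, 1.0],
              [2.0, 3.0, 1.0],
              [1.0, 1.0, 2.0]])

e = np.ones(C.shape[0])
a = C @ e / np.linalg.norm(C @ e)   # closed-form weights for s = 1

def criterion_s1(C, a):
    """s = 1 criterion: sum over k of (C a)_k."""
    return float(np.sum(C @ a))

# Compare with random feasible weight vectors (nonnegative, unit norm).
rng = np.random.default_rng(1)
B = np.abs(rng.normal(size=(1000, C.shape[0])))
B /= np.linalg.norm(B, axis=1, keepdims=True)
best_random = max(criterion_s1(C, b) for b in B)
```

Only row sums and one normalization are needed, which is the sense in which the $s = 1$ compromise requires elementary operations only.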

The special case s = 1
◮ What happens if the constraint on the $\alpha_k$ is changed to $\sum_{k=1}^{K} \alpha_k = 1$?
◮ This is a standard linear program whose constraint set is a polyhedron (the simplex).
◮ The optimal compromise solution is one of the initial matrices $W_k$!
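To see why the simplex constraint selects a single table, note that the $s = 1$ criterion is linear in the weights: it equals $\sum_{\ell} \alpha_\ell (Ce)_\ell$. A linear function on the simplex attains its maximum at a vertex, so all the weight goes to the table with the largest $(Ce)_\ell$. The sketch below checks this on a hypothetical matrix `C` (not data from the talk).

```python
import numpy as np

# Hypothetical symmetric matrix C with nonnegative entries (illustration only).
C = np.array([[4.0, 2.0, 1.0],
              [2.0, 3.0, 1.0],
              [1.0, 1.0, 2.0]])

gain = C.sum(axis=0)               # (Ce)_l: coefficient of alpha_l in the criterion
best_k = int(np.argmax(gain))      # vertex of the simplex that maximizes it

alpha = np.zeros(C.shape[0])
alpha[best_k] = 1.0                # compromise reduces to the single matrix W_{best_k}

# Random points of the simplex never beat the best vertex.
rng = np.random.default_rng(2)
samples = rng.dirichlet(np.ones(C.shape[0]), size=1000)
best_random = float((samples @ gain).max())
```

With this example the first table is selected (`best_k` is 0), since its row sum is the largest.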

The general case s > 1
◮ Motivation: one of the most interesting features of the classical STATIS-ACT method, based on the power $s = 2$, is that the corresponding weights $\alpha_k$ represent the principal agreement between the given tables.
◮ A data table which is not in agreement with the others has a low weight.
◮ What happens if $s > 1$?

The general case s > 1
◮ Maximizing
$$\sum_{k=1}^{K} \left( \mathrm{trace}(\widetilde{W} W_k) \right)^s$$
◮ subject to $\sum_{k=1}^{K} \alpha_k^2 = 1$ and $\alpha_k \geq 0$
◮ is equivalent to maximizing
$$f(a) = \sum_{k=1}^{K} \left( (Ca)_k \right)^s$$
subject to $a'a = 1$.
◮ $f$ is a convex and differentiable function on $\mathbb{R}_+^K$.
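The step behind this equivalence can be written out explicitly. It only uses the definition of the compromise as a weighted sum of the $W_\ell$ and the definition $C_{k\ell} = \mathrm{trace}(W_k W_\ell)$; the symbol $\widetilde{W}$ for the compromise is the notation assumed in this transcript.

```latex
\mathrm{trace}(\widetilde{W} W_k)
  = \mathrm{trace}\Big( \sum_{\ell=1}^{K} \alpha_\ell W_\ell W_k \Big)
  = \sum_{\ell=1}^{K} \alpha_\ell \, \mathrm{trace}(W_\ell W_k)
  = \sum_{\ell=1}^{K} C_{k\ell} \, \alpha_\ell
  = (Ca)_k ,
\qquad\text{hence}\qquad
\sum_{k=1}^{K} \big( \mathrm{trace}(\widetilde{W} W_k) \big)^s
  = \sum_{k=1}^{K} \big( (Ca)_k \big)^s = f(a).
```

Differentiating term by term and using the symmetry of $C$ gives $\nabla f(a) = s\,Cz$ with $z_k = ((Ca)_k)^{s-1}$.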

The general case s > 1
◮ Iterative solution.
◮ Algorithm:
  ◮ Choose $a^{(0)}$ randomly such that $\|a^{(0)}\| = 1$, and set $\nu = 0$.
  ◮ Repeat until convergence:
    ◮ Calculate $z = (z_1, \dots, z_K)'$ where $z_k = \left( (C a^{(\nu)})_k \right)^{s-1}$.
    ◮ Set $a^{(\nu+1)} = Cz / \|Cz\|$.
    ◮ $\nu := \nu + 1$.
  ◮ End.
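The iteration above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `power_statis_weights`, the stopping rule (`tol`, `max_iter`), the `seed` argument and the choice of a nonnegative random start are all assumptions, and the matrix `C` is a hypothetical example.

```python
import numpy as np

def power_statis_weights(C, s, tol=1e-12, max_iter=1000, seed=0):
    """Iterate a <- C z / ||C z|| with z_k = ((C a)_k)^(s - 1).

    tol, max_iter and seed are illustrative choices, not from the talk.
    """
    C = np.asarray(C, dtype=float)
    rng = np.random.default_rng(seed)
    a = rng.random(C.shape[0])            # nonnegative random start (assumption)
    a /= np.linalg.norm(a)
    for _ in range(max_iter):
        z = (C @ a) ** (s - 1)            # elementwise power
        a_new = C @ z
        a_new /= np.linalg.norm(a_new)
        converged = np.linalg.norm(a_new - a) < tol
        a = a_new
        if converged:
            break
    return a

# Hypothetical symmetric matrix C with nonnegative entries (illustration only).
C = np.array([[4.0, 2.0, 1.0],
              [2.0, 3.0, 1.0],
              [1.0, 1.0, 2.0]])

a1 = power_statis_weights(C, s=1)   # z = e, so a = Ce / ||Ce|| after one step
a2 = power_statis_weights(C, s=2)   # a <- C(Ca) / ||C(Ca)||: dominant eigenvector of C
a5 = power_statis_weights(C, s=5)
```

Since the result may depend on the starting vector when a global maximum is not guaranteed, running the function with several values of `seed` is one simple way to do a multistart.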

The general case s > 1
◮ We prove monotone convergence.
◮ When $s = 2$, this algorithm is simply the power method used in the numerical calculation of the dominant eigenvector of $C$.
◮ Convergence to a global maximum is not necessarily guaranteed (multistart, ...).
◮ Algebraic solution when $s$ tends to infinity.

Low rank compromises
◮ The configuration of observations given by the compromise solution is derived from principal components, which are the eigenvectors of $\widetilde{W}$.
◮ In practice, interest mainly focuses on graphical representations based on the first $R$ principal components (with $R = 2$ in most situations).
◮ However, even if $\widetilde{W}$ maximizes the criterion of interest (with general power $s$), its rank $R$ approximation no longer does.
◮ For the $s = 1$ case, we can derive an algebraic solution to the low rank compromise of the form $\sum_{\ell=1}^{R} u_\ell u_\ell'$ with $u_\ell' u_j = 0$ for $\ell \neq j$.
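For reference, the usual rank $R$ truncation of a compromise matrix can be sketched as below. This is the standard eigendecomposition-based approximation that the slide discusses, not the algebraic $s = 1$ solution of the talk; the function name `low_rank_compromise`, the random tables and the equal weights are hypothetical.

```python
import numpy as np

def low_rank_compromise(W, R=2):
    """Return (coords, W_R): coordinates on the first R components and
    the rank-R approximation W_R = coords @ coords.T of a symmetric PSD W."""
    vals, vecs = np.linalg.eigh(W)              # ascending eigenvalues
    order = np.argsort(vals)[::-1][:R]          # indices of the R largest
    lam = np.clip(vals[order], 0.0, None)       # guard against tiny negatives
    coords = vecs[:, order] * np.sqrt(lam)      # n x R configuration of the units
    return coords, coords @ coords.T

# Hypothetical compromise built from K = 3 centered tables on n = 6 units,
# with equal weights chosen only for the illustration.
rng = np.random.default_rng(0)
W = np.zeros((6, 6))
for p in (3, 5, 2):
    X = rng.normal(size=(6, p))
    Xc = X - X.mean(axis=0)
    W += (1 / np.sqrt(3)) * (Xc @ Xc.T)

coords, W_R = low_rank_compromise(W, R=2)
```

The rows of `coords` are the points that would be plotted in a two-dimensional display of the units.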

Applications
◮ Real data sets (from sensory analysis, ecology).
◮ Comparison of the weights:

Data set   EN(1)   EN(5)   EN(∞)   IN(1)   IN(5)   IN(∞)
1          0.002   0.006   0.074   0.002   0.005   0.049
2          0.026   0.053   0.084   0.023   0.049   0.087
3          0.008   0.026   0.133   0.005   0.017   0.081
4          0.015   0.054   0.238   0.012   0.046   0.167
5          0.087   0.170   0.292   0.052   0.079   0.139
6          0.024   0.045   0.182   0.015   0.025   0.111
7          0.212   0.302   0.474   0.185   0.235   0.284

Table: Comparison of $a^{(2)}$ and $a^{(s)}$ for $s = 1, 5, \infty$, where $EN(s) = \|a^{(2)} - a^{(s)}\|_2$ and $IN(s) = \|a^{(2)} - a^{(s)}\|_\infty$.
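The two distances used in the table are the Euclidean norm and the maximum norm of the difference between weight vectors. A minimal sketch, where the function name `en_in` and the two example vectors are hypothetical and unrelated to the data sets above:

```python
import numpy as np

def en_in(a_ref, a_s):
    """Return (EN, IN): 2-norm and infinity-norm of a_ref - a_s."""
    d = np.asarray(a_ref, dtype=float) - np.asarray(a_s, dtype=float)
    return float(np.linalg.norm(d, 2)), float(np.linalg.norm(d, np.inf))

# Hypothetical unit-norm weight vectors (illustration only).
a_ref = np.array([0.6, 0.8])
a_s = np.array([0.8, 0.6])
EN, IN = en_in(a_ref, a_s)
```

EN measures the overall discrepancy between the two sets of weights, while IN reports the largest change in any single weight.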

Applications
◮ Monte-Carlo simulations.
◮ Comparison of the weights $a^{(s)}$:

                     s = 1     s = 2     s = 5     s = ∞
$\alpha_1^{(s)}$     0.489     0.386     0.239     0.223
                     (0.031)   (0.092)   (0.124)   (0.108)
$\alpha_2^{(s)}$     0.622     0.654     0.684     0.693
                     (0.023)   (0.034)   (0.033)   (0.043)
$\alpha_3^{(s)}$     0.610     0.643     0.676     0.674
                     (0.024)   (0.036)   (0.036)   (0.046)

Table: Comparison of $a^{(s)}$ for $s = 1, 2, 5, \infty$.

Conclusions
◮ The weights attached to the compromise solution for $s = 1$ are in general fairly close to those obtained in the usual STATIS-ACT method.
◮ The compromise obtained with the $s = 1$ approach requires only elementary operations, whereas the usual compromise needs the calculation of the dominant eigenvector of $C$.
◮ The power parameter $s$ is related to the robustness of the compromise solution.
◮ When there are some "outlying" matrices $W_k$, increasing the power parameter $s$ in the generalized criterion downweights the influence of these matrices on the compromise, thus enhancing the well-known "majority effect" of the STATIS method.

References
◮ Bénasséni, J. & Bennani Dosse, M., 2010. Analyzing multiset data by the Power STATIS-ACT method. Advances in Data Analysis and Classification, to appear.
◮ Lavit, C., Escoufier, Y., Sabatier, R. & Traissac, P., 1994. The ACT (STATIS method). Computational Statistics & Data Analysis, 18, 97-117.
◮ Lavit, C., 1985. Application de la méthode STATIS. Statistique et Analyse des données, 10(1), 103-116.
◮ Gower, J.C., 1975. Generalised Procrustes Analysis. Psychometrika, 40, 33-51.

Thank you for your attention.
