Analysis of sorting data using multiple correspondence analysis and a related method
E.M. Qannari, Ph. Courcoux, V. Cariou
ONIRIS, Nantes, F-44322, France

Sorting data: Procedure
n stimuli evaluated by m subjects: “Please, sort the stimuli in as many groups as you consider necessary with the understanding that stimuli in the same group are perceived as similar.”
[Figure: example groupings formed by Subject 1, Subject 2, …, Subject m; labels shown in the figure: Acid, Salty, Fresh, Sweet, Bitter.]

General setting and notations
• n stimuli (rows) and m subjects.
• Subject j (j = 1, …, m) forms $K_j$ groups, coded by an $n \times K_j$ matrix $X_j$ of group indicators.
• This gives m categorical variables (represented by their indicator variables) $X_1, X_2, \ldots, X_j, \ldots, X_m$.

Beer data
Data from Abdi H., Chollet S., Valentin D. and Chréa C. (2007). Analysing assessors and products in sorting tasks: DISTATIS, theory and applications. Food Quality and Preference.

Data from Abdi et al. (2007)
• The data relate to an experiment where ten consumers were instructed to sort eight commercial beers.

#  Beer             Subj1  Subj2  Subj3  Subj4  Subj5  Subj6  Subj7  Subj8  Subj9  Subj10
1  Affligen         1      4      3      4      1      1      2      2      1      3
2  Budweiser        4      5      2      5      2      3      1      1      4      3
3  BucklerBlonde    3      1      2      3      2      4      3      1      1      2
4  Killian          4      2      3      3      1      1      1      2      1      4
5  StLandelin       1      5      3      5      2      1      1      2      1      3
6  BucklerHighland  2      3      1      1      3      5      4      4      3      1
7  FruitDefendu     1      4      3      4      1      1      2      2      2      4
8  EKU28            5      2      4      2      4      2      5      3      4      5
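
As an illustration of the notation, the beer table can be turned into the indicator matrix X = [X_1, …, X_m]. This is a minimal sketch in plain Python; the names `SORTS`, `indicator_block` and `indicator_matrix` are hypothetical and not part of the presentation, and the indicators are left uncentered here.

```python
# Group labels from the beer table: one row per beer, one column per subject.
SORTS = [
    [1, 4, 3, 4, 1, 1, 2, 2, 1, 3],  # Affligen
    [4, 5, 2, 5, 2, 3, 1, 1, 4, 3],  # Budweiser
    [3, 1, 2, 3, 2, 4, 3, 1, 1, 2],  # BucklerBlonde
    [4, 2, 3, 3, 1, 1, 1, 2, 1, 4],  # Killian
    [1, 5, 3, 5, 2, 1, 1, 2, 1, 3],  # StLandelin
    [2, 3, 1, 1, 3, 5, 4, 4, 3, 1],  # BucklerHighland
    [1, 4, 3, 4, 1, 1, 2, 2, 2, 4],  # FruitDefendu
    [5, 2, 4, 2, 4, 2, 5, 3, 4, 5],  # EKU28
]


def indicator_block(labels):
    """Return the n x K_j 0/1 indicator matrix for one subject's partition."""
    groups = sorted(set(labels))
    return [[1 if label == g else 0 for g in groups] for label in labels]


def indicator_matrix(sorts):
    """Concatenate the blocks X_1, ..., X_m column-wise into X (uncentered)."""
    m = len(sorts[0])
    blocks = [indicator_block([row[j] for row in sorts]) for j in range(m)]
    # Row i of X is row i of every block, joined end to end.
    X = [sum((block[i] for block in blocks), []) for i in range(len(sorts))]
    return X, [len(block[0]) for block in blocks]


X, group_counts = indicator_matrix(SORTS)
```

Each row of `X` contains exactly one 1 per subject, and `group_counts` holds the number of groups K_j used by each subject.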

Discrimination indices and MCA
• Given a quantitative variable z and a categorical variable $X_j$, the discrimination index $\eta^2(z/j)$ is the between-groups to total variance ratio associated with z and $X_j$.
• We seek z so as to maximize:
$$I(z) = \sum_{j=1}^{m} \eta^2(z/j)$$
• It is known that this problem leads to MCA.
• Subsequent z variables (factors) are sought following the same strategy, under orthogonality constraints.

Standardized MCA
• Alternatively:
$$I(z) = \sum_{j=1}^{m} \frac{1}{K_j}\, \eta^2(z/j)$$
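
The two criteria above can be evaluated for any candidate score vector z. The sketch below is only an illustration in plain Python: the function names and the toy data are assumptions, and it computes the criterion for a given z rather than searching for the z that maximizes it.

```python
def discrimination_index(z, labels):
    """Between-groups to total sum-of-squares ratio of z for one partition.

    z must not be constant, otherwise the total sum of squares is zero.
    """
    n = len(z)
    mean = sum(z) / n
    total = sum((value - mean) ** 2 for value in z)
    between = 0.0
    for g in set(labels):
        members = [value for value, label in zip(z, labels) if label == g]
        between += len(members) * (sum(members) / len(members) - mean) ** 2
    return between / total


def mca_criterion(z, sorts, standardized=False):
    """Sum of discrimination indices over subjects.

    With standardized=True each index is weighted by 1 / K_j,
    where K_j is the number of groups formed by subject j.
    """
    m = len(sorts[0])
    total = 0.0
    for j in range(m):
        labels = [row[j] for row in sorts]
        weight = 1.0 / len(set(labels)) if standardized else 1.0
        total += weight * discrimination_index(z, labels)
    return total


# Toy example (hypothetical): four stimuli sorted by two subjects.
toy_sorts = [[1, 1], [1, 2], [2, 1], [2, 2]]
toy_z = [0.0, 0.0, 1.0, 1.0]
plain = mca_criterion(toy_z, toy_sorts)
weighted = mca_criterion(toy_z, toy_sorts, standardized=True)
```

In the toy example z separates the first subject's two groups perfectly and carries no information about the second subject's groups, so the two discrimination indices are 1 and 0.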

MCA applied to beer data
[Figure: representation of the beers on axes 1 & 2 and on axes 3 & 4.]

Alternative method: maximizing the between groups variances
• $X = [X_1, X_2, \ldots, X_m]$ (the indicator variables are supposed to be centered).
• Let $z = Xu$ and denote by $B(z/j)$ the between groups variance of z with respect to $X_j$.
• We define the total between groups variance as:
$$B(z) = \sum_{j=1}^{m} B(z/j)$$

An alternative method to MCA
• We can show that the vector of loadings u is an eigenvector (associated with the largest eigenvalue) of the matrix:
$$X^T \left[ \sum_{j=1}^{m} X_j (X_j^T X_j)^{-1} X_j^T \right] X = X^T P X, \quad \text{with } P = \sum_{j=1}^{m} X_j (X_j^T X_j)^{-1} X_j^T$$
• Subsequent z variables can be sought following the same strategy, under orthogonality constraints.
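
A possible NumPy sketch of this eigenproblem on the beer data follows. It rests on assumptions not stated on the slide: u is normalized to unit length, and because the Gram matrix of centered indicators is singular, a pseudo-inverse stands in for the inverse in P. Function names are hypothetical.

```python
import numpy as np

# Beer sorting data from this deck (rows: beers, columns: subjects).
sorts = np.array([
    [1, 4, 3, 4, 1, 1, 2, 2, 1, 3],  # Affligen
    [4, 5, 2, 5, 2, 3, 1, 1, 4, 3],  # Budweiser
    [3, 1, 2, 3, 2, 4, 3, 1, 1, 2],  # BucklerBlonde
    [4, 2, 3, 3, 1, 1, 1, 2, 1, 4],  # Killian
    [1, 5, 3, 5, 2, 1, 1, 2, 1, 3],  # StLandelin
    [2, 3, 1, 1, 3, 5, 4, 4, 3, 1],  # BucklerHighland
    [1, 4, 3, 4, 1, 1, 2, 2, 2, 4],  # FruitDefendu
    [5, 2, 4, 2, 4, 2, 5, 3, 4, 5],  # EKU28
])


def centered_indicator_blocks(sorts):
    """One column-centered indicator matrix X_j per subject."""
    blocks = []
    for labels in sorts.T:
        groups = np.unique(labels)
        Xj = (labels[:, None] == groups[None, :]).astype(float)
        blocks.append(Xj - Xj.mean(axis=0))
    return blocks


def projector_sum(blocks):
    """P = sum_j X_j (X_j^T X_j)^+ X_j^T, using a pseudo-inverse (assumption)."""
    # Second positional argument is the singular-value cutoff (rcond).
    return sum(Xj @ np.linalg.pinv(Xj.T @ Xj, 1e-10) @ Xj.T for Xj in blocks)


def alternative_method(sorts):
    """First component z = X u, with u the top unit eigenvector of X^T P X."""
    blocks = centered_indicator_blocks(sorts)
    X = np.hstack(blocks)
    P = projector_sum(blocks)
    eigvals, eigvecs = np.linalg.eigh(X.T @ P @ X)  # eigenvalues ascending
    u = eigvecs[:, -1]
    return X @ u, u, eigvals[-1]


z, u, top_eigenvalue = alternative_method(sorts)
```

With this normalization the largest eigenvalue equals the total between-groups sum of squares of z over the ten subjects, which is n times the total between groups variance B(z).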

The rationale behind the method of analysis
• In addition to investigating the relationships between the categorical variables, we take account of the variances of the indicator variables.
• VAR(Indicator) = p(1 − p), where p is the proportion of stimuli in the category.
[Figure: variance of an indicator variable, p(1 − p), plotted against p from 0 to 1; the curve peaks at 0.25 and falls toward 0 at both ends, which are annotated "Presence of rare categories".]
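
To make the variance formula concrete, the short sketch below computes p and p(1 − p) for each group formed by Subject 1 in the beer data. The helper name `category_variances` is hypothetical.

```python
from collections import Counter


def category_variances(labels):
    """Map each category to (p, p * (1 - p)), p being its relative frequency."""
    n = len(labels)
    return {
        g: (count / n, (count / n) * (1 - count / n))
        for g, count in sorted(Counter(labels).items())
    }


# Subject 1's partition of the eight beers, read from the data table.
subject1 = [1, 4, 3, 4, 1, 2, 1, 5]
variances = category_variances(subject1)
```

Here a group holding three of the eight beers has variance 0.234375, whereas a group holding a single beer has variance 0.109375, so rare categories contribute indicator variables with smaller variance.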

Alternative method applied to beer data
[Figure: representation of the beers on axes 1 & 2 and on axes 3 & 4.]

A continuum approach
• MCA: $z = Xu$ with u eigenvector of $(X^T X)^{-1} X^T P X$
• Alternative method: $z = Xu$ with u eigenvector of $X^T P X$
• Regularized MCA: $z = Xu$ with u eigenvector of $\left[(1-\lambda) X^T X + \lambda I\right]^{-1} X^T P X$
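
The regularized eigenproblem can be sketched in NumPy as a symmetric generalized eigenproblem solved through a Cholesky factor. This is an illustration under assumptions: it only covers 0 < λ ≤ 1 (at λ = 0 the matrix X^T X of centered indicators is singular and a generalized inverse would be needed), P is built with a pseudo-inverse, u is rescaled to unit length, and all function names are hypothetical.

```python
import numpy as np

# Beer sorting data from this deck (rows: beers, columns: subjects).
sorts = np.array([
    [1, 4, 3, 4, 1, 1, 2, 2, 1, 3],  # Affligen
    [4, 5, 2, 5, 2, 3, 1, 1, 4, 3],  # Budweiser
    [3, 1, 2, 3, 2, 4, 3, 1, 1, 2],  # BucklerBlonde
    [4, 2, 3, 3, 1, 1, 1, 2, 1, 4],  # Killian
    [1, 5, 3, 5, 2, 1, 1, 2, 1, 3],  # StLandelin
    [2, 3, 1, 1, 3, 5, 4, 4, 3, 1],  # BucklerHighland
    [1, 4, 3, 4, 1, 1, 2, 2, 2, 4],  # FruitDefendu
    [5, 2, 4, 2, 4, 2, 5, 3, 4, 5],  # EKU28
])


def design_matrices(sorts):
    """Return centered X = [X_1, ..., X_m] and P = sum_j X_j (X_j^T X_j)^+ X_j^T."""
    blocks = []
    for labels in sorts.T:
        groups = np.unique(labels)
        Xj = (labels[:, None] == groups[None, :]).astype(float)
        blocks.append(Xj - Xj.mean(axis=0))
    # Pseudo-inverse with an explicit singular-value cutoff (assumption).
    P = sum(Xj @ np.linalg.pinv(Xj.T @ Xj, 1e-10) @ Xj.T for Xj in blocks)
    return np.hstack(blocks), P


def regularized_mca(sorts, lam):
    """Top eigenvector u of [(1 - lam) X^T X + lam I]^{-1} X^T P X and z = X u."""
    if not 0.0 < lam <= 1.0:
        raise ValueError("this sketch needs 0 < lam <= 1")
    X, P = design_matrices(sorts)
    A = X.T @ P @ X
    B = (1.0 - lam) * (X.T @ X) + lam * np.eye(X.shape[1])
    L = np.linalg.cholesky(B)                        # B = L L^T
    C = np.linalg.solve(L, np.linalg.solve(L, A).T)  # L^{-1} A L^{-T}
    C = (C + C.T) / 2                                # remove rounding asymmetry
    eigvals, eigvecs = np.linalg.eigh(C)             # eigenvalues ascending
    u = np.linalg.solve(L.T, eigvecs[:, -1])         # back-transform to u
    u /= np.linalg.norm(u)
    return X @ u, u, eigvals[-1]


z, u, mu = regularized_mca(sorts, 0.95)
```

At λ = 1 the matrix B is the identity, so the function returns the first component of the alternative method.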

Continuum approach and ridge regularization
The eigenvectors of $\left[(1-\lambda) X^T X + \lambda I\right]^{-1} X^T P X$ are also eigenvectors of $\left[X^T X + kI\right]^{-1} X^T P X$ (ridge regularization), with $k = \dfrac{\lambda}{1-\lambda}$.
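
One way to see why the two matrices on this slide share their eigenvectors, assuming 0 ≤ λ < 1 so that the division is defined, is to factor (1 − λ) out of the regularized matrix:

```latex
\begin{aligned}
\left[(1-\lambda)\,X^{T}X + \lambda I\right]^{-1} X^{T}PX
  &= \left[(1-\lambda)\left(X^{T}X + \tfrac{\lambda}{1-\lambda}\, I\right)\right]^{-1} X^{T}PX \\
  &= \frac{1}{1-\lambda}\,\left[X^{T}X + kI\right]^{-1} X^{T}PX,
  \qquad k = \frac{\lambda}{1-\lambda}.
\end{aligned}
```

The two matrices differ only by the positive scalar 1/(1 − λ), so they have the same eigenvectors and their eigenvalues differ by that factor.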

RMCA (lambda = 0.95)
[Figure: representation of the products on axes 1 & 2 and on axes 3 & 4.]

Property 1 illustrated on beer data
• The variance of z increases with λ.
[Figure: variance of z plotted against λ, between the MCA and Alternative end points.]

Property 2 illustrated on beer data
• The between groups variance of z increases with λ.
[Figure: between groups variance of z plotted against λ, between the MCA and Alternative end points.]

Property 3 illustrated on beer data
• The discrimination index (between to total variance ratio) of z decreases with λ.
[Figure: discrimination index of z plotted against lambda from 0.0 to 1.0, between the MCA and Alternative end points.]

Conclusion
• Proposition of an alternative method that handles the problem of rare categories.
• Further research work is needed to investigate this alternative method.
• Proposition of a continuum approach whose end points are MCA and the alternative method.
• This approach enjoys interesting properties and can easily be extended to the framework of Generalized Canonical Correlation Analysis.
• See how it relates to Regularized MCA by Takane and Hwang.

Trugarez! (Thank you!)

Co-occurrence matrix

Beers  1   2   3   4   5   6   7   8
1      10  1   1   5   6   0   8   0
2      1   10  3   2   5   0   0   1
3      1   3   10  2   2   0   0   0
4      5   2   2   10  5   0   5   1
5      6   5   2   5   10  0   4   0
6      0   0   0   0   0   10  0   0
7      8   0   0   5   4   0   10  0
8      0   1   0   1   0   0   0   10
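
The co-occurrence matrix can be recomputed from the beer data by counting, for each pair of beers, the subjects who placed both in the same group. The sketch below is a plain Python illustration; the names `SORTS` and `cooccurrence` are hypothetical.

```python
# Group labels from the beer table: one row per beer, one column per subject.
SORTS = [
    [1, 4, 3, 4, 1, 1, 2, 2, 1, 3],  # Affligen
    [4, 5, 2, 5, 2, 3, 1, 1, 4, 3],  # Budweiser
    [3, 1, 2, 3, 2, 4, 3, 1, 1, 2],  # BucklerBlonde
    [4, 2, 3, 3, 1, 1, 1, 2, 1, 4],  # Killian
    [1, 5, 3, 5, 2, 1, 1, 2, 1, 3],  # StLandelin
    [2, 3, 1, 1, 3, 5, 4, 4, 3, 1],  # BucklerHighland
    [1, 4, 3, 4, 1, 1, 2, 2, 2, 4],  # FruitDefendu
    [5, 2, 4, 2, 4, 2, 5, 3, 4, 5],  # EKU28
]


def cooccurrence(sorts):
    """Entry (i, k): number of subjects who gave beers i and k the same label."""
    n = len(sorts)
    return [
        [sum(a == b for a, b in zip(sorts[i], sorts[k])) for k in range(n)]
        for i in range(n)
    ]


C = cooccurrence(SORTS)
```

The diagonal equals the number of subjects (10), and a row of zeros off the diagonal, as for beer 6, means that no subject grouped that beer with any other.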
