Dimension Reduction and High-Dimensional Data Estimation and - PowerPoint PPT Presentation

Dimension Reduction and High-Dimensional Data Estimation and Inference with Application to Genomics and Neuroimaging
Maxime Turgeon
April 9, 2019
McGill University, Department of Epidemiology, Biostatistics, and Occupational Health

Introduction
• Data revolution fueled by technological developments: the era of “big data”.
• In genomics and neuroimaging, high-throughput technologies lead to high-dimensional data.
• High costs lead to small-to-moderate sample sizes.
• More features than samples (large p, small n).

Omnibus Hypotheses and Dimension Reduction
• Traditionally, analysis is performed one feature at a time.
  • Large computational burden
  • Conservative tests and low power
  • Ignores correlation between features
• From a biological standpoint, there are natural groupings of measurements.
• Key: Summarise group-wise information using latent features
• Dimension Reduction

High-dimensional data–Estimation
• Several approaches use regularization
  • Zou et al. (2006): Sparse PCA
  • Witten et al. (2009): Penalized Matrix Decomposition
• Other approaches use structured estimators
  • Bickel & Levina (2008): Banded and thresholded covariance estimators
• All of these approaches require tuning parameters, which increases the computational burden.

High-dimensional data–Inference
• Double Wishart problem and largest root
• Distribution of the largest root is difficult to compute
• Several approximation strategies have been presented
  • Chiani found simple recursive equations, but they are computationally unstable
  • Result of Johnstone gives an excellent approximation
  • Does not work with high-dimensional data

Contribution of the thesis
In this thesis, I address the limitations outlined above.
• Block-independence leads to a simple approach free of tuning parameters
• Empirical estimator that extends Johnstone’s theorem to high-dimensional data
• Application of these ideas to a sequencing study of DNA methylation and ACPA levels

First Manuscript–Estimation

Principal Component of Explained Variance
Let $Y$ be a multivariate outcome of dimension $p$ and $X$ a vector of covariates. We assume a linear relationship:
$$Y = \beta^T X + \varepsilon.$$
The total variance of the outcome can then be decomposed as
$$\operatorname{Var}(Y) = \operatorname{Var}(\beta^T X) + \operatorname{Var}(\varepsilon) = V_M + V_R.$$
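The decomposition holds when the covariates and the errors are uncorrelated, an assumption the slide leaves implicit. A short worked version, under that assumption, shows where the two terms come from:

$$
\begin{aligned}
\operatorname{Var}(Y) &= \operatorname{Var}(\beta^T X + \varepsilon) \\
&= \beta^T \operatorname{Var}(X)\,\beta + \operatorname{Cov}(\beta^T X, \varepsilon) + \operatorname{Cov}(\varepsilon, \beta^T X) + \operatorname{Var}(\varepsilon) \\
&= \underbrace{\beta^T \operatorname{Var}(X)\,\beta}_{V_M} + \underbrace{\operatorname{Var}(\varepsilon)}_{V_R},
\end{aligned}
$$

since both cross-covariance terms are zero when $\operatorname{Cov}(X, \varepsilon) = 0$.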

PCEV: Statistical Model
Decompose the total variance of $Y$ into:
1. Variance explained by the covariates ($V_M$);
2. Residual variance ($V_R$).

PCEV: Statistical Model
The PCEV framework seeks a linear combination $w^T Y$ such that the proportion of variance explained by $X$ is maximised:
$$R^2(w) = \frac{w^T V_M w}{w^T (V_M + V_R) w}.$$
The maximisation uses a combination of Lagrange multipliers and linear algebra.
Key observation: $R^2(w)$ measures the strength of the association.
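A minimal NumPy sketch of this maximisation, for the low-dimensional case where the number of samples exceeds the number of outcomes. The function name `pcev` and the least-squares estimates of $V_M$ and $V_R$ are assumptions made for illustration; this is not the interface of the R package mentioned in the talk.

```python
import numpy as np

def pcev(Y, X):
    """Return (w, r2): weights maximising R^2(w) and the maximal value.

    Y is an (n, p) outcome matrix and X an (n, q) covariate matrix.
    Assumes n - 1 >= p so that V_M + V_R is positive definite.
    """
    n = Y.shape[0]
    Yc = Y - Y.mean(axis=0)
    Xc = X - X.mean(axis=0)
    # Least-squares fit of the linear model Y = beta^T X + eps.
    beta, *_ = np.linalg.lstsq(Xc, Yc, rcond=None)
    fitted = Xc @ beta
    resid = Yc - fitted
    V_M = fitted.T @ fitted / n  # variance explained by the covariates
    V_R = resid.T @ resid / n    # residual variance
    # Whiten with the Cholesky factor of V_M + V_R so that the generalised
    # problem V_M w = lambda (V_M + V_R) w becomes a symmetric eigenproblem.
    L_inv = np.linalg.inv(np.linalg.cholesky(V_M + V_R))
    evals, evecs = np.linalg.eigh(L_inv @ V_M @ L_inv.T)
    # eigh sorts eigenvalues in ascending order; the last one is the maximum.
    w = L_inv.T @ evecs[:, -1]
    return w, evals[-1]
```

The returned `r2` equals $R^2(w)$ at the optimum, so it can be read as the strength of the association between the component $w^T Y$ and the covariates.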

Block-diagonal Estimator
I propose a block approach to the computation of PCEV in the presence of high-dimensional outcomes.
• Suppose the outcome variables $Y$ can be divided into blocks of variables in such a way that
  • Variables within blocks are correlated
  • Variables in different blocks are uncorrelated
$$\operatorname{Cov}(Y) = \begin{pmatrix} * & 0 & 0 \\ 0 & * & 0 \\ 0 & 0 & * \end{pmatrix}$$

Block-diagonal Estimator
• We can perform PCEV on each of these blocks, resulting in a component for each block.
• Treating all these “partial” PCEVs as a new, multivariate pseudo-outcome, we can perform PCEV again; the result is a linear combination of the original outcome variables.
• Mathematically equivalent to performing PCEV in a single step (under the block-independence assumption).
• An extensive simulation study shows good power and robustness of inference to violations of the assumption.
• Applications to genomics and neuroimaging data are presented.
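The two-stage procedure can be sketched as follows. This is an illustration under stated assumptions, not the author's implementation: the function names, the `blocks` argument (a list of column-index arrays partitioning the outcomes), and the least-squares variance estimates are all hypothetical. Each block, and the number of blocks, must be smaller than the sample size, but the total number of outcomes may exceed it.

```python
import numpy as np

def _variance_parts(Y, X):
    """Return (V_M, V_T): explained and total variance of Y given X."""
    n = Y.shape[0]
    Yc = Y - Y.mean(axis=0)
    Xc = X - X.mean(axis=0)
    beta, *_ = np.linalg.lstsq(Xc, Yc, rcond=None)
    fitted = Xc @ beta
    return fitted.T @ fitted / n, Yc.T @ Yc / n

def _top_component(V_M, V_T):
    """Weights maximising w' V_M w / w' V_T w (V_T positive definite)."""
    L_inv = np.linalg.inv(np.linalg.cholesky(V_T))
    evals, evecs = np.linalg.eigh(L_inv @ V_M @ L_inv.T)
    return L_inv.T @ evecs[:, -1], evals[-1]

def pcev_block(Y, X, blocks):
    """Two-stage PCEV; returns weights on the original outcomes and R^2."""
    n, p = Y.shape
    block_weights = []
    partial = np.empty((n, len(blocks)))
    # Stage 1: PCEV within each block gives one partial component per block.
    for k, idx in enumerate(blocks):
        w_k, _ = _top_component(*_variance_parts(Y[:, idx], X))
        block_weights.append(w_k)
        partial[:, k] = Y[:, idx] @ w_k
    # Stage 2: PCEV on the pseudo-outcome formed by the partial components.
    a, r2 = _top_component(*_variance_parts(partial, X))
    # The final component is a linear combination of the original outcomes.
    w = np.zeros(p)
    for k, idx in enumerate(blocks):
        w[idx] = a[k] * block_weights[k]
    return w, r2
```

Because `partial @ a` equals `Y @ w`, the second-stage component can always be written directly in terms of the original outcome variables, as the slide states.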

Second Manuscript–Inference

Double Wishart Problem
• Recall that PCEV is maximising a Rayleigh quotient:
$$R^2(w) = \frac{w^T V_M w}{w^T (V_M + V_R) w}.$$
• Equivalent to finding the largest root $\lambda$ of a double Wishart problem:
$$\det\left(A - \lambda (A + B)\right) = 0,$$
where $A = V_M$ and $B = V_R$.
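One way to see the equivalence, sketched here with Lagrange multipliers and assuming $A + B$ is positive definite: since $R^2(w)$ is invariant to rescaling $w$, fix the scale by $w^T (A + B) w = 1$ and maximise $w^T A w$.

$$
\begin{aligned}
\mathcal{L}(w, \lambda) &= w^T A w - \lambda \left( w^T (A + B) w - 1 \right), \\
\frac{\partial \mathcal{L}}{\partial w} &= 2 A w - 2 \lambda (A + B) w = 0
\quad \Longrightarrow \quad \left( A - \lambda (A + B) \right) w = 0 .
\end{aligned}
$$

A nonzero solution $w$ exists only if $\det\left(A - \lambda (A + B)\right) = 0$. Multiplying the stationarity condition on the left by $w^T$ gives $\lambda = w^T A w / w^T (A + B) w = R^2(w)$, so the maximum of $R^2(w)$ is the largest root $\lambda$.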

Inference
• Evidence in the literature that the null distribution of the largest root $\lambda$ should be related to the Tracy-Widom distribution.
• Result of Johnstone (2008) gives an excellent approximation to the distribution using an explicit location-scale family of the TW(1).

Inference
• However, Johnstone’s theorem requires a rank condition on the matrices (rarely satisfied in high dimensions).
• The null distribution of $\lambda$ is asymptotically equal to that of the largest root of a scaled Wishart (Srivastava).
• The null distribution of the largest root of a Wishart is also related to the Tracy-Widom distribution.
• More generally, random matrix theory suggests that the Tracy-Widom distribution is key in central-limit-like theorems for random matrices.

Empirical Estimate
I proposed to obtain an empirical estimate of the null distribution as follows:
1. Perform a small number of permutations (∼50) on the rows of $Y$;
2. For each permutation, compute the largest root statistic;
3. Fit a location-scale variant of the Tracy-Widom distribution.
Numerical investigations support this approach for computing p-values. The main advantage over a traditional permutation strategy is the computation time.
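The three steps above can be sketched as follows. Several choices here are assumptions and not taken from the talk: the fit uses the method of moments (matching the mean and standard deviation of TW(1)), it is applied to the raw statistic with no prior transformation, and the toy `largest_root` function only works when there are fewer outcomes than samples. In high dimensions a different `statistic` function (for example one built on the block approach) would be passed in.

```python
import numpy as np

# Mean and standard deviation of the standard Tracy-Widom distribution TW(1).
TW1_MEAN = -1.2065335745820
TW1_SD = 1.607781034581 ** 0.5

def largest_root(Y, X):
    """Largest root of det(V_M - lambda (V_M + V_R)) = 0 (needs n - 1 >= p)."""
    Yc = Y - Y.mean(axis=0)
    Xc = X - X.mean(axis=0)
    beta, *_ = np.linalg.lstsq(Xc, Yc, rcond=None)
    fitted = Xc @ beta
    L_inv = np.linalg.inv(np.linalg.cholesky(Yc.T @ Yc))
    return np.linalg.eigvalsh(L_inv @ (fitted.T @ fitted) @ L_inv.T)[-1]

def fit_null_location_scale(Y, X, statistic=largest_root, n_perm=50, seed=0):
    """Estimate (mu, sigma) such that (statistic - mu) / sigma is roughly TW(1)."""
    rng = np.random.default_rng(seed)
    null_stats = np.empty(n_perm)
    for b in range(n_perm):
        # Step 1: permute the rows of Y to break the association with X.
        Y_perm = Y[rng.permutation(Y.shape[0])]
        # Step 2: compute the largest root statistic on the permuted data.
        null_stats[b] = statistic(Y_perm, X)
    # Step 3: match the first two moments of a location-scale TW(1) family.
    sigma = null_stats.std(ddof=1) / TW1_SD
    mu = null_stats.mean() - sigma * TW1_MEAN
    return mu, sigma
```

With `mu` and `sigma` in hand, the observed statistic is standardised as `(largest_root(Y, X) - mu) / sigma` and the p-value is the upper tail probability of TW(1) at that value. Evaluating that tail requires a Tracy-Widom distribution function, which this sketch does not implement.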

Third Manuscript–Application

Data
• Anti-citrullinated Protein Antibody (ACPA) levels were measured in 129 individuals without any symptom of Rheumatoid Arthritis (RA).
• DNA methylation levels were measured from whole-blood samples using a targeted sequencing technique.
• CpG dinucleotides were grouped in regions of interest before the sequencing.
• We have 23,350 regions to analyze individually, corresponding to multivariate datasets $Y_k$, $k = 1, \ldots, 23{,}350$.

Method
• PCEV was performed independently on all regions.
• Significant amount of missing data; complete-case analysis.
• Analysis was adjusted for age, sex, and smoking status.
• ACPA levels were dichotomized into high and low.
• For the 2519 regions with more CpGs than observations, we used the Tracy-Widom empirical estimator to obtain p-values.

Results
• There were 1062 statistically significant regions at the α = 0.05 level.
• Univariate analysis of 175,300 CpG dinucleotides yielded 42 significant results.
• These 42 CpG dinucleotides were in 5 distinct regions.

Discussion

Summary
• This thesis described specific approaches to dimension reduction with high-dimensional datasets.
• Manuscript 1: Block-independence assumption leads to a convenient estimation strategy that is free of tuning parameters.
• Manuscript 2: Empirical estimator provides valid p-values for high-dimensional data by leveraging Johnstone’s theorem.
• Manuscript 3: Application of the ideas in this thesis to a study of the association between ACPA levels and DNA methylation.
• All methods from Manuscripts 1 & 2 are part of the R package pcev.

Limitations
• Inference for PCEV-block is robust to block-independence violations, but estimation is not
  • Could have an impact on downstream analyses.
• Empirical estimator does not address limitations due to power
  • But combining it with a shrinkage estimator should improve power.
• Missing data and multivariate analysis

Future Work
• Estimate effective number of independent tests in region-based analyses
• Multiple imputation and PCEV
• Nonlinear dimension reduction

Thank you
The slides can be found at maxturgeon.ca/talks.
