Differential Inclusion Method in High Dimensional Statistics

Yuan Yao
HKUST
July 14, 2018
Acknowledgements

• Theory:
  • Stanley Osher, Wotao Yin (UCLA)
  • Feng Ruan (Stanford & PKU)
  • Jiechao Xiong, Chendi Huang (PKU)
• Applications:
  • Qianqian Xu, Jiechao Xiong, Chendi Huang, Xinwei Sun (PKU)
  • Lingjing Hu (BCMU)
  • Yifei Huang, Weizhi Zhu (HKUST)
  • Ming Yan, Zhimin Peng (UCLA)
• Grants:
  • National Basic Research Program of China (973 Program), NSFC
Outline

1 R Package: Libra
  • Examples: Linear/Logistic Regression, Ising graphical models
2 From LASSO to Differential Inclusions
  • LASSO and Bias
  • Differential Inclusions
  • A Theory of Path Consistency
3 Large Scale Algorithm
  • Linearized Bregman Iteration
  • Generalizations
4 Variable Splitting
  • A Weaker Irrepresentable/Incoherence Condition
5 Summary
CRAN R package: Libra

http://cran.r-project.org/web/packages/Libra/
Libra (1.6) currently includes

Sparse statistical models:
• linear regression: ISS (differential inclusion), LB
• logistic regression (binomial, multinomial): LB
• graphical models (Gaussian, Ising, Potts): LB

Two types of regularization:
• LASSO: ℓ1-norm penalty
• Group LASSO: ℓ2-ℓ1 penalty
Libra computes regularization paths via Linearized Bregman Iteration (LB): for θ^0 = z^0 = 0 and k ∈ N,

  z^{k+1} = z^k − (α_k / n) Σ_{i=1}^n ∇_θ ℓ(x_i, θ^k)    (1a)
  θ^{k+1} = κ · prox_{‖·‖*}(z^{k+1})                      (1b)

where
• ℓ(x, θ) is the loss function to minimize
• prox_{‖·‖*}(z) := arg min_u { (1/2)‖u − z‖² + ‖u‖* }
• α_k > 0 is the step size
• κ > 0, with α_k κ ‖∇²_θ Ê ℓ(x, θ)‖ < 2 for stability
• as simple as ISTA, and easy to implement in parallel
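As a concrete illustration (not the Libra implementation itself), iteration (1a)-(1b) for sparse linear regression with the ℓ1 penalty can be sketched in Python; the function names and default parameters here are hypothetical, and the prox of the ℓ1 norm is the usual soft-thresholding map:

```python
import numpy as np

def soft_threshold(z, t=1.0):
    # Proximal map of the l1 norm: componentwise shrinkage toward zero.
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lb_path(X, y, kappa=100.0, alpha=1e-3, n_iter=2000):
    """Linearized Bregman iteration for sparse linear regression.

    With squared loss l(x_i, theta) = 0.5 * (y_i - x_i @ theta)^2, the
    averaged gradient in (1a) is X.T @ (X @ theta - y) / n.
    """
    n, p = X.shape
    z = np.zeros(p)
    theta = np.zeros(p)
    path = []
    for _ in range(n_iter):
        grad = X.T @ (X @ theta - y) / n      # average gradient, as in (1a)
        z = z - alpha * grad                  # mirror-variable update (1a)
        theta = kappa * soft_threshold(z)     # prox step (1b)
        path.append(theta.copy())
    return np.array(path)
```

The whole path (from the all-zero model toward dense models) is returned, so an early-stopping point can be chosen afterwards, e.g. by cross-validation; the stability condition α κ ‖X^T X / n‖ < 2 should be respected when choosing the parameters.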
Linear Regression

Linear regression: y = Xβ + ε

β is sparse or group sparse, with two types of penalty:
• "ungrouped": Σ_i |β_i|
• "grouped": Σ_g ( Σ_{i: g_i = g} β_i² )^{1/2}
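For the "grouped" penalty, the prox step (1b) becomes block soft-thresholding: each group is shrunk toward zero in Euclidean norm as a whole. A minimal sketch (the function name and group encoding are hypothetical):

```python
import numpy as np

def group_soft_threshold(z, groups, t=1.0):
    # Proximal map of the grouped penalty sum_g ||z_g||_2:
    # a group survives only if its Euclidean norm exceeds t,
    # and is then rescaled by (1 - t / ||z_g||).
    u = np.zeros_like(z, dtype=float)
    for g in np.unique(groups):
        idx = (groups == g)
        norm_g = np.linalg.norm(z[idx])
        if norm_g > t:
            u[idx] = (1.0 - t / norm_g) * z[idx]
    return u
```

So a group enters the path jointly or not at all, which is the mechanism behind group-sparse paths.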
Example: Diabetes Data

data("diabetes")
attributes(x)
# $dim
# [1] 442  10
# $dimnames[[2]]
# [1] "age" "sex" "bmi" "map" "tc"  "ldl" "hdl" "tch" "ltg" "glu"

lassopath = lars(x, y)
isspath = iss(x, y)
lb(x, y, kappa = 100, alpha = 0.005, family = "gaussian",
   group = "ungrouped", intercept = FALSE, normalize = FALSE)
lb(x, y, kappa = 500, alpha = 0.001, family = "gaussian",
   group = "ungrouped", intercept = FALSE, normalize = FALSE)
LB generates iterative regularization paths
Logistic Regression

Logistic regression:

  log [ P(y = +1 | X) / P(y = −1 | X) ] = Xβ  ⇔  P(y = +1 | X) = e^{Xβ} / (1 + e^{Xβ}) =: σ(Xβ)

β is sparse or group sparse, with the same two types of penalty:
• "ungrouped": Σ_i |β_i|
• "grouped": Σ_g ( Σ_{i: g_i = g} β_i² )^{1/2}
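The same iteration (1a)-(1b) applies with the logistic loss in place of the squared loss; only the gradient changes. A hedged Python sketch (again illustrative, not the Libra code), for labels y ∈ {−1, +1}:

```python
import numpy as np

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def logistic_grad(X, y, theta):
    # Averaged gradient of the logistic loss
    # l(x_i, theta) = log(1 + exp(-y_i * x_i @ theta)) for y_i in {-1, +1}.
    n = X.shape[0]
    return -(X.T @ (y * sigmoid(-y * (X @ theta)))) / n

def lb_logistic(X, y, kappa=100.0, alpha=1e-3, n_iter=2000):
    # Linearized Bregman iteration with the l1 prox (soft-thresholding),
    # mirroring the linear-regression case but with the logistic gradient.
    p = X.shape[1]
    z = np.zeros(p)
    theta = np.zeros(p)
    for _ in range(n_iter):
        z = z - alpha * logistic_grad(X, y, theta)
        theta = kappa * np.sign(z) * np.maximum(np.abs(z) - 1.0, 0.0)
    return theta
```

Stopping after finitely many iterations is itself the regularization: coordinates enter the model one by one as their mirror variables cross the threshold.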
Example: Publications of COPSS Award Winners

• dataset provided by Prof. Jiashun Jin @ CMU
• 3248 papers by 3607 authors between 2003 and the first quarter of 2012, from:
  • the Annals of Statistics, Journal of the American Statistical Association, Biometrika, and Journal of the Royal Statistical Society Series B
• a subset of 382 papers by 35 COPSS award winners
• Question: can we model the coauthorship structure to predict out-of-sample behavior?

[Figure: coauthorship network of the 35 COPSS award winners]
A logistic regression path with early stopping regularization

[Figure: Peter Hall vs. the other COPSS award winners in sparse logistic regression (papers from AoS/JASA/Biometrika/JRSSB, 2003-2012): along the solution path the first coordinates to enter are his true coauthors, merely Tony Cai, R.J. Carroll, and J. Fan]
Sparse Ising Model

All models are wrong, but some are useful (George Box):

  P(x_1, ..., x_p) ∝ exp( Σ_i H_i x_i + Σ_{i,j} J_ij x_i x_j )

• Ising model: x_i = 1 if author i appears in a paper, otherwise 0
• H_i describes the mean publication rate of author i
• J_ij describes the interaction between authors i and j:
  • J_ij > 0: authors i and j collaborate more often than others
  • J_ij < 0: authors i and j collaborate less frequently than others
• sparsity: J_ij = 0 mostly, giving a model of the collaboration network
• learned by maximum composite conditional likelihood with LB
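The composite conditional likelihood is tractable because each node given the rest follows a logistic model, so the composite objective is a sum of p logistic regressions and its gradient can be plugged into (1a). A sketch under stated assumptions (symmetric J with zero diagonal, conditional model P(x_i = 1 | x_{-i}) = σ(H_i + Σ_{j≠i} J_ij x_j); the function name and parameterization are illustrative, not Libra's):

```python
import numpy as np

def ising_composite_grad(X, H, J):
    """Gradient of the negative composite conditional log-likelihood.

    X is an (n, p) binary {0, 1} sample matrix; H is the (p,) field vector
    and J the (p, p) symmetric interaction matrix with zero diagonal.
    """
    n, p = X.shape
    field = H + X @ J                           # (n, p) conditional fields
    resid = 1.0 / (1.0 + np.exp(-field)) - X    # predicted prob minus observed
    gH = resid.mean(axis=0)
    # J_ij appears in both node i's and node j's conditional model,
    # so the interaction gradient is the symmetrized cross-moment:
    gJ = (resid.T @ X + X.T @ resid) / n
    np.fill_diagonal(gJ, 0.0)
    return gH, gJ
```

Running LB on (gH, gJ) with an ℓ1 prox on the off-diagonal entries of J then produces a path of sparse collaboration networks, and early stopping guards against overfitting.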
Early stopping against overfitting in sparse Ising model learning

[Figure: a true Ising model on a 2-D grid; a movie of the LB path]
Application: Sparse Ising Model of COPSS Award Winners

[Figure: Left: LB path of Ising model learning; Right: coauthorship network of the existing data.] Typically, COPSS winners do not often work together; Peter Hall (1951-2016) is the hub among statisticians, like Erdős among mathematicians.