Latent and Network Models with Applications to Finance

Jingchen Liu, Department of Statistics, Columbia University
Joint work with Yunxiao Chen, Xiaoou Li, and Zhiliang Ying
ISFA-Columbia Workshop, June 28, 2016
Modeling multivariate distributions

◮ Multivariate random vector: (R_1, ..., R_J)
◮ Continuous vectors: multivariate Gaussian, multivariate t-distribution, ...
◮ Categorical vectors: loglinear model, ...
◮ Copula
◮ Regression
Latent variable modeling

◮ There exists α such that f(R_1, ..., R_J | α) is simple.
◮ What counts as simple?
◮ Independence, small variance, ...
Graphical representation

[Figure: graphical representation of the latent variable model]
Local independence

f(R_1, ..., R_J | α) = ∏_j f(R_j | α)
Applications

◮ Finance, political science
◮ Education
◮ Psychiatry/psychology
◮ Marketing and e-commerce
Linear factor models

◮ (R_1, ..., R_J) is continuous.
◮ Linear factor models: α = (α_1, ..., α_K), R_j = a_j^⊤ α + ε_j
◮ Principal component analysis
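A minimal sketch (not from the talk) of how principal component analysis recovers the K dominant factor directions in the linear factor model above; all dimensions and values are illustrative.

```python
import numpy as np

# Simulate the linear factor model R_j = a_j^T alpha + eps_j
# (made-up dimensions: J assets, K factors, n days).
rng = np.random.default_rng(0)
J, K, n = 10, 2, 500
A = rng.normal(size=(K, J))                     # loadings a_j stacked as columns
alpha = rng.normal(size=(n, K))                 # daily factor scores
R = alpha @ A + 0.1 * rng.normal(size=(n, J))   # observed returns

# PCA: the top-K eigenvectors of the sample covariance span the loading space.
cov = np.cov(R, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov)          # eigenvalues in ascending order
top = eigvecs[:, -K:]                           # estimated factor directions
```

With K = 2 true factors, the two largest eigenvalues dominate the remaining (noise-level) ones by orders of magnitude, which is the usual visual diagnostic for the number of factors.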
Categorical variables and item response theory models

◮ Binary R_j ∈ {0, 1}.
◮ P(R_j = 1 | α) = e^{a_j^⊤ α − b_j} / (1 + e^{a_j^⊤ α − b_j}), α ∈ R^K

[Figure: logistic item response curve]
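A small sketch of the item response function above; the loading a_j, difficulty b_j, and trait vector α are made-up numbers, not fitted values.

```python
import numpy as np

def irt_prob(a_j, b_j, alpha):
    """P(R_j = 1 | alpha) = exp(a_j^T alpha - b_j) / (1 + exp(a_j^T alpha - b_j))."""
    z = np.dot(a_j, alpha) - b_j
    return 1.0 / (1.0 + np.exp(-z))   # numerically identical logistic form

a_j = np.array([1.0, 0.5])            # discrimination (loading) vector, illustrative
b_j = 0.2                             # difficulty, illustrative
alpha = np.array([0.3, -0.1])
p = irt_prob(a_j, b_j, alpha)         # probability of a positive response
```

The function is monotone increasing in a_j^⊤ α and equals 1/2 exactly when a_j^⊤ α = b_j, matching the S-shaped curve on the slide.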
Stock Price Structure

◮ Data 1: 97 stocks selected from the S&P 100, over 1013 trading days from 2009 to 2014.
◮ Data 2: 117 stocks selected from the SSE 180 (Shanghai Stock Exchange), over 1159 trading days from 2009 to 2014.
Exploratory analysis

◮ The block circled in blue contains mostly energy companies: APA (Apache Corp), APC (Anadarko Petroleum), BHI (Baker Hughes), COP (ConocoPhillips), CVX (Chevron), DVN (Devon), ...
◮ The block circled in black contains financial companies: C (Citigroup), BAC (Bank of America), MS (Morgan Stanley), BK (Bank of New York Mellon), JPM (JPMorgan), ...

[Figure: heatmap of stock-stock correlation (Data 1; based on daily log returns); stocks have been reordered]
Linear factor models

◮ Linear factor models: R_j = a_j^⊤ α + ε_j
◮ Fama-French model: R = R_f + β(K − R_f) + b_s · SMB + b_v · HML + α
Linear factor models

◮ (R_1, ..., R_J) is not multivariate Gaussian in many ways when J is large!
◮ Marginal tails, joint tails, asymmetric correlation, ...
◮ Too many factors!
Nonlinear factor models

◮ Dichotomize: R_ji = 1 if S_i^close > S_i^open for stock j on day i
◮ P(R_j = 1 | α) = e^{a_j^⊤ α − b_j} / (1 + e^{a_j^⊤ α − b_j}), α ∈ R^K
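The dichotomization step can be sketched in a couple of lines; the toy price arrays below (days × stocks) are made up for illustration.

```python
import numpy as np

# R[i, j] = 1 exactly when stock j closes above its open on day i.
open_prices = np.array([[100.0, 50.0],
                        [101.0, 49.0]])
close_prices = np.array([[101.5, 49.5],
                         [100.5, 49.5]])
R = (close_prices > open_prices).astype(int)
```

For these toy prices, stock 1 is up on day 1 and down on day 2, and stock 2 the reverse, so R = [[1, 0], [0, 1]].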
Latent graphical model

[Figures: latent graphical model illustrations]
Issues of concern

◮ Parametric/nonparametric models: latent variable and graph
◮ Inference: identifiability
The latent variable component - IRT model

◮ Alternative formulation:
P(R_j = 1 | α) = e^{a_j^⊤ α − b_j} / (1 + e^{a_j^⊤ α − b_j})  ⇔  P(R_j | α) ∝ e^{R_j (a_j^⊤ α − b_j)}
◮ Local independence:
P(R_1, ..., R_J | α) = ∏_{j=1}^J P(R_j | α) ∝ e^{∑_j R_j (a_j^⊤ α − b_j)}
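A quick numerical sanity check of the local-independence factorization: the product of per-item logistic probabilities equals the exponential-family form e^{∑_j R_j (a_j^⊤ α − b_j)} / ∏_j (1 + e^{a_j^⊤ α − b_j}). All parameter values below are made up.

```python
import numpy as np

A = np.array([[1.0, 0.5, -0.3],
              [0.2, 1.0,  0.8]])      # K x J loadings, illustrative
b = np.array([0.1, -0.2, 0.0])        # difficulties, illustrative
alpha = np.array([0.5, -0.4])
R = np.array([1, 0, 1])               # one observed response pattern

z = A.T @ alpha - b                   # a_j^T alpha - b_j for each item
p = 1 / (1 + np.exp(-z))              # P(R_j = 1 | alpha)

# Product of per-item probabilities...
joint = np.prod(np.where(R == 1, p, 1 - p))
# ...equals the exponential-family expression from the slide.
alt = np.exp((R * z).sum()) / np.prod(1 + np.exp(z))
```

The two expressions agree to machine precision, which is all local independence asserts here.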
The graphical component - Ising model

P(R_1, ..., R_J | S) ∝ e^{(1/2) ∑_{i,j} s_{ij} R_i R_j}

◮ Physics
◮ Graphical representation
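For small J the Ising normalizing constant can be computed by brute force over all 2^J binary configurations; the matrix S below is an arbitrary toy example, not estimated from data.

```python
import itertools
import numpy as np

# Toy interaction matrix: symmetric with zero diagonal, made-up values.
S = np.array([[ 0.0, 1.0, -0.5],
              [ 1.0, 0.0,  0.3],
              [-0.5, 0.3,  0.0]])
J = S.shape[0]

# Enumerate all binary configurations and normalize exp((1/2) r^T S r).
configs = list(itertools.product([0, 1], repeat=J))
weights = np.array([np.exp(0.5 * np.array(r) @ S @ np.array(r)) for r in configs])
probs = weights / weights.sum()       # exact distribution over 2^J states
```

This exact enumeration is only feasible for small J; for the ~100-stock graphs later in the talk, the normalizing constant is intractable, which is one reason estimation there works with pseudo-likelihood-style objectives rather than the full likelihood.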
Latent graphical model: IRT model + Ising model

◮ Local independence no longer holds:
P(R_1, ..., R_J | α) ∝ e^{∑_j R_j (a_j^⊤ α − b_j) + (1/2) ∑_{i,j} s_{ij} R_i R_j}
◮ Simplification: since R_j² = R_j, the −b_j R_j terms can be absorbed into the diagonal of S, giving
P(R_1, ..., R_J | α) ∝ e^{∑_j R_j a_j^⊤ α + (1/2) ∑_{i,j} s_{ij} R_i R_j}
Latent variable and network modeling

◮ The item response function:
f_{A,S}(R | α) ∝ exp{α^⊤ A R + (1/2) R^⊤ S R}
where A_{K×J} = (a_1, ..., a_J) and S_{J×J} = (s_{ij})
◮ Population (prior) distribution such that
f_{A,S}(R, α) ∝ exp{−|α|²/2 + α^⊤ A R + (1/2) R^⊤ S R}
Latent variable and network modeling

◮ Marginalized likelihood:
L(A, S) = ∫ f(R, α) dα ∝ exp{(1/2) R^⊤ (A^⊤ A + S) R}
◮ Let L_{J×J} = A^⊤ A; then
L(L, S) = f(R | L, S) ∝ exp{(1/2) R^⊤ (L + S) R}
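The marginalization rests on the Gaussian integral ∫ exp(−|α|²/2 + α^⊤ c) dα ∝ exp(|c|²/2) with c = A R, which is where the A^⊤ A term comes from. A numerical check of the K = 1 case, with c a stand-in scalar:

```python
import numpy as np

# Verify: integral of exp(-a^2/2 + c*a) over a equals sqrt(2*pi) * exp(c^2/2).
c = 0.7                                          # stand-in for the scalar A R
grid = np.linspace(-10.0, 10.0, 200001)
vals = np.exp(-grid**2 / 2 + c * grid)
integral = vals.sum() * (grid[1] - grid[0])      # fine Riemann sum
closed_form = np.sqrt(2 * np.pi) * np.exp(c**2 / 2)
```

Since the exp(c²/2) factor carries R only through (1/2) R^⊤ A^⊤ A R, the latent factors leave behind exactly the low-rank term L = A^⊤ A in the quadratic form.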
Identifiability

◮ Identifiability of L and S
◮ Low-dimensional latent factor: L_{J×J} = A^⊤ A is positive semi-definite with rank K ≪ J
◮ Small remaining dependence: S is sparse