Stock Movement Prediction from Tweets and Historical Prices
Yumo Xu and Shay B. Cohen
Institute for Language, Cognition, and Computation, School of Informatics, University of Edinburgh
ACL 2018. https://yumoxu.github.io/, yumo.xu@ed.ac.uk
Who cares about stock movements?

No one would be unhappy if they could predict stock movements:
◮ Investors
◮ Governments
◮ Researchers
Background

◮ Two mainstreams in finance: technical and fundamental analysis
◮ Two main content resources in NLP: public news and social media
◮ History of NLP models:
  Feature engineering (before 2010)
  ↓
  Topic models (2013-2015) (generative)
  ↓
  Event-driven neural nets (2014-2015)
  ↓
  Hierarchical attention nets (2018)
However, it has never been easy...

Complexities: the market is highly stochastic, and we make temporally-dependent predictions from chaotic data.
Divide and treat

1. Chaotic market information → Market Information Encoder
   Noisy and heterogeneous
2. High market stochasticity → Variational Movement Decoder
   Random-walk theory (Malkiel, 1999)
3. Temporally-dependent prediction → Attentive Temporal Auxiliary
   When a company suffers a major scandal on a trading day, its stock price tends to keep falling on the coming trading days: public information takes time to be absorbed into movements (Luss and d'Aspremont, 2015), and is thus largely shared across temporally-close predictions
Problem Formulation

Stock Movement Prediction
◮ We estimate the binary movement y ∈ {0, 1}, where 1 denotes rise and 0 denotes fall
◮ Target trading day: d
◮ We use the market information, comprising relevant tweets and historical prices, in the lag [d − ∆d, d − 1], where ∆d is a fixed lag size
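To make the lag concrete, here is a minimal Python sketch of assembling the window [d − ∆d, d − 1] for a target day (lag_window is a hypothetical helper, not the authors' code; real eligibility would additionally require an exchange calendar and tweet/price availability):

    from datetime import date, timedelta

    def lag_window(target_day: date, lag_size: int = 5) -> list[date]:
        """Candidate days [d - lag_size, d - 1] for a target trading day d.
        Non-trading days would still need filtering against an exchange
        calendar; only days with both tweets and prices are eligible."""
        return [target_day - timedelta(days=offset)
                for offset in range(lag_size, 0, -1)]

    # e.g. predict the movement on 2014-08-07 from the five preceding days
    print(lag_window(date(2014, 8, 7)))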
Generative Process

[Plate diagram: observed market information X and movements y; latent factor Z; variational parameters φ and generative parameters θ; plate over the |D| examples]

◮ T eligible trading days in the ∆d lag
◮ Encode observed market information as a random variable X = [x_1; ...; x_T]
◮ Generate the latent driven factor Z = [z_1; ...; z_T]
◮ Generate stock movements y = [y_1, ..., y_T] from X, Z
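A toy numpy sketch of this generative story follows; it is illustrative only: StockNet parameterizes these distributions with neural networks and conditions them on the history, whereas all weights below are random stand-ins.

    import numpy as np

    rng = np.random.default_rng(0)
    T, dim_x, dim_z = 5, 8, 4

    X = rng.normal(size=(T, dim_x))        # encoded market information x_1..x_T
    W_z = rng.normal(size=(dim_x, dim_z))  # toy stand-ins for learned parameters
    w_y = rng.normal(size=dim_x + dim_z)

    Z = np.empty((T, dim_z))
    y = np.empty(T, dtype=int)
    for t in range(T):
        Z[t] = X[t] @ W_z + rng.normal(size=dim_z)   # latent factor z_t | x_t
        p_rise = 1.0 / (1.0 + np.exp(-np.concatenate([X[t], Z[t]]) @ w_y))
        y[t] = rng.binomial(1, p_rise)               # movement y_t (1 = rise, 0 = fall)
    print(y)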
Factorization

◮ For multi-task learning, we model p_θ(y | X) = Σ_Z p_θ(y, Z | X) instead of p_θ(y_T | X)
  Main target: y_T
  Temporal auxiliary target: y* = [y_1, ..., y_{T−1}]
◮ Factorization:

  p_θ(y, Z | X) = p_θ(y_T | X, Z) p_θ(z_T | z_{<T}, X) ∏_{t=1}^{T−1} p_θ(y_t | x_{≤t}, z_t) p_θ(z_t | z_{<t}, x_{≤t}, y_t)
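Since the sum over Z is intractable, training maximizes a variational lower bound instead; a reconstruction of its recurrent form in LaTeX (consistent with the per-step objectives f_t on the ATA slide, which additionally weight the KL term by λ; this is a sketch, not the paper's verbatim equation):

    \log p_\theta(y \mid X) \;\geq\; \sum_{t=1}^{T} \Big(
        \mathbb{E}_{q_\phi}\big[\log p_\theta(y_t \mid x_{\le t}, z_{\le t})\big]
        - D_{\mathrm{KL}}\big[\, q_\phi(z_t \mid z_{<t}, x_{\le t}, y_t)
          \;\big\|\; p_\theta(z_t \mid z_{<t}, x_{\le t}) \,\big] \Big)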
Primary components

[Plate diagram repeated: observed X, y; latent Z; parameters φ, θ; plate over |D|]

1. Market Information Encoder (MIE): encodes X
2. Variational Movement Decoder (VMD): infers Z from X, y and decodes stock movements y from X, Z
3. Attentive Temporal Auxiliary (ATA): integrates temporal loss for training
StockNet architecture

[Figure: StockNet architecture. (b) The Market Information Encoder (MIE) embeds the message corpora with a message embedding layer, Bi-GRUs and attention, and combines them with historical prices over the lag window (e.g. 02/08-06/08). (a) The Variational Movement Decoder (VMD) runs a variational encoder/decoder over hidden states h_1..h_T, sampling z_t ~ N(µ, δ²) with a KL term D_KL against the prior N(0, I). (c) The Attentive Temporal Auxiliary (ATA) applies temporal attention over g_1..g_T to form the training objective for the target day (e.g. 07/08). (d) VAE component.]
Variational Movement Decoder

◮ Goal: recurrently infer Z from X, y and decode y from X, Z
◮ Challenge: posterior inference is intractable in our factorized model

VAE solutions
◮ Neural approximation and reparameterization
◮ Recurrent ELBO
◮ Adopt a posterior approximator q_φ(z_t | z_{<t}, x_{≤t}, y_t) ∼ N(µ, δ² I), where φ = {µ, δ}
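The reparameterization step can be sketched in a few lines of numpy (illustrative; in StockNet, µ and log δ² are produced by the recurrent approximator network rather than passed in directly):

    import numpy as np

    def reparameterize(mu, log_var, rng=np.random.default_rng()):
        """Sample z ~ N(mu, diag(exp(log_var))) via z = mu + delta * eps,
        eps ~ N(0, I), so gradients can flow through mu and log_var
        instead of through the sampling operation itself."""
        eps = rng.normal(size=np.shape(mu))
        return mu + np.exp(0.5 * log_var) * eps

    z_t = reparameterize(np.zeros(4), np.zeros(4))  # toy mu = 0, delta = 1
    print(z_t)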
[StockNet architecture figure repeated; see the placeholder above.]
Interface between VMD and ATA

◮ Integrate the deterministic feature h_t and the latent variable z_t:

  g_t = tanh(W_g [x_t, h_t^s, z_t] + b_g)

◮ Decode movement hypotheses ỹ_t: first the auxiliary targets, then the main target ỹ_T
◮ Temporal attention over g_1, ..., g_{T−1}: combines an information score and a dependency score (against g_T) into the weight vector v*
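A hedged numpy sketch of this interface: compute g_t, then attend over the auxiliary steps with g_T as the query (the dot-product scoring below is a simple stand-in, not the paper's exact combination of information and dependency scores):

    import numpy as np

    rng = np.random.default_rng(1)
    T, dx, dh, dz, dg = 4, 8, 16, 4, 16

    x = rng.normal(size=(T, dx))   # market information x_t
    h = rng.normal(size=(T, dh))   # deterministic decoder features h_t^s
    z = rng.normal(size=(T, dz))   # latent factors z_t
    W_g = 0.1 * rng.normal(size=(dx + dh + dz, dg))
    b_g = np.zeros(dg)

    # g_t = tanh(W_g [x_t, h_t^s, z_t] + b_g)
    g = np.tanh(np.concatenate([x, h, z], axis=1) @ W_g + b_g)

    # attention over the auxiliary steps 1..T-1, with g_T as the query
    scores = g[:-1] @ g[-1]
    v_star = np.exp(scores - scores.max())
    v_star /= v_star.sum()         # normalized temporal weights v*
    print(v_star)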
Attentive Temporal Auxiliary

◮ Break down the approximated objective L into temporal objectives f ∈ R^{T×1}:

  f_t = log p_θ(y_t | x_{≤t}, z_{≤t}) − λ D_KL[q_φ(z_t | z_{<t}, x_{≤t}, y_t) ‖ p_θ(z_t | z_{<t}, x_{≤t})]

◮ Reuse v* to build the final temporal weight vector v ∈ R^{1×T}:

  v = [α v*, 1], where α ∈ [0, 1] controls the overall auxiliary effects

◮ Recompose the final training objective F:

  F(θ, φ; X, y) = (1/N) Σ_{n=1}^{N} v^(n) f^(n)
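Continuing the sketch above, the temporal weighting and the resulting per-example objective could look as follows (the f_t values here are placeholders for the log-likelihood-minus-KL terms):

    import numpy as np

    def weighted_objective(f, v_star, alpha):
        """Weight the per-step objectives f_1..f_T: the T-1 auxiliary
        steps get alpha * v*, the main target day always gets weight 1."""
        v = np.concatenate([alpha * v_star, [1.0]])   # v = [alpha v*, 1]
        return v @ f                                  # scalar objective, one example

    f = np.array([-0.7, -0.6, -0.5, -0.4])   # toy f_t values
    v_star = np.array([0.2, 0.3, 0.5])
    print(weighted_objective(f, v_star, alpha=0.5))  # averaged over N in training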
Experimental setup

◮ Dataset: two-year daily price movements of 88 stocks
  Two components: a Twitter dataset and a historical price dataset
  Training: 20 months, 20,339 movements
  Development: 2 months, 2,555 movements
  Test: 2 months, 3,720 movements
◮ Lag window size: 5
◮ Metrics: accuracy and Matthews Correlation Coefficient (MCC)
◮ Comparative study: five baselines from different genres and five StockNet variants
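MCC is reported alongside accuracy because rise/fall labels can be skewed; a minimal sketch of the standard formula from a confusion matrix:

    import math

    def mcc(tp, tn, fp, fn):
        """Matthews Correlation Coefficient: +1 perfect, 0 chance level,
        -1 total disagreement; robust to imbalanced class distributions."""
        denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
        return (tp * tn - fp * fn) / denom if denom else 0.0

    print(mcc(tp=1000, tn=900, fp=850, fn=970))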
Baselines and variants

Baselines
◮ Rand: a naive predictor making random guesses
◮ ARIMA: Autoregressive Integrated Moving Average
◮ RandForest (Pagolu et al., 2016)
◮ TSLDA (Nguyen and Shirai, 2015)
◮ HAN (Hu et al., 2018)

StockNet variants
◮ HedgeFundAnalyst: fully equipped
◮ TechnicalAnalyst: from prices only
◮ FundamentalAnalyst: from tweets only
◮ IndependentAnalyst: optimizing only the main target
◮ DiscriminativeAnalyst: a discriminative variant
Results

Baseline comparison
◮ An accuracy of 56% is generally reported as a satisfying result (Nguyen and Shirai, 2015)
◮ ARIMA does not yield satisfying results
◮ The two best baselines: TSLDA and HAN

  Baseline models     Acc.    MCC
  Rand                50.89   -0.002266
  ARIMA               51.39   -0.020588
  RandForest          53.08    0.012929
  TSLDA               54.07    0.065382
  HAN                 57.64    0.051800

Variant comparison
◮ The two information sources are integrated effectively
◮ The generative framework incorporates randomness properly

  StockNet variants        Acc.    MCC
  TechnicalAnalyst         54.96   0.016456
  FundamentalAnalyst       58.23   0.071704
  IndependentAnalyst       57.54   0.036610
  DiscriminativeAnalyst    56.15   0.056493
  HedgeFundAnalyst         58.23   0.080796