Medium-Term Downside Risk: Insights from Textual Analysis of News

  1. Medium-Term Downside Risk: Insights from Textual Analysis of News. Charles W. Calomiris and Harry Mamaysky, Columbia Business School

  2. Introduction • Automated processing of natural language is opening a previously unavailable window into market behavior • It may fundamentally transform finance practice • Prior work has focused on very short horizons • But isn’t news (in aggregate) important for longer-horizon outcomes? • We look at: • Longer-term country-level risk and return responses to news • How to measure news at the country level

  3. Our approach and a peek at findings… • We develop a theory-neutral approach to map country news into market outcomes, which measures word flow and examines connections of word flow to risk and return. • We apply this (for the first time, we think) outside the U.S., to 52 countries. • Emerging markets (EMs) and developed markets (DMs) are treated separately, given differences in their return processes. Key findings: 1. Many measures are relevant (sentiment, frequency, entropy), and EMs and DMs differ. 2. Topical context matters. 3. Results change materially over time. 4. News generally has opposite implications for return and risk. 5. Drawdown is useful as a measure of risk, especially for EMs. 6. We capture more than a popular a priori measure, in and out of sample.

  4. 1. Theory-neutral vs. a priori word identifiers: what word flow? • Theory-neutral vs. a priori approaches (Baker, Bloom, and Davis 2016) • A theory-neutral approach does not require advance knowledge of what is important, and avoids data-mining risks. • But is it possible to construct a comprehensive, parsimonious, and flexible theory-neutral model of word flow?

  5. 2. What aspects of news are important? • Sentiment • Frequency • Unusualness (entropy) • Topical context interacted with the above • How do topics differ between EMs and DMs? • How do the effect of news, and the interpretation of news, differ by topic?

  6. 3. Regime changes over time? • Principal components indicate a shift point around the global financial crisis • The a priori shift point lines up with the second principal component • Out-of-sample properties of forecasting in light of this change

  7. 4. How to identify topical context? • Identifying topic-relevant words and their characteristics • Louvain method

  8. 5. Is all news relevant for both returns and risk? • When an effect is statistically significant for both return and sigma or drawdown, will the signs be opposite?

  9. 6. How to measure risk? • Especially in EMs, returns are not normal and there is momentum in returns. • In addition to sigma, we use drawdown (which allows longer-term effects from momentum, skew, and kurtosis to be expressed).
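
To make the contrast between the two risk measures concrete, here is a minimal sketch in Python. The function names and the specific drawdown definition (largest peak-to-trough decline of the cumulative wealth path within the window) are assumptions for illustration; the slides do not spell out the exact formula. The two toy return paths contain the same monthly returns in a different order, so they share the same sigma, but the path with momentum has a much larger drawdown.

```python
import math

def realized_sigma(returns):
    """Sample standard deviation of a list of simple returns."""
    n = len(returns)
    mean = sum(returns) / n
    return math.sqrt(sum((r - mean) ** 2 for r in returns) / (n - 1))

def max_drawdown(returns):
    """Largest peak-to-trough decline of cumulative wealth, as a positive fraction.

    Assumed definition for illustration only.
    """
    wealth, peak, worst = 1.0, 1.0, 0.0
    for r in returns:
        wealth *= 1.0 + r
        peak = max(peak, wealth)
        worst = max(worst, 1.0 - wealth / peak)
    return worst

# Same twelve monthly returns, different ordering (hypothetical numbers).
alternating = [0.01, -0.01] * 6          # no momentum
trending = [0.01] * 6 + [-0.01] * 6      # gains first, then a run of losses

print(realized_sigma(alternating), realized_sigma(trending))
print(max_drawdown(alternating), max_drawdown(trending))
```

Because sigma ignores the ordering of returns, only the drawdown measure picks up the run of consecutive losses.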

  10. 7. How to analyze countries, together or not? • We separate EMs and DMs and analyze each as a panel.

  11. 8. What news source? • Thomson-Reuters provides a common platform, English-language coverage, and a large sample of relevant countries for which there are other data on returns and on various relevant variables.

  12. Text measures defined • Data: • Thomson-Reuters digital news archive, 1996–2015 • 5 million EM and 12 million DM articles • 52 countries (list on next slide) • Text measures: • artcount – number of articles per country per month • entropy – “unusualness” of an article j (Glasserman and Mamaysky 2016): $H_j = -\sum_{i \in \{\text{4-grams}\}} p_i \log m_i$ • Effectively the negative of the average log probability of a word conditional on the preceding words • sentiment – the difference between positive and negative word counts divided by total words in article j: $s_j = (POS_j - NEG_j) / a_j$ • Word sentiment comes from the Loughran–McDonald dictionary
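
The two article-level formulas above can be sketched as follows. This is only an illustration: the word lists are hypothetical stand-ins for the Loughran–McDonald dictionary, the training corpus is a toy, and the smoothing used to estimate $m_i$ (adding 1 to the 4-gram count and 10 to the 3-gram count) is an assumption, not something stated on the slide.

```python
import math
from collections import Counter

# Hypothetical mini word lists; the slides use the Loughran-McDonald dictionary.
POSITIVE = {"gain", "improve", "strong"}
NEGATIVE = {"loss", "default", "weak"}

def sentiment(tokens):
    """s_j = (POS_j - NEG_j) / a_j, where a_j is the total word count."""
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    return (pos - neg) / len(tokens)

def ngrams(tokens, n):
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def entropy(tokens, train_4grams, train_3grams):
    """H_j = -sum_i p_i log m_i over the 4-grams of article j.

    p_i: share of 4-gram i among the article's 4-grams.
    m_i: estimated probability of the 4th word given the first three,
         from training counts with assumed smoothing constants (+1, +10).
    """
    grams = Counter(ngrams(tokens, 4))
    total = sum(grams.values())
    if total == 0:
        return 0.0  # article too short to contain a 4-gram
    h = 0.0
    for gram, count in grams.items():
        m = (train_4grams[gram] + 1) / (train_3grams[gram[:3]] + 10)
        h -= (count / total) * math.log(m)
    return h

# Toy training corpus (hypothetical).
train = "the central bank raised rates the central bank raised rates".split()
train_4 = Counter(ngrams(train, 4))
train_3 = Counter(ngrams(train, 3))

familiar = "the central bank raised rates".split()
unusual = "the central bank cut rates".split()
print(entropy(familiar, train_4, train_3), entropy(unusual, train_4, train_3))
print(sentiment("strong gain despite weak demand".split()))
```

An article whose phrasing has been seen often in the training text gets a lower entropy than one containing word sequences the model has not seen.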

  13. [Table: list of the 52 countries]

  14. Topics • Intuition: find groups of words that co-occur in articles • Details: • 1,240 econ words • Start with 237 words from the index of Beim and Calomiris (2001) and find other words, bigrams, and trigrams from the EM corpus based on cosine similarity • E.g., barriers, currency, parliament, macroeconomist, and World Bank • We have 2 document-term matrices: 5mm x 1,240 for EM and 12mm x 1,240 for DM • Compute the cosine similarity matrix (1,240 x 1,240) • Then do community detection (using the Louvain method for modularity maximization) • Our topics are mutually exclusive (though this is not necessary)
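
The pipeline on this slide (document-term matrix, cosine similarity between words, Louvain community detection) can be sketched on a toy example. The six words and the counts below are invented, and the use of `networkx` and of the raw similarity values as edge weights are assumptions for illustration; the slides do not say which software or weighting was used.

```python
import numpy as np
import networkx as nx
from networkx.algorithms.community import louvain_communities

# Toy document-term matrix: rows are articles, columns are econ words (hypothetical).
words = ["currency", "peso", "devaluation", "parliament", "election", "vote"]
dtm = np.array([
    [2, 1, 1, 0, 0, 0],
    [1, 2, 0, 0, 0, 0],
    [1, 1, 2, 0, 0, 0],
    [0, 0, 0, 1, 2, 1],
    [0, 0, 0, 2, 1, 1],
    [0, 0, 0, 1, 1, 2],
], dtype=float)

# Cosine similarity between word columns (words x words).
norms = np.linalg.norm(dtm, axis=0)
sim = (dtm.T @ dtm) / np.outer(norms, norms)
np.fill_diagonal(sim, 0.0)  # drop self-similarity so the graph has no self-loops

# Weighted graph on words; zero similarities produce no edge.
graph = nx.from_numpy_array(sim)

# Louvain modularity maximization; each community is a set of word indices.
communities = louvain_communities(graph, weight="weight", seed=0)
topics = sorted(sorted(words[i] for i in c) for c in communities)
print(topics)
```

Each word lands in exactly one community, which matches the mutually exclusive topics described above.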

  15. We find 5 topics for each group of countries • The Louvain algorithm returns ~40 word clusters [figure: number of words in each cluster] • Words from small clusters are placed into the big clusters

  16. Topics for EMs

  17. Topics for DMs

  18. Topic similarity across EM and DM

  19. Context-specific sentiment • Let $f_{\tau,j}$ be the fraction of econ words in article j that are about topic τ • Topic sentiment is given by: $s_{\tau,j} = f_{\tau,j} \times s_j$ • Aggregate the article-level measures into daily measures (weighted by the number of overall words in an article) • For a given country, we have 12 daily text measures: • entropy • article count • sMkt / fMkt • sGovt / fGovt • sCorp / fCorp • sComms / fComms • DM/EM specific: sMacro / fMacro (EM), sCredit / fCredit (DM)
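
As a small worked example of the topic-sentiment formula and the word-count-weighted daily aggregation, consider the sketch below. The function names and the two articles are hypothetical.

```python
def topic_sentiment(topic_fraction, article_sentiment):
    """s_{tau,j} = f_{tau,j} * s_j for one article and one topic."""
    return topic_fraction * article_sentiment

def daily_measure(values, word_counts):
    """Average of article-level values, weighted by each article's total word count."""
    total = sum(word_counts)
    return sum(v * w for v, w in zip(values, word_counts)) / total

# Hypothetical articles for one country-day: (f_Mkt, s_j, total words).
articles = [(0.50, -0.04, 200), (0.25, 0.02, 600)]

fractions = [f for f, _, _ in articles]
sentiments = [topic_sentiment(f, s) for f, s, _ in articles]
lengths = [n for _, _, n in articles]

f_mkt_daily = daily_measure(fractions, lengths)   # fMkt for the day
s_mkt_daily = daily_measure(sentiments, lengths)  # sMkt for the day
print(f_mkt_daily, s_mkt_daily)
```

The longer article dominates the daily value, so a short, very negative article moves the daily measure less than its raw sentiment would suggest.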

  20. Principal components: EM sentiment • For 140 EM sentiment series (28 countries x 5 topics) we look at the first 2 principal components • PC2 – relative sentiment of Markets to Government • Some evidence of a regime shift in PC2 a little before the financial crisis

  21. Principal components: DM sentiment • For 120 DM sentiment series (24 countries x 5 topics) we look at the first 2 principal components • PC2 – relative sentiment of Markets to Government (again!) • Some evidence of a regime shift in PC2 a little before the financial crisis
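
Extracting the first two principal components from a panel of sentiment series can be sketched with a singular value decomposition. The data here are simulated (one common factor plus noise), and standardizing each series before the decomposition is an assumption; the slides do not describe the preprocessing.

```python
import numpy as np

def first_two_pcs(series):
    """series: T x K array (periods x sentiment series).

    Returns scores (T x 2), loadings (K x 2), and the share of total
    variance explained by each of the two components.
    """
    z = (series - series.mean(axis=0)) / series.std(axis=0)  # standardize columns
    u, s, vt = np.linalg.svd(z, full_matrices=False)
    scores = u[:, :2] * s[:2]
    loadings = vt[:2].T
    share = (s ** 2 / np.sum(s ** 2))[:2]
    return scores, loadings, share

# Simulated stand-in for country-topic sentiment series (hypothetical).
rng = np.random.default_rng(0)
T, K = 240, 10
common = rng.standard_normal((T, 1))
data = common + 0.5 * rng.standard_normal((T, K))

scores, loadings, share = first_two_pcs(data)
print(share)
```

Plotting the second column of `scores` over time is one way to look for the kind of regime shift in PC2 mentioned on these slides.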

  22. Event studies • High-frequency top and bottom deciles of sentiment • Middle as placebo • Returns lead major sentiment indicators at high frequency • Some post-event drift for positive and negative events

  23. Event studies – EM • Cumulative abnormal return around deciles of daily news events • Middle column is a control for boring news • Some topics show post-event drift: Mkt (both), Comms (negative) • This is very different from single-name results, where there is little evidence of drift after negative news (only after positive news)!

  24. Event studies – DM • Some topics show post-event drift: Mkt (negative, both?), Corp (positive), Credit (both)
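
A bare-bones version of such an event study is sketched below: pick the days whose sentiment falls in a given quantile band, then average the cumulative abnormal return path around those days. The abnormal returns are taken as given, the data are simulated so that sentiment moves with same-day returns, and the window length and function name are assumptions.

```python
import numpy as np

def average_car(abn_ret, sentiment, lo_q, hi_q, window=5):
    """Average cumulative abnormal return over [-window, +window] around
    days whose sentiment lies between the lo_q and hi_q quantiles."""
    lo, hi = np.quantile(sentiment, [lo_q, hi_q])
    days = np.where((sentiment >= lo) & (sentiment <= hi))[0]
    # Keep only event days with a full window on both sides.
    days = days[(days >= window) & (days < len(abn_ret) - window)]
    paths = [np.cumsum(abn_ret[d - window:d + window + 1]) for d in days]
    return np.mean(paths, axis=0)

# Simulated daily data (hypothetical).
rng = np.random.default_rng(0)
n_days = 2000
abn_ret = 0.01 * rng.standard_normal(n_days)
sentiment = abn_ret + 0.005 * rng.standard_normal(n_days)

car_top = average_car(abn_ret, sentiment, 0.9, 1.0)      # most positive news days
car_mid = average_car(abn_ret, sentiment, 0.45, 0.55)    # placebo: middle of the distribution
car_bottom = average_car(abn_ret, sentiment, 0.0, 0.1)   # most negative news days
print(car_top[-1], car_mid[-1], car_bottom[-1])
```

Post-event drift would show up as the top and bottom paths continuing to move after the event day; in this simulation there is none by construction.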

  25. Regression results • We run panel regressions with dependent variables given by: return, return¹², sigma, and drawdown • We control for many variables that have been shown to have some forecasting power for future returns (next slide) • The regression without text measures is our Baseline model • All text measures (except entropy) are normalized to unit variance • We run the full sample, and the 1st and 2nd halves of the sample

  26. Control variables

  27. Summary of regression results • News matters for EM and DM! • Results differ across EM and DM (e.g., artcount matters in EMs) • Baseline R² is lower for EM • % increase in R² from text measures is larger for EM • The sign of effects (i.e., good news or bad) is almost always consistent across return, sigma, and drawdown • Context matters: positive sentiment in Govt and Corp is bad news; positive sentiment in Mkt is good news • Incremental explanatory power is largest for return¹² and drawdown, and lower for return and sigma • Evidence of state dependence, especially for entropy, which goes from “bad” pre-crisis to “good” post-crisis

  30. Out-of-sample testing • Do we have too many explanatory variables? • What about regime shifts? • Check out-of-sample forecasting performance • Run rolling 5-year regressions in t−60, …, t−1 for forecasting month t outcomes • Lasso (least absolute shrinkage and selection operator): $\min_{\beta} \frac{1}{2N} \sum_{i,t} (y_{i,t} - x_{i,t-1}' \beta)^2 + \lambda \lVert \beta \rVert_1$ • Lasso does shrinkage and model selection • Amount of shrinkage given by $\lVert \beta \rVert_1 / \lVert \beta_{OLS} \rVert_1$
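
The lasso objective and the rolling 60-month scheme can be sketched as below. This is a simplified illustration under several assumptions: a single simulated series rather than a country panel, no intercept (variables are treated as already demeaned), a fixed penalty `lam`, and a plain coordinate-descent solver written out by hand. None of these choices come from the slides.

```python
import numpy as np

def soft_threshold(z, t):
    return np.sign(z) * max(abs(z) - t, 0.0)

def lasso_cd(X, y, lam, n_iter=100):
    """Minimize (1/2N) * ||y - X b||^2 + lam * ||b||_1 by coordinate descent."""
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    for _ in range(n_iter):
        for k in range(p):
            # Residual with feature k's current contribution added back.
            resid = y - X @ beta + X[:, k] * beta[k]
            rho = X[:, k] @ resid / n
            beta[k] = soft_threshold(rho, lam) / col_sq[k]
    return beta

def rolling_forecasts(X, y, lam, window=60):
    """Forecast y[t] from a lasso fit on rows t-window, ..., t-1.

    Row t of X is assumed to hold the predictors dated t-1.
    """
    preds = []
    for t in range(window, len(y)):
        beta = lasso_cd(X[t - window:t], y[t - window:t], lam)
        preds.append(X[t] @ beta)
    return np.array(preds)

# Simulated data (hypothetical): only the first predictor matters.
rng = np.random.default_rng(0)
T, p = 120, 5
X = rng.standard_normal((T, p))
y = 0.5 * X[:, 0] + 0.1 * rng.standard_normal(T)

beta_lasso = lasso_cd(X, y, lam=0.05)
beta_ols = np.linalg.lstsq(X, y, rcond=None)[0]
shrinkage = np.abs(beta_lasso).sum() / np.abs(beta_ols).sum()  # ||b||_1 / ||b_OLS||_1
forecasts = rolling_forecasts(X, y, lam=0.05, window=60)
print(beta_lasso, shrinkage, forecasts.shape)
```

A shrinkage ratio near 1 means the lasso fit is close to OLS, while a ratio near 0 means most coefficients have been shrunk to zero.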

  31. Rolling lasso for DM drawdown

  32. Rolling lasso for EM drawdown

  33. Out-of-sample performance • Naïve model forecasts using country fixed effects • Base model includes only the non-text variables • CM includes all text measures • All models estimated using lasso

  34. Comparison to Baker, Bloom and Davis (BBD) • The two types of measures are correlated. • BBD has incremental value over Baseline for three market measures only for DMs. • In the in-sample panel regressions, our measures subsume BBD for explaining return, sigma, and drawdown.

  35. Out-of-sample comparisons to EPU • EPU (economic policy uncertainty) counts articles from 10 major newspapers that contain triplets from {uncertainty} x {economic} x {policy terms} • For 5 EM and 11 DM countries where we have EPU data, compare out-of-sample performance of Base vs. Base + alternative text measures
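
The triplet rule described above (an article counts if it contains at least one term from each of the three sets) can be sketched as follows. The term lists are short illustrative stand-ins and the function names are hypothetical. The sketch stops at the raw article count and does not attempt any scaling or normalization of the counts.

```python
import re

# Illustrative term sets (assumed, not the full lists behind the EPU index).
UNCERTAINTY = {"uncertain", "uncertainty"}
ECONOMIC = {"economic", "economy"}
POLICY = {"regulation", "deficit", "legislation", "congress"}

def is_epu_article(text):
    """True if the article has at least one term from each of the three sets."""
    tokens = set(re.findall(r"[a-z]+", text.lower()))
    return bool(tokens & UNCERTAINTY) and bool(tokens & ECONOMIC) and bool(tokens & POLICY)

def epu_count(articles):
    """Number of articles matching the triplet rule."""
    return sum(is_epu_article(a) for a in articles)

sample = [
    "Economic uncertainty rises as Congress debates the deficit.",
    "The economy grew faster than expected last quarter.",
    "Uncertainty over new regulation weighs on bank shares.",
]
print(epu_count(sample))
```

This is an a priori measure in the sense used earlier in the deck: the relevant words are fixed in advance rather than discovered from the corpus.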

  36. Conclusions • Useful information in text for medium-term country-level outcomes (returns and cumulative downside risk) • Different dimensions of text matter • In particular, context matters for sentiment • Effects differ across EM and DM, and over time • Evidence of out-of-sample forecasting ability • Next: • Currency effects • Connect to GDP nowcasting (Jungian subconscious?), Fed Beige Book
