Silicon Valley Big Data in Real Time: An Approach to Predictive - PowerPoint PPT Presentation

GPU Technology Conference 2015 Silicon Valley Big Data in Real Time: An Approach to Predictive Analytics for Alpha Generation and Risk Management Yigal Jhirad and Blay Tarnoff March 19, 2015

Table of Contents
I. State-Space Models: State Instantiation
   - Portfolio and Risk Management: Big Data/Real Time Sensitivity
II. Parallel Approach/Predictive Analytics
   - Cluster Analysis: Coherence, Correlation, Cointegration
   - Evolutionary Optimization
III. Summary
IV. Author Biographies

DISCLAIMER: This presentation is for information purposes only. The presenter accepts no liability for the content of this presentation, or for the consequences of any actions taken on the basis of the information provided. Although the information in this presentation is considered to be accurate, this is not a representation that it is complete or should be relied upon as a sole resource, as the information contained herein is subject to change.

State-Space Models

Portfolio/Risk Management
- Financial markets display discrete dynamic micro states and regimes
  - Traditional valuation techniques may not be as effective, e.g. in options markets
  - Implied Volatility vs. Jump Diffusion Regime Shifts
- Big Data: Time Series Tick Data, Interday, and Intraday
- Predictive Analytics: Identifying Structural Breaks
- Alpha Generation, Risk Management, Market Impact

Parallel Processing: APL/CPU and CUDA/GPU physics-based modeling exploits hardware efficiently
- Array-based processing paradigm: a matrix/vector thought process is key
- APL is a programming language whose basic data object is the array; this array orientation is fundamental to parallel processing and lets APL exploit parallelism across CPUs
- CUDA leverages GPU hardware

Application in Econometrics and Applied Mathematics
- Monte Carlo Simulations
- Fourier Analysis
- Principal Components
- Optimization
- Cluster Analysis
- Cointegration

Cluster Analysis: Correlation & Cointegration

Cluster Analysis: A multivariate technique designed to identify relationships and cohesion
- Factor Analysis, Risk Model Development

Correlation Analysis: Pairwise analysis of data across assets. Each pairwise comparison can be run in parallel.
- Use Correlation or Cointegration as primary input to cluster analysis
- Apply proprietary signal filter to remove selected data and reduce spurious correlations

Evolutionary Optimization

Asymptotic Multi-Phase Optimization
- Identify a target portfolio of stocks that is trending consistently over consecutive periods using specialized, possibly time-sensitive, optimization algorithms
- Establish a portfolio of stocks that is performing in a cohesive way

Identifying and Assessing factors driving outperformance
- Optimize a basket of factors to track target portfolio
- Look at factors such as Value vs. Growth, Large Cap vs. Small Cap

Optimized Factor Attribution of Targeted Portfolio

Factor                            Relative Ranking
Cash/Assets                       1
Capex/Assets                      2
Dividend Yield                    3
Dividend Growth (1 and 5 Year)    4
Market Cap (High - Low)           5
Dividend Payout                   6
ROIC                              7
E/P                               8

Indicative factor attribution of target portfolio. Period: 2nd half of 2013.

Cluster Analysis: Correlation

Application example: Correlation function removing N/A values

Correlation measures the direction and strength of a linear relationship between variables. The Pearson product-moment correlation between two variables X and Y is calculated as:

$$ r_{XY} = \frac{\sum_i (X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\sum_i (X_i - \bar{X})^2 \, \sum_i (Y_i - \bar{Y})^2}} $$

For N assets there are N(N-1)/2 unique correlation pairs.

Given an N x M matrix A in which each row is a list of returns for a particular equity, return an N x N matrix R in which each element is the scalar result of correlating one row of A with another.
- Each element of A may be an N/A value
- When processing a pair of rows, the calculation must exclude each N/A value together with the corresponding element in the other row. This requires evaluating each pair separately.
- The resulting increase in computational demand is handled more effectively by a parallel processing solution. As the matrix size increases, the benefits of parallel processing become more significant.
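The problem statement above can be sketched serially in plain Python. This is an illustrative reference version only, not the presenters' kernel: the function names, the use of `None` to stand for N/A, the choice to return `None` for undefined correlations, and the 1.0 diagonal are all assumptions made for the example.

```python
import math
from itertools import combinations


def pair_correlation(x, y):
    """Pearson correlation of two rows, dropping every position
    where either row holds an N/A (represented here as None)."""
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    n = len(pairs)
    if n < 2:
        return None  # assumption: too few valid observations -> undefined
    mean_x = sum(a for a, _ in pairs) / n
    mean_y = sum(b for _, b in pairs) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in pairs)
    sxx = sum((a - mean_x) ** 2 for a, _ in pairs)
    syy = sum((b - mean_y) ** 2 for _, b in pairs)
    if sxx == 0 or syy == 0:
        return None  # assumption: a constant row has no defined correlation
    return sxy / math.sqrt(sxx * syy)


def correlation_matrix(a):
    """Build the N x N matrix R from the N x M matrix A.
    Each of the N(N-1)/2 unique pairs is evaluated separately,
    because the set of valid columns differs from pair to pair."""
    n = len(a)
    r = [[None] * n for _ in range(n)]
    for i in range(n):
        r[i][i] = 1.0  # assumption: diagonal fixed at 1.0 for illustration
    for i, j in combinations(range(n), 2):
        r[i][j] = r[j][i] = pair_correlation(a[i], a[j])
    return r


# Hypothetical returns for three assets over four periods.
returns = [
    [0.01, 0.02, None, 0.04],
    [0.02, 0.04, 0.05, 0.08],
    [0.03, None, 0.01, -0.01],
]
R = correlation_matrix(returns)
unique_pairs = len(list(combinations(range(len(returns)), 2)))
```

Because every pair is independent of every other pair, the loop over `combinations` is the part that maps naturally onto parallel hardware.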

Cluster Analysis: Configuration

Tesla K20c Hardware Constraints
- Compute capability: 3.5
- Warp size: 32
- Number of shared memory banks: 32
- Coalescence capacity (4-byte words): 128 bytes = 32 contiguous words
- Max blocks / warps / threads per multiprocessor: 16 / 64 / 2048
- Max registers per multiprocessor: 64K
- Max shared memory per multiprocessor: 48K

Software Constraints/Response
- Run 32 independent correlations per block for output coalescence
- Read 32 prices/returns per correlation for input coalescence
- Registers used per thread: 33 => max 15 blocks per SM @ 4 warps per block
- Shared memory used per block: 256 + 384 per warp => max 16 blocks per SM @ 4 warps per block
- Thread block configuration: 4 x 32 => 15 blocks / 60 warps / 1920 threads per SM
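The occupancy figures above can be reproduced with simple integer arithmetic. The sketch below is a simplification under stated assumptions: it treats the shared-memory figures as bytes, and it ignores the register and shared-memory allocation granularity that real hardware applies, so it is not a substitute for NVIDIA's occupancy tools. The function and variable names are hypothetical.

```python
# Per-multiprocessor limits as listed for the Tesla K20c above.
MAX_BLOCKS, MAX_WARPS, MAX_THREADS = 16, 64, 2048
MAX_REGISTERS = 64 * 1024
MAX_SHARED_BYTES = 48 * 1024  # assumption: the 48K figure is in bytes
WARP_SIZE = 32


def blocks_per_sm(warps_per_block, regs_per_thread, shared_per_block):
    """Return the number of resident blocks per SM and the per-resource limits.
    Simplified: allocation granularity is ignored."""
    threads_per_block = warps_per_block * WARP_SIZE
    limits = {
        "blocks": MAX_BLOCKS,
        "warps": MAX_WARPS // warps_per_block,
        "threads": MAX_THREADS // threads_per_block,
        "registers": MAX_REGISTERS // (regs_per_thread * threads_per_block),
        "shared": MAX_SHARED_BYTES // shared_per_block,
    }
    return min(limits.values()), limits


warps_per_block = 4
# Assumption: "256 + 384 per warp" is read as bytes per block.
shared_per_block = 256 + 384 * warps_per_block
resident, limits = blocks_per_sm(warps_per_block, 33, shared_per_block)
resident_warps = resident * warps_per_block
resident_threads = resident_warps * WARP_SIZE
```

Under these assumptions the register budget is the binding constraint (15 blocks), while shared memory alone would allow more blocks than the hardware cap of 16, which is consistent with the 15 blocks / 60 warps / 1920 threads configuration listed above.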

Cluster Analysis: Kernel Operation
Each pair of rows produces a single correlation.
[Figure: Input matrix and Output matrix]

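Cluster Analysis: Kernel Correlation computation

$$ r_{XY} = \frac{\sum_i (X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\sum_i (X_i - \bar{X})^2 \, \sum_i (Y_i - \bar{Y})^2}} $$

Two passes through prices/returns in global memory are required.

First pass:
- Compute the means of X and Y: $\bar{X}$, $\bar{Y}$

Second pass:
- Subtract the means from X and Y: $X' = X - \bar{X}$, $Y' = Y - \bar{Y}$
- Sum $X' \times Y'$, $X' \times X'$ and $Y' \times Y'$: $\sum X'Y'$, $\sum X'^2$, $\sum Y'^2$
- Multiply the resultant $\sum X'Y'$ by the reciprocal square roots of $\sum X'^2$ and $\sum Y'^2$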
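The two passes can be illustrated for a single pair of rows with a serial Python sketch. It uses a 0/1 validity mask in place of branching, in the spirit of the N values used for NA handling in the kernel; the function name, the `None` representation of N/A, and the `None` return for undefined results are assumptions for the example, not the presenters' code.

```python
import math


def masked_correlation(x, y):
    """Two-pass Pearson correlation for one pair of rows.
    mask[i] is 1 where both x[i] and y[i] are valid, 0 otherwise."""
    mask = [1 if (a is not None and b is not None) else 0 for a, b in zip(x, y)]
    # Replace invalid entries by 0.0 so arithmetic stays well defined.
    xs = [a if m else 0.0 for a, m in zip(x, mask)]
    ys = [b if m else 0.0 for b, m in zip(y, mask)]
    n = sum(mask)
    if n < 2:
        return None  # assumption: undefined with fewer than two valid pairs

    # First pass: means over valid positions only.
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    # Second pass: subtract means, multiply, and sum; the mask zeroes
    # the contribution of every position where either value was N/A.
    sum_xy = sum_xx = sum_yy = 0.0
    for a, b, m in zip(xs, ys, mask):
        dx = m * (a - mean_x)
        dy = m * (b - mean_y)
        sum_xy += dx * dy
        sum_xx += dx * dx
        sum_yy += dy * dy
    if sum_xx == 0.0 or sum_yy == 0.0:
        return None  # assumption: constant input has no defined correlation

    # Final step: multiply by the reciprocal square roots.
    return sum_xy * (1.0 / math.sqrt(sum_xx)) * (1.0 / math.sqrt(sum_yy))


# Hypothetical rows with one N/A each.
r = masked_correlation([1.0, 2.0, None, 4.0, 5.0], [2.0, 1.0, 3.0, None, 6.0])
```

Two passes are needed because the means depend on which positions survive the mask, and that set is only known once the pair has been read.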

Cluster Analysis: Kernel First Pass: Computation of means (row 33, col 0 of block grid)
Repeat: read the next section of Prices/Returns into registers.
Ns are necessary for NA handling: "0" where either Xs or Ys is NA; "1" where Xs and Ys are both valid.

Cluster Analysis: Kernel First Pass: Computation of means (row 33, col 0 of block grid)
Repeat: sum the values in registers into shared memory.

Cluster Analysis: Kernel First Pass: Computation of means (row 33, col 0 of block grid)
Sum rows to obtain totals for each row of Prices/Returns.
Save totals to shared memory.
Divide totals by N.
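The first-pass pattern described above (read a section, accumulate, then sum across the row and divide by N) can be mimicked serially. In this sketch each of 32 "lanes" keeps its own running totals, standing in for per-thread accumulation, and the final `sum` over lanes stands in for the row reduction. The section width of 32, the names, and the `None` representation of N/A are assumptions for illustration; the sketch does not model registers, shared memory, or actual parallel execution.

```python
SECTION = 32  # assumption: one section is 32 values wide, matching the warp size


def masked_means(x, y, section=SECTION):
    """Means of x and y over positions where both are valid,
    accumulated section by section into per-lane partial totals."""
    lane_x = [0.0] * section
    lane_y = [0.0] * section
    lane_n = [0] * section  # per-lane count of valid pairs (the N values)
    for start in range(0, len(x), section):
        for lane in range(section):
            i = start + lane
            if i >= len(x):
                break  # last section may be partial
            if x[i] is not None and y[i] is not None:
                lane_x[lane] += x[i]
                lane_y[lane] += y[i]
                lane_n[lane] += 1
    # Reduction across lanes, then divide the totals by N.
    total_n = sum(lane_n)
    if total_n == 0:
        return None  # assumption: no valid pairs -> undefined
    return sum(lane_x) / total_n, sum(lane_y) / total_n, total_n


# Hypothetical data: 70 periods with one N/A in each row.
xs = [float(i) for i in range(70)]
ys = [2.0 * i for i in range(70)]
xs[5] = None
ys[40] = None
mean_x, mean_y, n_valid = masked_means(xs, ys)
```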

Cluster Analysis: Kernel Second Pass: Computation of correlations
Read a section of Prices/Returns into registers (reuse registers from first pass).
Ns are necessary for NA handling: "0" where either Xs or Ys is NA; "1" where Xs and Ys are both valid.

Cluster Analysis: Kernel Second Pass: Computation of correlations
Subtract the means of X and Y.

Cluster Analysis: Kernel Second Pass: Computation of correlations
Multiply and sum (reuse X, Y, N from first pass).

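Cluster Analysis: Kernel Second Pass: Computation of correlations
Sum rows to obtain totals of the terms for each row of Prices/Returns: the sums of the mean-subtracted products $\sum (X - \bar{X})(Y - \bar{Y})$, $\sum (X - \bar{X})^2$ and $\sum (Y - \bar{Y})^2$.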

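Cluster Analysis: Kernel Second Pass: Computation of correlations
For each pair of assets, multiply $\sum (X - \bar{X})(Y - \bar{Y})$ by rsqrt($\sum (X - \bar{X})^2$) and rsqrt($\sum (Y - \bar{Y})^2$) to obtain the correlation.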

Cluster Analysis: Kernel Output to global memory: row 33, col 0 of block grid
[Figure: output matrix with both axes labeled Assets]

Cluster Analysis: Big Data, Real Time: Host/Global Memory
- Pinned-mapped memory eliminates transfer complexity and reduces overhead
- Kernel processing speed is approximately 0.1 seconds for 586 assets × 120 periods
[Figure: Input and Output]

Cluster Analysis: Big Data, Real Time: Streaming data source

Overview
- Data source constantly streaming tick data
- Simple routine periodically aggregates ticks into VWAPs or returns, producing an array of prices, one per ticker
- VWAPs/returns loaded into one column of the input Prices/Returns matrix in host memory at every interval
- Aggregation by period enables arbitrary real-time speed, limited by the speed of the GPU kernel algorithm

[Figure: Streaming data source, then routine to periodically aggregate ticks into VWAPs/returns, then VWAPs, then GPU, then to target application]
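The aggregation step in the overview above can be sketched as follows. The tick format `(ticker, price, size)`, the function names, the ticker symbols, and the use of `None` for a ticker with no trades in the interval are all assumptions made for the example; the presentation does not describe the actual routine.

```python
from collections import defaultdict


def vwap_column(ticks, tickers):
    """Aggregate one interval's ticks into a column of VWAPs, one per ticker.
    VWAP = sum(price * size) / sum(size); None if the ticker did not trade."""
    notional = defaultdict(float)
    volume = defaultdict(float)
    for ticker, price, size in ticks:
        notional[ticker] += price * size
        volume[ticker] += size
    return [notional[t] / volume[t] if volume[t] > 0 else None for t in tickers]


def append_interval(matrix, column):
    """Load the new column into the Prices/Returns matrix (one row per ticker)."""
    for row, value in zip(matrix, column):
        row.append(value)


# Hypothetical tickers and ticks for a single interval.
tickers = ["AAA", "BBB", "CCC"]
ticks = [("AAA", 10.0, 100), ("AAA", 11.0, 300), ("BBB", 20.0, 50)]
matrix = [[] for _ in tickers]
append_interval(matrix, vwap_column(ticks, tickers))
```

A ticker with no ticks in the interval yields an N/A entry, which is exactly the case the NA-aware correlation has to handle downstream.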

Summary
- State-Space model instantiation using Intraday Time Series data for Predictive Analytics
- Our approach is based on Time Series Cluster Analysis and Evolutionary Optimization
  - Identifying Structural Breaks, Patterns, and Signals
  - Coherency and Group membership
- Applications in Portfolio Management for Alpha Generation and Risk Management
