Step by step implementation and optimization of simulations in quantitative finance
Lokman Abbas-Turki (UPMC, LPMA)
GTC Europe Lab, 10-12 October 2017
Plan
1. Introduction
2. Local volatility challenges GPUs
3. Monte Carlo and OpenACC parallelization
4. Monte Carlo and CUDA parallelization
5. PDE formulation & Crank-Nicolson scheme
6. PDE simulation and CUDA parallelization
Introduction
What makes bankers change their mind?
◮ CVA (Credit Valuation Adjustment) and XVA (X = C, D, F, K, M Valuation Adjustment) are tipping-point applications.
◮ FRTB (Fundamental Review of the Trading Book) is the other important application, with a deadline in 2019.
◮ Electronic trading and deep learning.
◮ Lloyd Blankfein declared about Goldman Sachs: "We are a technology firm".

HPC in banks
◮ From distribution to parallelization.
◮ From small to big nodes.
◮ The efficiency of GPUs becomes undeniable.
◮ Use of .NET, C, C++ and C#.

Remaining challenges and fears
◮ Code management.
◮ Possible conflicts within quant teams.
◮ Can we extend the results for toy models to more general models?
Local volatility challenges GPUs
Local volatility from implied volatility

First array: $r_g$

\[ dS_t = S_t\, r_g(t)\, dt + S_t\, \sigma_{loc}(S_t, t)\, dW_t, \qquad S_0 = x_0. \]

◮ $S$ is the stock price process, where $x_0$ is the spot price
◮ $W$ is a Brownian motion with $W_0 = 0$
◮ $r_g$ is the risk-free rate, assumed piecewise constant
◮ $\sigma_{loc}(x, t)$ is a local volatility function: $\mathbb{R}_+^* \times \mathbb{R}_+ \to \mathbb{R}_+^*$

Dupire equation: given a family $(C(K, T))_{K, T}$ of call prices with strike $K$ and maturity $T$,

\[ \sigma_{loc}^2(K, T) = 2\, \frac{\partial C/\partial T + K\, r_g(T)\, \partial C/\partial K}{K^2\, \left( \partial^2 C/\partial K^2 \right)} \]

From implied to local: using the Black & Scholes implied volatility $\sigma_{imp}(x, t)$, Andersen and Brotherton-Ratcliffe (1997) showed that

\[ \sigma_{loc}^2(x, t) = \frac{2\, \frac{\partial \sigma_{imp}}{\partial t} + \frac{\sigma_{imp}}{t} + 2 x\, r_g(t)\, \frac{\partial \sigma_{imp}}{\partial x}}{x^2 \left[ \frac{\partial^2 \sigma_{imp}}{\partial x^2} - d_+ \sqrt{t} \left( \frac{\partial \sigma_{imp}}{\partial x} \right)^2 + \frac{1}{\sigma_{imp}} \left( \frac{1}{x \sqrt{t}} + d_+ \frac{\partial \sigma_{imp}}{\partial x} \right)^2 \right]} \]

with

\[ d_+ = \frac{1}{2}\, \sigma_{imp} \sqrt{t} + \frac{\log(x_0 / x) + \int_0^t r_g(u)\, du}{\sigma_{imp} \sqrt{t}} \]
SVI to simulate implied volatility

+ 2 arrays: $T_g$, $K_g$

For $(x, t) \in K_g \times T_g$, we assume that the observed implied volatility is

\[ \sigma_{imp}(x, t) = \sqrt{\frac{w(x, t)}{t}} \]

where the cumulative variance is parameterized by

\[ w(x, t) = \frac{\theta_t}{2} \left\{ 1 + \rho\, \varphi(\theta_t) \left[ \log\left( \frac{x}{x_0} \right) - \int_0^t r_g(u)\, du \right] + \sqrt{ \left( \varphi(\theta_t) \left[ \log\left( \frac{x}{x_0} \right) - \int_0^t r_g(u)\, du \right] + \rho \right)^2 + (1 - \rho^2) } \right\} \qquad (1) \]

with

\[ \varphi(\theta_t) = \frac{\eta}{\theta_t^{\gamma}\, (1 + \theta_t)^{1 - \gamma}} \quad \text{and} \quad \theta_t = a^2 \left( t + b\, (1 - e^{-\lambda t}) \right). \]

All parameters of (1) are discussed in Gatheral & Jacquier (2013).

Interpolation on $]0, \max(K_g)] \times ]0, \max(T_g)]$: we compute some values of $\sigma_{imp}(x, t)$ on the grid $K_g \times T_g$. Then, we interpolate these values to obtain $\bar{\sigma}(x, t)$ defined on $]0, \max(K_g)] \times ]0, \max(T_g)]$ and we assume

\[ \sigma_{imp}(x, t) \approx \bar{\sigma}(x, t) \quad \text{when } (x, t) \in ]0, \max(K_g)] \times ]0, \max(T_g)]. \]
Bicubic interpolation for implied volatility

Let $k, q$ be such that $(x, t) \in ]K_g[k], K_g[k+1]] \times ]T_g[q], T_g[q+1]]$. Then

\[ \bar{\sigma}(x, t) = \sum_{i=0}^{3} \sum_{j=0}^{3} C_g(k, q, i, j)\, l^i u^j \]

where

\[ l = \frac{t - T_g[q]}{T_g[q+1] - T_g[q]} \quad \text{and} \quad u = \frac{x - K_g[k]}{K_g[k+1] - K_g[k]}. \]

Fourth array: $C_g$

\[ C_g(k, q, i, j) = C_g[k * (n_t - 1) * 16 + q * 16 + i * 4 + j] \]

with $(k, q, i, j) \in \{0, ..., n_k - 1\} \times \{0, ..., n_t - 1\} \times \{0, ..., 3\} \times \{0, ..., 3\}$, where $n_k$ and $n_t$ are the sizes of $K_g$ and $T_g$.

Approximated local volatility:

\[ \sigma_{loc}^2(x, t) \approx \min(\max(\tilde{\sigma}^2(x, t), 0.0001), 0.5) \]

where

\[ \tilde{\sigma}^2(x, t) = \frac{2\, \frac{\partial \bar{\sigma}}{\partial t} + \frac{\bar{\sigma}}{t} + 2 x\, r_g(t)\, \frac{\partial \bar{\sigma}}{\partial x}}{x^2 \left[ \frac{\partial^2 \bar{\sigma}}{\partial x^2} - d_+ \sqrt{t} \left( \frac{\partial \bar{\sigma}}{\partial x} \right)^2 + \frac{1}{\bar{\sigma}} \left( \frac{1}{x \sqrt{t}} + d_+ \frac{\partial \bar{\sigma}}{\partial x} \right)^2 \right]} \]

with

\[ d_+ = \frac{1}{2}\, \bar{\sigma} \sqrt{t} + \frac{\log(x_0 / x) + \int_0^t r_g(u)\, du}{\bar{\sigma} \sqrt{t}}. \]
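The flattened layout of $C_g$ is easiest to see in code. Below is a minimal C sketch of the evaluation of $\bar{\sigma}(x, t)$ with the index formula of the slide; the function names `find_cell` and `bicubic_sigma` are illustrative, and the handling of points outside the grid (they fall back to the nearest cell) is an assumption, not something stated in the slides.

```c
#include <assert.h>
#include <stddef.h>

/* Index c such that v lies in ]g[c], g[c+1]]; g is assumed sorted.
   Assumption: values outside the grid are mapped to the nearest cell. */
static size_t find_cell(const double *g, size_t n, double v)
{
    size_t c = 0;
    while (c + 2 < n && v > g[c + 1])
        ++c;
    return c;
}

/* Bicubic evaluation of sigma_bar(x, t) from the flattened coefficient
   array Cg, using Cg[k*(nt-1)*16 + q*16 + i*4 + j]. */
double bicubic_sigma(const double *Kg, size_t nk,
                     const double *Tg, size_t nt,
                     const double *Cg, double x, double t)
{
    assert(nk >= 2 && nt >= 2);
    size_t k = find_cell(Kg, nk, x);
    size_t q = find_cell(Tg, nt, t);
    double l = (t - Tg[q]) / (Tg[q + 1] - Tg[q]);
    double u = (x - Kg[k]) / (Kg[k + 1] - Kg[k]);
    const double *c = Cg + (k * (nt - 1) + q) * 16;  /* 16 coefficients per cell */
    double sum = 0.0, li = 1.0;                      /* li holds l^i */
    for (int i = 0; i < 4; ++i) {
        double uj = 1.0;                             /* uj holds u^j */
        for (int j = 0; j < 4; ++j) {
            sum += c[i * 4 + j] * li * uj;
            uj *= u;
        }
        li *= l;
    }
    return sum;
}
```

Storing the 16 coefficients of a cell contiguously means one cell lookup is followed by purely local reads, which is convenient when every simulated path queries the surface at every time step.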
Monte Carlo and OpenACC parallelization
Pricing bullet option with Monte Carlo (MC)

Price for $t \in [0, T)$:

\[ F_t = e^{-\int_t^T r_g(u)\, du}\, E\left( (S_T - K)_+\, 1_{\{I_T \in [P_1, P_2]\}} \,\middle|\, \mathcal{F}_t \right) \quad \text{with} \quad I_t = \sum_{T_i \le t} 1_{\{S_{T_i} < B\}} \]

◮ $K$, $T$ are respectively the contract's strike and maturity
◮ $0 < T_1 < ... < T_M < T$ is a predetermined schedule
◮ The barrier $B$ should be crossed $I_T$ times, with $I_T \in \{P_1, ..., P_2\} \subset \{0, ..., M\}$

$(S_t, I_t)$ is Markov:

\[ F(t, x, j) = e^{-\int_t^T r_g(u)\, du}\, E(X \mid S_t = x, I_t = j), \qquad X = (S_T - K)_+\, 1_{\{I_T \in [P_1, P_2]\}} \]

MC procedure: estimate $F(0, x_0, 0)$, which equals $E(X)$ up to the deterministic discount factor $e^{-\int_0^T r_g(u)\, du}$, using a family $\{X_i\}_{i \le n}$ of i.i.d. copies of $X$.

◮ Strong law of large numbers:

\[ P\left( \lim_{n \to +\infty} \frac{X_1 + X_2 + ... + X_n}{n} = E(X) \right) = 1 \]

◮ Central limit theorem: denoting $\epsilon_n = E(X) - \frac{X_1 + X_2 + ... + X_n}{n}$,

\[ \frac{\sqrt{n}}{\sigma}\, \epsilon_n \to G \sim N(0, 1), \quad \sigma \text{ is the standard deviation of } X \]

◮ There is a 95% chance of having: $|\epsilon_n| \le 1.96\, \frac{\sigma}{\sqrt{n}}$
Iterating for a path-dependent contract ($x_0 = 50$)

Discretization set: $0 = t_0 < t_1 < ... < t_{N_t} = T$, finer than the schedule $0 < T_1 < ... < T_M < T$, with $\delta t = \sqrt{t_{k+1} - t_k}$.

For each $t_k$, $k = 0, ..., N_t - 1$:
1. Random number generation (RNG) of independent Normal variables $G^i$
2. Stock price update: $S^i_{t_{k+1}} = S^i_{t_k} \left( 1 + r_g(t_k)\, (\delta t)^2 + \sigma_{loc}(S^i_{t_k}, t_k)\, \delta t\, G^i \right)$
3. If $t_{k+1} = T_l$ with $l \in \{1, ..., M\}$: $I^i_{T_l} = I^i_{T_{l-1}} + 1_{\{S^i_{T_l} < B\}}$

At $t_{N_t}$: compute the payoff $X^i$, then average.
P. L'Ecuyer CMRG on GPU to generate uniformly distributed random variables

General form of linear RNGs: without loss of generality,

\[ X_n = (A X_{n-1} + C) \bmod m = (A : C) \begin{pmatrix} X_{n-1} \\ 1 \end{pmatrix} \bmod m \qquad (3) \]

Parallel RNG from period splitting of one RNG:
◮ Pierre L'Ecuyer proposed a very efficient RNG (1996), which is a CMRG on 32 bits: a combination of two Multiple Recursive Generators (MRG) with lag = 3 for each MRG.
◮ Very long period $\sim 2^{185}$

\[ x_n = (a_1 x_{n-1} + a_2 x_{n-2} + a_3 x_{n-3}) \bmod m \]

Pre-computations: we launch as many parallel RNGs as the number of paths.

Use: we prefer local variables to store the RNG's state vector.
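Period splitting relies on the linear form of the recurrence: advancing the state by $k$ steps is a multiplication by the $k$-th power of the companion matrix modulo $m$, so each path can be given a starting state far ahead in the same sequence. The C sketch below shows this for a single order-3 MRG with generic coefficients. It is an illustration only: the names `Mat3`, `mrg_step`, `mrg_jump_matrix` and `mrg_jump` are hypothetical, the coefficients are supplied by the caller and assumed reduced to non-negative values, the modulus is assumed to fit in 32 bits so that products fit in 64 bits, and the combination of two MRGs used by the CMRG is not shown.

```c
#include <assert.h>
#include <stdint.h>

typedef struct { uint64_t v[3][3]; } Mat3;

/* One step of x_n = (a1 x_{n-1} + a2 x_{n-2} + a3 x_{n-3}) mod m.
   State layout: s[0] = x_{n-1}, s[1] = x_{n-2}, s[2] = x_{n-3}. */
void mrg_step(uint64_t s[3], const uint64_t a[3], uint64_t m)
{
    assert(m > 1 && m <= (1ULL << 32));
    uint64_t x = 0;
    for (int i = 0; i < 3; ++i)
        x = (x + (a[i] % m) * (s[i] % m)) % m;
    s[2] = s[1];
    s[1] = s[0];
    s[0] = x;
}

/* 3x3 matrix product modulo m (entries are assumed already below m). */
static Mat3 mat_mul(const Mat3 *p, const Mat3 *q, uint64_t m)
{
    Mat3 r;
    for (int i = 0; i < 3; ++i)
        for (int j = 0; j < 3; ++j) {
            uint64_t acc = 0;
            for (int l = 0; l < 3; ++l)
                acc = (acc + p->v[i][l] * q->v[l][j]) % m;
            r.v[i][j] = acc;
        }
    return r;
}

/* k-th power of the companion matrix modulo m, by repeated squaring. */
Mat3 mrg_jump_matrix(const uint64_t a[3], uint64_t m, uint64_t k)
{
    assert(m > 1 && m <= (1ULL << 32));
    Mat3 base = {{{a[0] % m, a[1] % m, a[2] % m}, {1, 0, 0}, {0, 1, 0}}};
    Mat3 result = {{{1, 0, 0}, {0, 1, 0}, {0, 0, 1}}};
    while (k > 0) {
        if (k & 1)
            result = mat_mul(&result, &base, m);
        base = mat_mul(&base, &base, m);
        k >>= 1;
    }
    return result;
}

/* Advance the state by the number of steps encoded in the jump matrix j. */
void mrg_jump(uint64_t s[3], const Mat3 *j, uint64_t m)
{
    uint64_t out[3];
    for (int i = 0; i < 3; ++i) {
        uint64_t acc = 0;
        for (int l = 0; l < 3; ++l)
            acc = (acc + j->v[i][l] * (s[l] % m)) % m;
        out[i] = acc;
    }
    for (int i = 0; i < 3; ++i)
        s[i] = out[i];
}
```

With a jump matrix computed once on the host, path number $i$ could start from the state obtained by applying the jump $i$ times to the seed, and then keep its three state values in local variables while it iterates.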
Some OpenACC pragmas: #pragma acc clause

◮ Global variables: clause = declare device_resident, applied to $r_g$, $K_g$, $T_g$ and $C_g$. These arrays are known by all GPU functions.
◮ Copying to GPU: clause = enter data copyin copies the values to the GPU.
◮ Embarrassingly parallel: clause = parallel loop present (+ reduction(operator:variables)) parallelizes the execution of the loop without data movement. reduction(operator:variables) reduces all the private copies of the variables into one final result using the operator.
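As an illustration of how these pragmas combine, here is a minimal C sketch that averages an array of payoffs with a sum reduction. The function name `average_payoff` is hypothetical, and the `declare device_resident` case for global arrays is not shown. When the file is compiled without OpenACC support the pragmas are ignored and the loop simply runs on the host.

```c
#include <assert.h>

/* Average of n payoffs with an OpenACC sum reduction.
   The data is copied to the device once, the loop then declares it present,
   so no data movement happens at the parallel region itself. */
double average_payoff(const double *payoff, int n)
{
    assert(n > 0);
    double sum = 0.0;

    #pragma acc enter data copyin(payoff[0:n])

    #pragma acc parallel loop present(payoff[0:n]) reduction(+:sum)
    for (int i = 0; i < n; ++i)
        sum += payoff[i];

    #pragma acc exit data delete(payoff[0:n])

    return sum / n;
}
```

Separating the copy (`enter data copyin`) from the computation (`parallel loop present`) matters when several loops reuse the same arrays, since the transfer is then paid only once.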
Monte Carlo and CUDA parallelization