Renjin's JIT: Thinking about R as a Query Language

Directions in Statistical Computing 2014
Alexander Bertram, BeDataDriven

Quick Intro: Renjin
● R-language interpreter written in Java; uses the GNU R core packages (base, stats, etc.) as-is
● Goals: completeness first, performance next
● C/Fortran: supported through a translator and an emulation layer
● Can run roughly 50% of CRAN packages (see packages.renjin.org)
● Active, diverse user group

R as a “Query Language”
How can R be as fast as Fortran or C++?
How can R be more like SQL?
– The analyst describes the what
– The query planner determines the how
● Implicit parallelism
● Target diverse architectures (in-memory, single node, clusters)

Is R dynamic?
Argument: not where/when performance matters

“But R is too dynamic!”

    airlines <- read.bigtable("airlines")
    print(nrow(airlines))   # ~240m

    fit.exp <- function(x, max.iter = 10) {
      rate <- 1 / mean(x)
      repeat {
        loglik <- sum(-dexp(r = rate, x = x, log = T))
        if (goodEnough(loglik)) break
        rate <- update(rate)
      }
    }

Callouts on the code:
● dexp(r = rate, ...): complicated argument matching
● sum() is a group generic; it dispatches based on its argument
● Is the break() function redefined?

    airlines <- read.bigtable("airlines")
    delay <- airlines$delay[airlines$delay > 30]

    dexp <- function(x, rate = 1, log = FALSE) {
      mean <- 1 / rate
      d <- exp(-x / mean) / mean
      if (log) return(log(d))
      d
    }

    fit.exp <- function(x, max.iter = 10) {
      rate <- 1 / mean(x)
      repeat {
        loglik <- sum(-dexp(r = rate, x, log = T))
        if (loglik > epsilon) break
        rate <- update(rate)
      }
    }

    rate <- fit.exp(delay)

Real-world example: Distance Correlation
[see the energy package]


Optimizations: Views

    x <- dist(x)
    y <- dist(y)
    x <- as.matrix(x)
    y <- as.matrix(y)
    # GNU R:  memory allocated grows as x^2 + y^2 (both full matrices are materialized)
    # Renjin: ~ 0 (the results are views)

DistanceMatrix

    public class DistanceMatrix extends DoubleVector {
      // The underlying points; the n*n matrix itself is never stored.
      private Vector vector;

      public double getElementAsDouble(int index) {
        // Column-major layout: recover (row, col) from the flat index.
        int size = vector.length();
        int row = index % size;
        int col = index / size;
        if (row == col) {
          return 0;
        } else {
          double x = vector.getElementAsDouble(row);
          double y = vector.getElementAsDouble(col);
          return Math.abs(x - y);
        }
      }

      public int length() {
        return vector.length() * vector.length();
      }
    }
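
The view idea on this slide can be tried without Renjin on the classpath. The following is a minimal, self-contained sketch: `SketchVector`, `ArrayBackedVector` and `DistanceView` are hypothetical stand-ins invented for illustration, not Renjin's `DoubleVector` API. It only shows that an n*n distance matrix can be served from n stored values.

```java
/** Hypothetical minimal vector interface for this sketch; not Renjin's API. */
interface SketchVector {
    int length();
    double get(int index);
}

/** A vector backed by an ordinary array. */
final class ArrayBackedVector implements SketchVector {
    private final double[] values;

    ArrayBackedVector(double... values) {
        this.values = values;
    }

    public int length() {
        return values.length;
    }

    public double get(int index) {
        return values[index];
    }
}

/** An n*n distance matrix in column-major order; each element is computed on demand. */
final class DistanceView implements SketchVector {
    private final SketchVector source;

    DistanceView(SketchVector source) {
        this.source = source;
    }

    public int length() {
        return source.length() * source.length();
    }

    public double get(int index) {
        int size = source.length();
        int row = index % size;
        int col = index / size;
        return Math.abs(source.get(row) - source.get(col));
    }
}

final class DistanceViewDemo {
    public static void main(String[] args) {
        SketchVector points = new ArrayBackedVector(1.0, 4.0, 6.0);
        SketchVector distances = new DistanceView(points);
        // Nine logical elements, but only the three source values are stored.
        System.out.println(distances.length());
        // Flat index 5 is row 2, column 1: |6.0 - 4.0|.
        System.out.println(distances.get(5));
    }
}
```

The cost moves from allocation to access: every read recomputes its element, which pays off when the matrix is large and read only a few times.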

Deferred Evaluation
● Defer computation of pure functions when inputs exceed some threshold:

    x <- (1:100) + 4    # x is computed
    y <- (1:1e6) + 4    # no work done
                        # y is a view
    z <- y - mean(y)
    z <- dnorm(z)
    print(z)            # triggers evaluation
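
A rough sketch of the threshold rule, under stated assumptions: the cut-off of 1,000 elements and all type names here (`LazyVec`, `RangeVec`, `MaterializedVec`, `MappedVec`, `DeferredSketch`) are hypothetical and chosen for illustration; the slides do not say what threshold Renjin uses or how its views are structured. Small inputs are computed immediately, large inputs become a view that does no work until it is forced.

```java
import java.util.function.DoubleUnaryOperator;

/** Hypothetical vector abstraction for this sketch; not Renjin's API. */
interface LazyVec {
    int length();
    double get(int index);
}

/** The integer sequence from..to, stored as two numbers rather than an array. */
final class RangeVec implements LazyVec {
    private final int from;
    private final int to;

    RangeVec(int from, int to) {
        this.from = from;
        this.to = to;
    }

    public int length() {
        return to - from + 1;
    }

    public double get(int index) {
        return from + index;
    }
}

/** A fully computed vector. */
final class MaterializedVec implements LazyVec {
    private final double[] values;

    MaterializedVec(double[] values) {
        this.values = values;
    }

    public int length() {
        return values.length;
    }

    public double get(int index) {
        return values[index];
    }
}

/** A pending element-wise operation over another vector. */
final class MappedVec implements LazyVec {
    private final LazyVec source;
    private final DoubleUnaryOperator op;

    MappedVec(LazyVec source, DoubleUnaryOperator op) {
        this.source = source;
        this.op = op;
    }

    public int length() {
        return source.length();
    }

    public double get(int index) {
        return op.applyAsDouble(source.get(index));
    }
}

final class DeferredSketch {
    /** Assumed cut-off; the slide only says "some threshold". */
    static final int THRESHOLD = 1_000;

    /** Applies a pure element-wise function eagerly or lazily, depending on input size. */
    static LazyVec map(LazyVec input, DoubleUnaryOperator op) {
        LazyVec view = new MappedVec(input, op);
        return input.length() <= THRESHOLD ? new MaterializedVec(force(view)) : view;
    }

    /** Triggers evaluation, as print(z) does on the slide. */
    static double[] force(LazyVec vec) {
        double[] out = new double[vec.length()];
        for (int i = 0; i < out.length; i++) {
            out[i] = vec.get(i);
        }
        return out;
    }

    public static void main(String[] args) {
        LazyVec x = map(new RangeVec(1, 100), v -> v + 4);       // computed now
        LazyVec y = map(new RangeVec(1, 1_000_000), v -> v + 4); // no work done yet
        System.out.println(x.getClass().getSimpleName());
        System.out.println(y.getClass().getSimpleName());
        System.out.println(force(y).length);
    }
}
```

Deferral is only safe because the functions are pure: evaluating later, or never, cannot change the observable result.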


Query Planner
● Once evaluation is triggered, we have a broader view of the calculation to be completed
● The computation graph is essentially a pure function
● We can reorder operations and easily see which branches can be evaluated independently, in parallel
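
One way to picture independent branches being evaluated in parallel is a small graph of pure nodes scheduled with the JDK's `CompletableFuture`. This is a sketch only: `GraphNode`, `LeafNode`, `CombineNode` and `PlannerSketch` are hypothetical names, and nothing here is claimed to match Renjin's actual planner.

```java
import java.util.concurrent.CompletableFuture;
import java.util.function.DoubleBinaryOperator;
import java.util.function.Supplier;

/** Hypothetical computation-graph node; every node is a pure computation. */
abstract class GraphNode {
    abstract CompletableFuture<Double> evaluateAsync();
}

/** A branch with no inputs from other nodes, for example a reduction over one vector. */
final class LeafNode extends GraphNode {
    private final Supplier<Double> computation;

    LeafNode(Supplier<Double> computation) {
        this.computation = computation;
    }

    @Override
    CompletableFuture<Double> evaluateAsync() {
        // Runs on the common fork-join pool, independently of sibling branches.
        return CompletableFuture.supplyAsync(computation);
    }
}

/** Combines two branches; purity means they can be started in any order. */
final class CombineNode extends GraphNode {
    private final GraphNode left;
    private final GraphNode right;
    private final DoubleBinaryOperator op;

    CombineNode(GraphNode left, GraphNode right, DoubleBinaryOperator op) {
        this.left = left;
        this.right = right;
        this.op = op;
    }

    @Override
    CompletableFuture<Double> evaluateAsync() {
        CompletableFuture<Double> l = left.evaluateAsync();
        CompletableFuture<Double> r = right.evaluateAsync();
        return l.thenCombine(r, (a, b) -> op.applyAsDouble(a, b));
    }
}

final class PlannerSketch {
    static double sumOfSquares(double[] values) {
        double sum = 0;
        for (double v : values) {
            sum += v * v;
        }
        return sum;
    }

    public static void main(String[] args) {
        double[] x = {1, 2, 3};
        double[] y = {4, 5};
        // The two reductions share no inputs, so they can run concurrently.
        GraphNode graph = new CombineNode(
                new LeafNode(() -> sumOfSquares(x)),
                new LeafNode(() -> sumOfSquares(y)),
                Double::sum);
        System.out.println(graph.evaluateAsync().join());
    }
}
```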


Loop Fusion

    mean(op1(op2(op3(x))))

transformed to...

    double sum = 0;
    for (int i = 0; i < 1000; i++) {
      sum += op1(op2(op3(x[i])));
    }
    double mean = sum / 1000;
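
To make the transformation concrete, here is a small runnable comparison, assuming each operation is an element-wise pure function. `LoopFusionSketch` and its method names are hypothetical. The unfused version allocates one intermediate array per operation, as a naive interpreter would; the fused version makes a single pass with no intermediates.

```java
import java.util.function.DoubleUnaryOperator;

final class LoopFusionSketch {
    /** Unfused: every operation produces a full intermediate array. Ops are applied in the order given. */
    static double meanUnfused(double[] x, DoubleUnaryOperator... ops) {
        double[] current = x;
        for (DoubleUnaryOperator op : ops) {
            double[] next = new double[current.length];
            for (int i = 0; i < current.length; i++) {
                next[i] = op.applyAsDouble(current[i]);
            }
            current = next;
        }
        double sum = 0;
        for (double v : current) {
            sum += v;
        }
        return sum / current.length;
    }

    /** Fused: one loop, all operations applied per element, no intermediate arrays. */
    static double meanFused(double[] x, DoubleUnaryOperator... ops) {
        double sum = 0;
        for (double element : x) {
            double value = element;
            for (DoubleUnaryOperator op : ops) {
                value = op.applyAsDouble(value);
            }
            sum += value;
        }
        return sum / x.length;
    }

    public static void main(String[] args) {
        double[] x = {1, 2, 3, 4};
        // mean(op1(op2(op3(x)))) with op3 = +1, op2 = *2, op1 = square (innermost first).
        System.out.println(meanUnfused(x, v -> v + 1, v -> v * 2, v -> v * v));
        System.out.println(meanFused(x, v -> v + 1, v -> v * 2, v -> v * v));
    }
}
```

Both methods return the same value; the difference is that the fused loop touches each input element once and needs no memory beyond the accumulator.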

Beyond Bytecode
[Diagram labels]
● JVM Byte Code → Native Machine Code
● SQL Query
● OpenCL

Results


Loops!

    m <- 4
    for (i in 1:m) {
      x = exp(tanh(a^2 * (b^2 + i/m)))
      r[i%%10+1] = r[i%%10+1] + sum(x)
    }

Kaboom! (thanks Radford!)

Loops!
● R gives you the flexibility to mix imperative and functional approaches
● In many dynamic languages (JS, Ruby), sophisticated runtime analysis is required to identify and compile hotspots in the code
● In R, they're pretty easy to spot:

    x <- 1:1e6
    for (i in seq_along(x)) { ... }

    for (i in 1:m) {
      x = exp(tanh(a^2 * (b^2 + i/m)))
      r[i%%10+1] = r[i%%10+1] + sum(x)
    }

    BB1:
      τ₃ ← (: 1.0d m₀)
      Λ0₁ ← 0
      τ₂ ← length(τ₃)

    BB2: [L0]
      r₁ ← Φ(r₀, r₂)
      Λ0₂ ← Φ(Λ0₁, Λ0₃)
      i₁ ← Φ(i₀, i₂)
      x₁ ← Φ(x₀, x₂)
      if Λ0₂ >= τ₂ => TRUE:L3, FALSE:L1, NA:ERROR

    BB3: [L1]
      i₂ ← τ₃[Λ0₂]
      τ₄ ← (^ a₀ 2.0d)
      τ₅ ← (^ b₀ 2.0d)
      τ₆ ← (/ i₂ m₀)
      τ₇ ← (+ τ₅ τ₆)
      τ₈ ← (* τ₄ τ₇)
      τ₉ ← (tanh τ₈)
      x₂ ← (exp τ₉)
      τ₁₀ ← (%% i₂ 10.0d)
      τ₁₁ ← (+ τ₁₀ 1.0d)
      τ₁₂ ← ([ r₁ τ₁₁)
      τ₁₃ ← (sum x₂)
      τ₁₄ ← (%% i₂ 10.0d)
      τ₁₅ ← (+ τ₁₄ 1.0d)
      τ₁₆ ← (%% i₂ 10.0d)
      τ₁₇ ← (+ τ₁₆ 1.0d)
      r₂ ← ([<- r₁ τ₁₇)

    BB4: [L2]
      Λ0₃ ← increment counter Λ0₂
      goto L0

    BB5: [L3]
      return NULL

Compared to other dynamic languages?
● Argument: speculative specialization works very well for long-running code, but is unnecessary for most statistical code with many loops:
  – Simulations
  – Iterative algorithms
  – ?
● Needs to be tested...

packages.renjin.org

Developing a CI + benchmarking system for testing optimizations

More Information
● http://www.renjin.org
● http://packages.renjin.org
● http://docs.renjin.org/en/latest/
