How do I find the bottleneck?
WRITING EFFICIENT R CODE
Colin Gillespie, Jumping Rivers & Newcastle University

Code profiling
The general idea is to:
- Run the code
- Every few milliseconds, record what is currently being executed
Rprof() comes with R and does exactly this
- Tricky to use
- Use profvis instead

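As a rough illustration of the sampling idea, base R's `Rprof()` can be driven by hand. This is a minimal sketch and not part of the original slides: the workload `slow_work()`, the temporary file, and the 10 ms sampling interval are assumptions, chosen so that the profiler has time to collect some samples.

```r
# Hypothetical workload: repeatedly sort large random vectors so the
# profiler has something to sample.
slow_work <- function(n_reps = 20, n = 5e5) {
  total <- 0
  for (i in seq_len(n_reps)) {
    total <- total + sort(runif(n))[1]
  }
  total
}

prof_file <- tempfile(fileext = ".out")

Rprof(prof_file, interval = 0.01)  # start sampling the call stack every 10 ms
result <- slow_work()
Rprof(NULL)                        # stop profiling

# Summarise the samples: time attributed to each function
prof_summary <- summaryRprof(prof_file)
head(prof_summary$by.self)
```

The summary is a plain list of data frames, which is part of why `Rprof()` is awkward to read compared with the interactive view profvis provides.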
IMDB data set
From the ggplot2movies package
    data(movies, package = "ggplot2movies")
    dim(movies)
    58788 24
- Data frame: around 60,000 rows and 24 columns
- Each row corresponds to a particular movie

Braveheart
    braveheart = movies[7288,]
Year  Length  Rating
1995  177     8.3

Example: Braveheart
    # Load data
    data(movies, package = "ggplot2movies")
    braveheart <- movies[7288, ]
    movies <- movies[movies$Action == 1, ]
    plot(movies$year, movies$rating,
         xlab = "Year", ylab = "Rating")
    # local regression line
    model <- loess(rating ~ year,
                   data = movies)
    j <- order(movies$year)
    lines(movies$year[j],
          model$fitted[j],
          col = "forestgreen")
    points(braveheart$year,
           braveheart$rating,
           pch = 21,
           bg = "steelblue")

Profvis
- RStudio has integrated support for profiling with profvis
- Highlight the code you want to profile
- Profile -> Profile Selected lines

Command line
    library("profvis")
    profvis({
      data(movies, package = "ggplot2movies") # Load data
      braveheart <- movies[7288, ]
      movies <- movies[movies$Action == 1, ]
      plot(movies$year, movies$rating, xlab = "Year", ylab = "Rating")
      model <- loess(rating ~ year, data = movies) # loess regression line
      j <- order(movies$year)
      lines(movies$year[j], model$fitted[j], col = "forestgreen", lwd = 2)
      points(braveheart$year, braveheart$rating,
             pch = 21, bg = "steelblue", cex = 3)
    })
Which line do you think will be the slowest?

Let's practice!

Profvis

Monopoly
- 40 squares
- 28 properties (22 streets + 4 stations + 2 utilities)
- Players take turns moving by rolling dice
- Buying properties
- Charging other players
- Sent to jail: three consecutive doubles in a single turn

Monopoly Code
- Around 100 lines of code
- Simplified game
- Reject the capitalist system: no money
- No friends, only 1 player
    simulate_monopoly(no_of_r

Monopoly profvis
How would you optimize this code?

Let's practice!

Monopoly recap

Data frames vs matrices
    # Original
    rolls <- data.frame(d1 = sample(1:6, 3, replace = TRUE),
                        d2 = sample(1:6, 3, replace = TRUE))
    # Updated
    rolls <- matrix(sample(1:6, 6, replace = TRUE), ncol = 2)
- Total Monopoly simulation time: 2 seconds to 0.5 seconds
- Creating a data frame is slower than a matrix
- In the Monopoly simulation, we created 10,000 data frames

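The cost difference can be checked directly with `system.time()`. The sketch below is one assumed way to time it, not the course's own benchmark: the helper names `roll_df()` and `roll_matrix()` are hypothetical, the 10,000 repetitions mirror the count quoted on the slide, and the timings will vary by machine.

```r
# Hypothetical helpers: each returns three rolls of two dice
roll_df <- function() {
  data.frame(d1 = sample(1:6, 3, replace = TRUE),
             d2 = sample(1:6, 3, replace = TRUE))
}
roll_matrix <- function() {
  matrix(sample(1:6, 6, replace = TRUE), ncol = 2)
}

n_sims <- 10000  # same order as the number of data frames in the simulation

time_df <- system.time(for (i in seq_len(n_sims)) roll_df())
time_matrix <- system.time(for (i in seq_len(n_sims)) roll_matrix())

# Compare elapsed times in seconds for the two approaches
c(data_frame = time_df[["elapsed"]], matrix = time_matrix[["elapsed"]])
```

Both helpers return a 3 x 2 object, so downstream code that indexes by row and column should work with either.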
apply vs rowSums
    # Original
    total <- apply(df, 1, sum)
    # Updated
    total <- rowSums(df)
- 0.5 seconds to 0.16 seconds: a 3-fold speed-up

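A quick check that the two calls give the same totals. This is an illustrative sketch: the seed and the example `rolls` matrix are assumptions, not values from the simulation.

```r
set.seed(1)
# Three rolls of two dice, stored as a 3 x 2 matrix (one row per roll)
rolls <- matrix(sample(1:6, 6, replace = TRUE), ncol = 2)

total_apply <- apply(rolls, 1, sum)  # general-purpose: calls sum() once per row
total_rowsums <- rowSums(rolls)      # specialised row sum, no R-level loop

# Same values either way (apply() gives integers here, rowSums() doubles)
all.equal(as.numeric(total_apply), total_rowsums)
```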
& vs &&
    # Original
    is_double[1] & is_double[2] & is_double[3]
    # Updated
    is_double[1] && is_double[2] && is_double[3]
- Limited speed-up
- 0.16 seconds to 0.15 seconds

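The two operators differ in that `&` is vectorised and evaluates both operands, while `&&` works on single values and stops as soon as the result is known. A small sketch of that behaviour, where the helper `noisy_true()` and the example `is_double` vector are made up for illustration:

```r
# Hypothetical helper that counts how often it is evaluated
calls <- 0
noisy_true <- function() {
  calls <<- calls + 1
  TRUE
}

is_double <- c(FALSE, TRUE, TRUE)  # example: the first roll was not a double

# `&` evaluates both sides, so noisy_true() runs
res_vectorised <- is_double[1] & noisy_true()
calls_after_and <- calls

# `&&` short-circuits: the left side is FALSE, so noisy_true() is skipped
res_short <- is_double[1] && noisy_true()
calls_after_andand <- calls
```

Because each operand here is a cheap logical lookup, skipping the later ones saves very little, which fits the small gain reported on the slide.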
Overview
Method                   Time (secs)  Speed-up
Original                 2.00         1.0
Matrix                   0.50         4.0
Matrix + rowSums         0.20         10.0
Matrix + rowSums + &&    0.19         10.5

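The speed-up column is the original time divided by each method's time. A short sketch of that arithmetic, using the times from the overview table (the variable names are illustrative):

```r
# Times in seconds, copied from the overview table
times <- c("Original" = 2.00,
           "Matrix" = 0.50,
           "Matrix + rowSums" = 0.20,
           "Matrix + rowSums + &&" = 0.19)

# Speed-up relative to the original implementation, to one decimal place
speed_up <- round(times[["Original"]] / times, 1)
speed_up
```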
Let's practice!