Taking R Mainstream in Production Systems
Misha Lisovich
misha@honestbuildings.com
The Question
Q: Should I use R in production?
A: Yes! (In a couple of years)
The Process
1. Productize
   - Compelling data products
   - Innovation pipeline
2. Ruggedize
   - Toolchain: RStudio, devtools, GitHub, Travis CI, Docker
   - Strong testing
   - Production-ready architecture
3. Assimilate
   - Command line tools
   - Make it into HTTP APIs
   - Make it into Docker containers
Step 1: Productize
Internal Products:
   - Ad-hoc Analyses
   - Internal Dashboards
   - Automated Reports
   - Rapid Prototyping
External Products:
   - End-user data products
   - Backend services
1. Dashboards
   - Data & Job Monitoring
   - Business Intelligence
   - Internal Tools
2. Automated Reports = .Rmd -> html
3. Rapid Prototyping
4. Backend Services
   - Batch Data Processing (ETL)
   - R APIs
5. End-user Products
Step 2: Ruggedize
1. Create reproducible architecture
2. Set up strong testing & CI
3. Separate Production and Dev
4. Set up monitoring & reporting
Case Study: HB Architecture
   - RStudio
   - Containerized Architecture
   - Continuous Integration
   - Multiple Environments
   - Notifications/Monitoring
Data Architecture
[Diagram: Web, Shiny Server, rAPI Server, Elastic, SQL, S3, ETL, data volume, RStudio]

Containers: Docker Compose
elasticsearch:
  image: elasticsearch
shiny-server:
  image: shiny
  ports:
    - "443:443"
  links:
    - elasticsearch
etl:
  image: etl
  volumes:
    - .:/data
etl-data:
  image: etl-data
Environments
Staging: staging-www.dataproduct.com, staging-internal-dashboards.com
Production: www.dataproduct.com, internal-dashboards.com
[Diagram: each environment runs the same stack: Shiny Server, Elastic, SQL, S3, ETL, data volume]
Continuous Integration
commit -> GitHub -> Travis CI -> Success! -> latest-stable tag
Staging: pull latest-stable
Production: pull latest-stable
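The flow above can be sketched in shell. This is a minimal sketch, not the setup used in the talk: it assumes that latest-stable is a git tag (the slide does not say whether it is a git tag or a Docker image tag), that the remote is named origin, and that the CI job calls promote only after the tests pass. The function names promote and deploy and the DRY_RUN switch are hypothetical, introduced here for illustration.

```shell
#!/bin/sh
# Sketch only: assumes "latest-stable" is a git tag and the remote is "origin".
set -eu

# With DRY_RUN=1 (the default) commands are printed instead of executed.
DRY_RUN="${DRY_RUN:-1}"

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# CI side: after a successful build, move the tag to the tested commit.
# Usage: promote <commit>
promote() {
  run git tag -f latest-stable "$1"
  run git push -f origin refs/tags/latest-stable
}

# Server side: staging and production both fetch the tag and check it out.
deploy() {
  run git fetch --tags --force origin
  run git checkout latest-stable
}
```

Calling promote abc123 in dry-run mode prints the two git commands that would move the tag; setting DRY_RUN=0 runs them. Because both environments pull the same tag, production only ever receives a commit that has already passed CI.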
Docker Registry/Rolling Back
Changes Deployed to Prod: Save Versioned Image (ETL + data volume -> Docker Registry)
Danger! Need to Rollback!: Load Older Image (Docker Registry -> ETL + data volume)
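The save and rollback steps above might look roughly like this in shell. It is a sketch under stated assumptions: the registry host registry.example.com, the version numbers, and the function names save_version and rollback_to are hypothetical, and the image name etl is borrowed from the Docker Compose slide. Only standard docker tag, push and pull commands are used.

```shell
#!/bin/sh
# Sketch only: registry host and version numbers are hypothetical.
set -eu

REGISTRY="registry.example.com"   # assumed private registry
IMAGE="etl"                       # image name from the compose slide

# With DRY_RUN=1 (the default) commands are printed instead of executed.
DRY_RUN="${DRY_RUN:-1}"

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# On deploy: save the current image under an explicit version.
# Usage: save_version <version>
save_version() {
  run docker tag "$IMAGE:latest" "$REGISTRY/$IMAGE:$1"
  run docker push "$REGISTRY/$IMAGE:$1"
}

# On rollback: load an older version and make it the current image again.
# Usage: rollback_to <version>
rollback_to() {
  run docker pull "$REGISTRY/$IMAGE:$1"
  run docker tag "$REGISTRY/$IMAGE:$1" "$IMAGE:latest"
}
```

For example, save_version 1.2.0 would run at deploy time and rollback_to 1.1.0 when a release misbehaves. The running container still has to be recreated afterwards (for instance with docker-compose up -d) before the older image takes effect.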
Step 3: Assimilate! (i.e., be kind to your devs)
Assimilate (contd)
   - HTTP APIs: OpenCPU, rapier
   - Docker containers: Rocker
   - Command line tools: Rscript, littler, docopt
Thank you! misha@honestbuildings.com