
Taking R Mainstream in Production Systems
Misha Lisovich


  1. Taking R Mainstream in Production Systems Misha Lisovich misha@honestbuildings.com

  2. The Question
     Q: Should I use R in production?
     A: Yes! (In a couple of years)

  3. The Process
     1. Productize
        - Compelling data products
        - Innovation pipeline
     2. Ruggedize
        - Toolchain: RStudio, devtools, GitHub, Travis CI, Docker
        - Strong testing
        - Production-ready architecture
     3. Assimilate
        - Command line tools
        - Make it into HTTP APIs
        - Make it into Docker containers

  4. Step 1: Productize
     Internal Products:
     - Ad-hoc analyses
     - Internal dashboards
     - Automated reports
     - Rapid prototyping
     External Products:
     - End-user data products
     - Backend services

  5. 1. Dashboards
     - Data & Job Monitoring
     - Business Intelligence
     - Internal Tools

  6. 2. Automated Reports = .Rmd -> html

  7. 3. Rapid Prototyping

  8. 4. Backend Services
     - Batch Data Processing (ETL)
     - R APIs

  9. 5. End-user Products

  10. Step 2: Ruggedize
      1. Create reproducible architecture
      2. Set up strong testing & CI
      3. Separate Production and Dev
      4. Set up monitoring & reporting

  11. Case Study: HB Architecture
      - RStudio
      - Containerized Architecture
      - Continuous Integration
      - Multiple Environments
      - Notifications/Monitoring

  12. Data Architecture
      Containers (diagram labels): Web, Shiny Server, Elastic, rAPI Server, SQL, S3, ETL, ETL data, Data Server, RStudio
      Docker Compose:
        elasticsearch:
          image: elasticsearch
        shiny-server:
          image: shiny
          ports:
            - "443:443"
          links:
            - elasticsearch
        etl:
          image: etl
          volumes:
            - .:/data
        etl-data:
          image: etl-data

  13. Environments
      - Staging: staging-www.dataproduct.com, staging-internal-dashboards.com
      - Production: www.dataproduct.com, internal-dashboards.com
      Each environment (diagram): Shiny Server, Elastic, SQL, S3, ETL data volume

  14. Continuous Integration
      commit -> GitHub -> Travis CI -> Success! -> latest-stable tag
      - Staging: pull latest-stable
      - Production: pull latest-stable
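The flow on this slide can be illustrated with a small toy model. This is only a sketch under assumptions: `Repo`, `run_ci`, and `deploy` are hypothetical names, not part of GitHub or Travis CI, and the rule "move `latest-stable` only on a green build" is one reading of the diagram.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Repo:
    """Toy stand-in for a Git remote: pushed commits plus movable tags."""
    commits: List[str] = field(default_factory=list)
    tags: Dict[str, str] = field(default_factory=dict)

    def push(self, sha: str) -> None:
        self.commits.append(sha)


def run_ci(repo: Repo, sha: str, tests_pass: bool) -> bool:
    """Move the latest-stable tag to `sha` only when the build is green."""
    if tests_pass:
        repo.tags["latest-stable"] = sha
    return tests_pass


def deploy(repo: Repo) -> Optional[str]:
    """Return the commit an environment gets when it pulls latest-stable."""
    return repo.tags.get("latest-stable")


repo = Repo()
repo.push("a1")
run_ci(repo, "a1", tests_pass=True)    # green build: tag moves to a1
repo.push("b2")
run_ci(repo, "b2", tests_pass=False)   # red build: tag stays on a1
staging = deploy(repo)
production = deploy(repo)
```

Because both environments pull the same tag, a failing commit never reaches either of them in this model.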

  15. Docker Registry/Rolling Back
      - Changes deployed to prod: save versioned image (ETL data volume -> Docker Registry)
      - Danger! Need to roll back! Load older image (Docker Registry -> ETL data volume)
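The save-then-roll-back idea can be sketched with an in-memory toy. `ImageRegistry` is a hypothetical class written for illustration; it does not talk to a real Docker registry, and the tags and payloads are made up.

```python
class ImageRegistry:
    """Toy in-memory stand-in for a registry of versioned images."""

    def __init__(self):
        self._versions = []  # (tag, payload) pairs in push order

    def save(self, tag, payload):
        """Store a new version; tags are treated as immutable here."""
        if any(t == tag for t, _ in self._versions):
            raise ValueError(f"tag {tag!r} already exists")
        self._versions.append((tag, payload))

    def load(self, tag):
        """Return the payload saved under `tag`."""
        for t, payload in self._versions:
            if t == tag:
                return payload
        raise KeyError(tag)

    def previous(self, tag):
        """Return the tag that was saved immediately before `tag`."""
        tags = [t for t, _ in self._versions]
        i = tags.index(tag)
        if i == 0:
            raise LookupError("no older image to roll back to")
        return tags[i - 1]


registry = ImageRegistry()
registry.save("v1", {"rows": 100})   # known-good data image
registry.save("v2", {"rows": 0})     # bad deploy
rollback_tag = registry.previous("v2")
restored = registry.load(rollback_tag)
```

The point of versioning every deployed image is that rolling back becomes a lookup rather than a rebuild.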

  16. Step 3: Assimilate! (i.e., be kind to your devs)

  17. Assimilate (contd)
      - HTTP APIs
        - OpenCPU, rapier
      - Docker containers
        - Rocker
      - Command line tools
        - Rscript, littler, docopt
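One way a non-R developer might consume an R command line tool is to build an `Rscript` invocation from another language. The sketch below is in Python and makes several assumptions: `report.R`, its `--input-file` and `--top-n` options, and the helper `rscript_command` are all hypothetical, and the R script is presumed to parse its own arguments (for example with docopt).

```python
import shlex


def rscript_command(script, **options):
    """Build an argv list of the form: Rscript <script> --key value ..."""
    argv = ["Rscript", script]
    for key, value in options.items():
        # Python keyword names use underscores; CLI flags usually use dashes.
        argv += [f"--{key.replace('_', '-')}", str(value)]
    return argv


cmd = rscript_command("report.R", input_file="sales.csv", top_n=10)
command_line = shlex.join(cmd)  # shell-quoted form, useful for logging
```

On a machine with R installed, the list in `cmd` could be handed to `subprocess.run(cmd, check=True)`; nothing is executed in the sketch itself.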

  18. Thank you! misha@honestbuildings.com
