S91030 - Hybrid Machine Learning with the Kubeflow Pipelines and RAPIDS


  1. S91030 - Hybrid Machine Learning with the Kubeflow Pipelines and RAPIDS
     Sina Chavoshi

  2. Cloud AI Strategy: The right approach for the right problem (Building blocks, Platform, Solutions)

  3. Cloud AI Strategy: The right approach for the right problem (Building blocks, Platform, Solutions)

  4. Building Blocks: Sight, Language, Conversation

  5. Cloud AI Strategy: The right approach for the right problem (Building blocks, Platform, Solutions)

  6. Solutions / Contact Center: Google Cloud Contact Center AI
     (Architecture diagram; labels: Customer, Phone, Chat, Contact Center Provider, Contact Center Interface, Virtual Agent, Agent Assist, Agent, Knowledge Base (PDF/HTML), Backend Fulfillment)

  7. Cloud AI Strategy: The right approach for the right problem (Building blocks, Platform, Solutions)

  8. Cloud AI Platform
     • Data pipeline: BigQuery, Cloud Dataprep, Cloud Dataflow, Cloud Dataproc
     • Model development: Cloud ML Engine
     • Model deployment and management: Kubeflow, Cloud ML Engine, Cloud Kubernetes Engine
     • Services, Tools, Community: ASL, Jupyter Notebooks

  9. Building & deploying real-life ML applications is hard and costly because of a lack of tooling that covers end-to-end ML development & deployment.

  10. In addition to the actual ML... (diagram label: ML Code)

  11. You have to worry about so much more: Configuration, Data Collection, Data Verification, Feature Extraction, ML Code, Machine Resource Management, Analysis Tools, Process Management Tools, Serving Infrastructure, Monitoring.
      Source: Sculley et al.: Hidden Technical Debt in Machine Learning Systems

  12. AI problems today
      Problems
      • 01 Deployment: Brittle, opinionated infrastructure that is hard to productionize and breaks between cloud and on-prem
      • 02 Talent: Machine Learning expertise is scarce
      • 03 Collaboration: Difficult to find, leverage existing solutions
      Solutions
      • 02 Reusable pipelines

  13. 01: Kubeflow ML microservices
      Scalable ML services on Kubernetes
      Easy to get started
      • Out-of-box support for top frameworks: PyTorch, Caffe, TensorFlow and XGBoost
      • Kubernetes manages dependencies, resources
      Swappable & scalable
      • Library of ML services
      • GPU support
      • Massive scale
      Meet customers where they are
      • GCP
      • On-prem with Cisco
      (Diagram: Training and Predict services, shown both in Cloud and On-prem)

  14. RAPIDS Product Overview

  15. THE BIG PROBLEM IN DATA SCIENCE: Slow Training Times for Data Scientists
      (Workflow diagram; stages: Manage Data, Training, Evaluate, Deploy; steps: All Data, ETL, Structured Data Store, Data Preparation, Model Training, Visualization, Scoring)

  16. RAPIDS: OPEN GPU DATA SCIENCE
      Software Stack (Python)
      • Data Preparation: cuDF
      • Model Training: cuML
      • Graph Analytics: cuGRAPH
      (Stack diagram labels: PYTHON, DEEP LEARNING FRAMEWORKS, RAPIDS, DASK/SPARK, CUDF, CUML, CUGRAPH, CUDNN, CUDA, APACHE ARROW on GPU Memory)

  17. BENCHMARKS
      Charts: cuIO/cuDF (Load and Data Preparation); cuML (XGBoost); End-to-End. Time in seconds; shorter is better.
      Legend: cuIO/cuDF (Load and Data Preparation), Data Conversion, XGBoost
      • Benchmark: 200GB CSV dataset; data preparation includes joins, variable transformations.
      • CPU Cluster Configuration: CPU nodes (61 GiB of memory, 8 vCPUs, 64-bit platform), Apache Spark
      • DGX Cluster Configuration: 5x DGX-1 on InfiniBand network

  18. AI Hub & Pipelines: Fast & simple adoption of AI
      The Flywheel of AI Adoption (center label: Network effect)
      1. Search & Discover: Find best-of-breed solutions on the AI Hub which leverage Cloud AI solutions.
      2. Deploy: Quick 1-click implementation of ML pipelines onto Google Cloud Platform.
      3. Customize: Experiment and adjust out-of-the-box pipelines to custom use cases.
      4. Run in production: Deploy customized pipelines in production.
      5. Publish: Upload & share pipelines running best within your org or publicly.

  19. 02: Reusable Pipelines
      Enable developers to build custom ML applications by easily “stitching” and connecting various components.
      • Reuse instead of reimplement or reinvent
      • Discover, learn and replicate successful pipelines

  20. What constitutes a Kubeflow Pipeline
      ● Containerized implementations of ML Tasks
        ○ Containers provide portability, repeatability and encapsulation
        ○ A task can be single node or *distributed*
        ○ A containerized task can invoke other services
      ● Specification of the sequence of steps
        ○ Specified via Python SDK
      ● Input Parameters
        ○ A “Job” = Pipeline invoked w/ specific parameters
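
The three ingredients on slide 20 (containerized tasks, an ordered sequence of steps, and input parameters) can be illustrated with a toy model in plain Python. This is only a sketch under stated assumptions: it does not use the Kubeflow Pipelines SDK, and the names `Task`, `Pipeline`, `Job`, `submit`, `execute` and the `example.invalid/...` image references are all hypothetical, chosen for illustration. Local functions stand in for what would be container entrypoints.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical toy model: none of these names come from the Kubeflow Pipelines SDK.


@dataclass
class Task:
    """Stands in for one containerized ML task."""
    name: str
    image: str  # hypothetical container image reference, never pulled here
    run: Callable[[Dict[str, object]], Dict[str, object]]  # local stand-in for the container entrypoint


@dataclass
class Pipeline:
    """An ordered sequence of tasks plus the input parameters it accepts."""
    name: str
    steps: List[Task]
    defaults: Dict[str, object] = field(default_factory=dict)

    def submit(self, **params: object) -> "Job":
        # A "Job" is the pipeline invoked with specific parameters.
        unknown = set(params) - set(self.defaults)
        if unknown:
            raise ValueError(f"unknown parameters: {sorted(unknown)}")
        return Job(pipeline=self, params={**self.defaults, **params})


@dataclass
class Job:
    pipeline: Pipeline
    params: Dict[str, object]

    def execute(self) -> Dict[str, object]:
        # Run the steps in order; each step reads the shared context
        # and contributes its outputs to it.
        context: Dict[str, object] = dict(self.params)
        for task in self.pipeline.steps:
            context.update(task.run(context))
        return context


def preprocess(ctx: Dict[str, object]) -> Dict[str, object]:
    return {"rows": [x * ctx["scale"] for x in ctx["raw"]]}


def train(ctx: Dict[str, object]) -> Dict[str, object]:
    rows = ctx["rows"]
    return {"model_mean": sum(rows) / len(rows)}


pipeline = Pipeline(
    name="toy-train",
    steps=[
        Task("preprocess", "example.invalid/preprocess:latest", preprocess),
        Task("train", "example.invalid/train:latest", train),
    ],
    defaults={"raw": [1, 2, 3], "scale": 1},
)

# Two jobs from the same pipeline definition, differing only in parameters.
job_a = pipeline.submit(scale=2)
job_b = pipeline.submit(scale=10)
result_a = job_a.execute()
result_b = job_b.execute()
print(result_a["model_mean"], result_b["model_mean"])
```

Keeping the pipeline definition separate from its parameters is what lets one definition be reused across many jobs, which is the property the later demo slides (clone, update parameters, compare runs) rely on.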

  21. 03: AI Hub at a glance
      1. All AI content in one place: Quick discovery of plug & play AI pipelines & other content built by teams across Google and by partners and customers.
      2. Fast & simple implementation of AI on GCP: One-click deployment of AI pipelines via Kubeflow on GCP as the go-to platform for AI + hybrid & on premise.
      3. Enterprise-grade internal & external sharing: Foster reuse by sharing deployable AI pipelines & other content privately within organizations & publicly.

  22. Mission The one place for everything AI, from experimentation to production.

  23. Public and private AI Hub
      Public content
      • By Google: Unique AI assets by Google (AutoML, TPUs, Cloud AI Platform, etc.)
      • By partners: Created, shared & monetized by anyone
      Private content
      • By customers: Content shared securely within and with other organizations

  24. Kubeflow Pipelines enable
      • Workflow orchestration
      • Rapid reliable experimentation
      • Share, re-use & compose

  25. Demo

  26. Visual depiction of pipeline topology

  27. View all current and historical runs, grouped as “Experiments”

  28. Rich visualizations of metrics

  29. Clone an existing pipeline

  30. Access to all config params, inputs and outputs for each run

  31. Update parameters and submit

  32. Easy comparison of Runs

  33. Easy comparison of Runs

  34. That’s a wrap.
