S91030 - Hybrid Machine Learning with Kubeflow Pipelines and RAPIDS (Sina Chavoshi)
Cloud AI Strategy: the right approach for the right problem (Building Blocks, Platform, Solutions)
Building Blocks: Sight, Language, Conversation
Solutions / Contact Center: Google Cloud Contact Center AI. (Architecture diagram: the customer reaches a Virtual Agent through the contact center provider's phone interface; Agent Assist supports human agents with chat suggestions backed by a Knowledge Base of PDF/HTML documents; requests are completed by backend fulfillment.)
Cloud AI Platform: Data pipeline (BigQuery, Cloud Dataprep, Cloud Dataflow, Cloud Dataproc); Model development (Cloud ML Engine, Kubeflow, Jupyter Notebooks, ASL); Model deployment and management (Cloud ML Engine, Kubernetes Engine); supported by services, tools, and community.
Building & deploying real-life ML applications is hard and costly because of the lack of tooling that covers end-to-end ML development & deployment.
In addition to the actual ML code...
You have to worry about so much more: configuration, data collection, data verification, feature extraction, analysis tools, process management tools, machine resource management, serving infrastructure, and monitoring; the ML code itself is only a small piece. Source: Sculley et al., "Hidden Technical Debt in Machine Learning Systems".
AI problems today (and their solutions)
01 Deployment: brittle, opinionated infrastructure that is hard to productionize and that breaks between cloud and on-prem.
02 Talent: machine learning expertise is scarce; the answer is reusable pipelines.
03 Collaboration: it is difficult to find and leverage existing solutions.
01: Kubeflow (ML microservices): scalable ML services on Kubernetes.
• Easy to get started: out-of-the-box support for top frameworks (PyTorch, Caffe, TensorFlow, XGBoost); Kubernetes manages dependencies and resources.
• Swappable & scalable: library of ML services, GPU support, massive scale.
• Meet customers where they are: GCP, on-prem, on-prem with Cisco.
RAPIDS Product Overview
THE BIG PROBLEM IN DATA SCIENCE. (Workflow diagram: manage data, training, evaluate, deploy; all data flows through a data store, ETL, structured data preparation, model training, visualization, and scoring.) Slow training times for data scientists.
RAPIDS: open GPU data science. Software stack: Python APIs for data preparation (cuDF), model training (cuML), and graph analytics (cuGraph), alongside the Python deep learning frameworks; underneath sit Dask/Spark integration, cuDNN, CUDA, and Apache Arrow on GPU memory.
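In practice this stack is driven from ordinary Python. Here is a minimal sketch, assuming a working RAPIDS installation, of loading data with cuDF and training a cuML model on the GPU; the file name and column names are hypothetical.

```python
# Minimal RAPIDS sketch: cuDF for data preparation, cuML for model training,
# all in GPU memory. The file name and column names are hypothetical.
import cudf
from cuml.linear_model import LinearRegression

# Data preparation (cuDF): read a CSV straight into GPU memory (Apache Arrow layout).
df = cudf.read_csv('transactions.csv')      # hypothetical input file
df = df.dropna()
X = df[['feature_a', 'feature_b']]          # hypothetical feature columns
y = df['target']                            # hypothetical label column

# Model training (cuML): scikit-learn-style estimators that execute on the GPU.
model = LinearRegression()
model.fit(X, y)
predictions = model.predict(X)
print(predictions[:5])
```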
BENCHMARKS: cuIO/cuDF (load and data preparation) and cuML XGBoost, end to end. (Chart: time in seconds, shorter is better, broken into load and data preparation, data conversion, and XGBoost.) Workload: 200 GB CSV dataset; data preparation includes joins and variable transformations. CPU cluster configuration: Apache Spark on nodes with 61 GiB of memory and 8 vCPUs (64-bit platform). DGX cluster configuration: 5x DGX-1 on an InfiniBand network.
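For reference, the GPU side of the XGBoost benchmark above corresponds to training with XGBoost's GPU histogram tree method. A minimal sketch with synthetic data; the values below are illustrative and are not the benchmark's configuration.

```python
# Sketch of GPU-accelerated XGBoost training of the kind benchmarked above.
# Synthetic data and hyperparameters are illustrative only.
import numpy as np
import xgboost as xgb

# Synthetic training set standing in for the prepared 200 GB benchmark dataset.
X = np.random.rand(100_000, 50).astype(np.float32)
y = (np.random.rand(100_000) > 0.5).astype(np.float32)
dtrain = xgb.DMatrix(X, label=y)

params = {
    'tree_method': 'gpu_hist',       # GPU histogram algorithm
    'objective': 'binary:logistic',
    'max_depth': 8,
}
booster = xgb.train(params, dtrain, num_boost_round=100)
print(booster.eval(dtrain))
```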
AI Hub & Pipelines: fast & simple adoption of AI. The flywheel of AI adoption:
1. Search & discover: find best-of-breed solutions on the AI Hub which leverage Cloud AI solutions.
2. Deploy: quick one-click implementation of ML pipelines onto Google Cloud Platform.
3. Customize: experiment with and adjust out-of-the-box pipelines to custom use cases.
4. Run in production: deploy customized pipelines in production.
5. Publish: upload & share pipelines within your org or publicly for the best network effect.
02: Reusable Pipelines
Enable developers to build custom ML applications by easily "stitching" and connecting various components.
• Reuse instead of reimplement or reinvent (a sketch follows below)
• Discover, learn, and replicate successful pipelines
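As one illustration of reuse, the Kubeflow Pipelines SDK can load a component that someone else has already published instead of re-implementing it. This is a minimal sketch: the inline component spec is a hypothetical trivial example, and in practice a shared spec would typically be loaded with load_component_from_url() or discovered on the AI Hub.

```python
# Sketch of reusing a shared component with the Kubeflow Pipelines SDK.
# The inline component spec is a hypothetical stand-in for a published one.
from kfp import components

echo_op = components.load_component_from_text("""
name: Echo message
description: Trivial reusable component that prints its input.
inputs:
- {name: message, type: String}
implementation:
  container:
    image: alpine:3.9
    command: [echo, {inputValue: message}]
""")

# The loaded component becomes a factory function that can be "stitched" into
# any pipeline like a locally defined step, e.g. echo_op(message='hello').
```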
What constitutes a Kubeflow Pipeline
● Containerized implementations of ML tasks
○ Containers provide portability, repeatability, and encapsulation
○ A task can be single node or *distributed*
○ A containerized task can invoke other services
● Specification of the sequence of steps
○ Specified via the Python SDK (a minimal sketch follows below)
● Input parameters
○ A "Job" = the pipeline invoked with specific parameters
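A minimal sketch of those pieces in the Python SDK: two hypothetical containerized tasks, a dependency that defines the sequence of steps, and an input parameter whose concrete value at submission time defines a job. The container images are placeholders, and the ContainerOp-style API shown matches the kfp SDK of this talk's era; newer releases may differ.

```python
# Sketch of a Kubeflow Pipeline defined with the Python SDK (kfp).
# Container images and paths are hypothetical placeholders.
import kfp
from kfp import dsl


@dsl.pipeline(name='demo-pipeline',
              description='Containerized preprocess step followed by training.')
def demo_pipeline(data_path: str = 'gs://my-bucket/data.csv'):
    # Containerized task: portable, repeatable, encapsulated.
    preprocess = dsl.ContainerOp(
        name='preprocess',
        image='gcr.io/my-project/preprocess:latest',      # hypothetical image
        arguments=['--data-path', data_path],
        file_outputs={'prepared': '/tmp/prepared_path.txt'},
    )

    # Second task; consuming the first task's output defines the sequence of steps.
    train = dsl.ContainerOp(
        name='train',
        image='gcr.io/my-project/train-xgboost:latest',   # hypothetical image
        arguments=['--train-data', preprocess.outputs['prepared']],
    )
    train.set_gpu_limit(1)   # ask Kubernetes to schedule this step on a GPU


if __name__ == '__main__':
    # Compile to an archive that can be uploaded through the Pipelines UI.
    kfp.compiler.Compiler().compile(demo_pipeline, 'demo_pipeline.tar.gz')
```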
03: AI Hub at a glance
1. All AI content in one place: quick discovery of plug & play AI pipelines and other content built by teams across Google and by partners and customers.
2. Fast & simple implementation of AI on GCP: one-click deployment of AI pipelines via Kubeflow on GCP as the go-to platform for AI, plus hybrid and on-premises.
3. Enterprise-grade internal & external sharing: foster reuse by sharing deployable AI pipelines and other content privately within organizations and publicly.
Mission: the one place for everything AI, from experimentation to production.
Public and private AI Hub
• By Google: unique AI assets from Google (AutoML, TPUs, Cloud AI Platform, etc.)
• By partners and customers: public content created, shared & monetized by anyone
• Private content: content shared securely within and with other organizations
Kubeflow Pipelines enable workflow orchestration, rapid reliable experimentation, and the ability to share, re-use & compose pipelines.
Demo
Visual depiction of pipeline topology
View all current and historical runs, grouped as “Experiments”
Rich visualizations of metrics
Clone an existing pipeline
Access to all config params, inputs and outputs for each run
Update parameters and submit
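The same update-and-submit step can also be done programmatically. A minimal sketch with the kfp SDK client, assuming a reachable Kubeflow Pipelines endpoint; the host URL and argument values are hypothetical, and demo_pipeline refers to the pipeline function sketched earlier.

```python
# Sketch of submitting a run with updated parameters via the kfp SDK client
# instead of the UI. Host URL and argument values are hypothetical.
import kfp

client = kfp.Client(host='https://my-kfp-endpoint.example.com')  # hypothetical endpoint

# Each submission with a concrete set of parameters creates a new run ("job").
client.create_run_from_pipeline_func(
    demo_pipeline,    # the pipeline function from the earlier sketch (assumed in scope)
    arguments={'data_path': 'gs://my-bucket/new_data.csv'},
    experiment_name='rapids-demo',
)
```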
Easy comparison of Runs
That’s a wrap.