Enabling GPU-as-a-Service Providers with Red Hat OpenShift
@jeremyeder, Senior Principal Software Engineer, Red Hat
March 2018
JEREMY EDER - RED HAT PERFORMANCE ENGINEERING
Agenda
● OpenShift Cluster Overview
● Infrastructure Abstraction
● High Performance Features
● GPU Overview
Community Powered Innovation
What does an OpenShift Cluster look like?
[Diagram: a routing layer and a service layer sit in front of the nodes. The master provides API/authentication, the data store, the scheduler, and health/scaling. Each node runs containers (C) on RHEL, with persistent storage and a registry alongside. The whole cluster runs on Red Hat Enterprise Linux across physical, virtual, private, public, and hybrid infrastructure.]
Abstract away any infrastructure
● Bare Metal
● RHV
● OpenStack
● VMware
● GCE
● Azure
● AWS
● BYO nodes...
[Diagram: routing layer and service layer over physical, virtual, private, public, and hybrid infrastructure]
One Platform to...
OpenShift is the single platform to run any application:
● Old or new
● Monolithic/Microservice
[Diagram labels: NFV, FSI, Machine Learning, HPC, ISVs, Big Data, Animation]
High Performance RFEs by Vertical
Feature                                   FSI    NFV    ISV    BD/ML  ANIM   HPC
NUMA (cpuset.cpus and cpuset.mems)        Yes    Yes    Yes    Maybe  Maybe  Yes
Device Passthrough (NIC/Disk/GPU etc...)  Yes    Yes    Yes    Maybe  Maybe  Yes
sysctl Support (non-namespaced too)       Yes    Yes    Yes    Yes    Yes    Yes
Separation of control- and data-plane     Yes    Yes    Yes    Yes    Yes    Yes
Node “fitness” (extended health info)     Yes    Yes    Maybe  Maybe  Maybe  Yes
Multi-homed pods                          Yes    Yes    Maybe  Yes    Yes    Yes
Kernel Modules (DKMS-ish)                 Yes    Yes    Maybe  Maybe  Yes    Maybe
Hugepages                                 Yes    Yes    Yes    Yes    Maybe  Maybe
Why do this?
Enable containerization of Infrastructure Software
● Software-defined Storage and Networking
● Packet switching and routing tiers
● Multiple, very different workloads within a single cluster
  ○ Layered schedulers (HPC/grid)
● Many more...
Enable containerization of Red Hat’s products
● Gluster/Container Native Storage
● Ceph
● OpenStack
● radanalytics
● KubeVirt
Upstream First: Kubernetes Working Groups
● Resource Management Working Group
  ○ Features Delivered
    ■ Device Plugins (GPU/Bypass/FPGA)
    ■ CPU Manager (exclusive cores)
    ■ Huge Pages Support
  ○ Extensive Roadmap
● Intel, IBM, Google, NVIDIA, Red Hat, many more...
Upstream First: Kubernetes Working Groups
● Network Plumbing Working Group
  ○ Formalized Dec 2017
● Goal is to implement a pseudo-standard collection of CRDs for multiple networks, owned by sig-network, *out of tree*
● Separate control- and data-plane, Overlapping IPs, Fast Data-plane
● IBM, Intel, Red Hat, Huawei, Cisco, Tigera...at least.
GPU CLUSTER TOPOLOGY
OpenShift Cluster Topology
[Diagram: Control Plane with an LB and three "master and etcd" hosts. Infrastructure with three "registry and router" hosts. Compute Nodes and Storage Tier.]
OpenShift Cluster Topology
● How to enable software to take advantage of “special” hardware
● Create Node Pools
  ○ Mark them as “special”
  ○ Taints/Tolerations
  ○ ExtendedResourceToleration
[Diagram: Compute Nodes...]
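To make the node-pool idea concrete, here is a minimal sketch of a pod that targets a tainted GPU pool. It assumes the pool's nodes carry a taint with key nvidia.com/gpu and effect NoSchedule, plus a label node-pool=gpu. The pod name, label, and image are hypothetical and do not come from the talk. When the ExtendedResourceToleration admission plugin is enabled, a matching toleration is added automatically to pods that request the extended resource, so it is spelled out here only to show what is being tolerated.

```yaml
# Sketch only: assumes GPU nodes are tainted nvidia.com/gpu:NoSchedule
# and carry the hypothetical label node-pool=gpu.
apiVersion: v1
kind: Pod
metadata:
  name: gpu-pool-example              # hypothetical name
spec:
  nodeSelector:
    node-pool: gpu                    # hypothetical label marking the "special" pool
  tolerations:
  - key: nvidia.com/gpu               # must match the taint key on the pool's nodes
    operator: Exists
    effect: NoSchedule
  containers:
  - name: app
    image: registry.example.com/cuda-app:latest   # placeholder image
    resources:
      limits:
        nvidia.com/gpu: 1             # extended resource advertised by a device plugin
```

Pods without the toleration stay off the pool, which keeps general-purpose workloads from occupying the specialized nodes.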
OpenShift Cluster Topology
● How to enable software to take advantage of “special” hardware
● Tune/Configure the OS
  ○ Tuned Profiles
  ○ CPU Isolation
  ○ sysctls
[Diagram: Compute Nodes...]
OpenShift Cluster Topology
In OpenShift, there are three “types” of sysctls

Safe
● Enabled by default
● kernel.shm_rmid_forced
● net.ipv4.ip_local_port_range
● net.ipv4.tcp_syncookies

Unsafe
● Experimental Kubelet Flag
● kernel.sem*
● kernel.shm*
● kernel.msg*
● fs.mqueue.*
● net.*

Node-level
● Can’t set from a pod
● Potentially affects other pods
● Many interesting sysctls
● Use TuneD
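As a rough illustration of the "safe" category, the sketch below sets kernel.shm_rmid_forced for a single pod. It assumes a Kubernetes release that exposes sysctls through the pod-level securityContext.sysctls field; older releases expressed the same request through pod annotations. The pod name and image are hypothetical.

```yaml
# Sketch only: assumes sysctls are set via securityContext.sysctls.
apiVersion: v1
kind: Pod
metadata:
  name: sysctl-example                # hypothetical name
spec:
  securityContext:
    sysctls:
    - name: kernel.shm_rmid_forced    # on the "safe" list, so no kubelet change is needed
      value: "1"
    # An "unsafe" sysctl (for example one matching kernel.msg*) placed here
    # would keep the pod from starting unless the node's kubelet explicitly
    # allows it.
  containers:
  - name: app
    image: registry.example.com/app:latest   # placeholder image
```

Node-level sysctls cannot be expressed this way at all, which is why the slide points to TuneD for them.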
OpenShift Cluster Topology
● How to enable software to take advantage of “special” hardware
● Optimize your workload
  ○ Dedicate CPU cores
  ○ Consume hugepages
[Diagram: Compute Nodes...]
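The two workload optimizations above can be combined in one pod spec. This is a minimal sketch under two assumptions: the node's kubelet runs the CPU Manager with its static policy, and 2Mi hugepages have been pre-allocated on the node. The name, image, and sizes are hypothetical.

```yaml
# Sketch only: assumes CPU Manager static policy and pre-allocated 2Mi hugepages.
apiVersion: v1
kind: Pod
metadata:
  name: pinned-hugepages-example      # hypothetical name
spec:
  containers:
  - name: app
    image: registry.example.com/app:latest   # placeholder image
    resources:
      # requests == limits for every resource puts the pod in the Guaranteed
      # QoS class; with the static policy, a whole-number CPU count is then
      # served from exclusive cores.
      requests:
        cpu: "4"
        memory: 1Gi
        hugepages-2Mi: 512Mi
      limits:
        cpu: "4"
        memory: 1Gi
        hugepages-2Mi: 512Mi
    volumeMounts:
    - name: hugepage
      mountPath: /dev/hugepages       # where the application maps its hugepages
  volumes:
  - name: hugepage
    emptyDir:
      medium: HugePages               # backs the volume with the node's hugepages
```

A fractional CPU request (for example 3500m) would still be Guaranteed but would run in the shared pool rather than on dedicated cores.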
OpenShift Cluster Topology
● How to enable software to take advantage of “special” hardware
● Enable the Hardware
  ○ Install drivers
  ○ Deploy Device Plugin
[Diagram: Compute Nodes...]
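Device plugins are usually rolled out as a DaemonSet so that every node in the pool runs one. The manifest below is a hedged sketch of that pattern, not the deployment used in the talk. It uses the current apps/v1 API, and the names, the node-pool=gpu label, and the image are placeholders. The one fixed detail is the host directory /var/lib/kubelet/device-plugins, where the kubelet expects plugins to register their sockets.

```yaml
# Sketch only: names, label, and image are hypothetical placeholders.
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: gpu-device-plugin
  namespace: kube-system
spec:
  selector:
    matchLabels:
      name: gpu-device-plugin
  template:
    metadata:
      labels:
        name: gpu-device-plugin
    spec:
      nodeSelector:
        node-pool: gpu                # run only on the GPU pool
      tolerations:
      - key: nvidia.com/gpu           # tolerate the pool's taint, if one is set
        operator: Exists
        effect: NoSchedule
      containers:
      - name: device-plugin
        image: registry.example.com/gpu-device-plugin:latest   # placeholder image
        volumeMounts:
        - name: device-plugin
          mountPath: /var/lib/kubelet/device-plugins
      volumes:
      - name: device-plugin
        hostPath:
          path: /var/lib/kubelet/device-plugins   # kubelet's plugin registration directory
```

Once the plugin registers, the node reports the GPUs as an allocatable extended resource that pods can request.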
OpenShift Cluster Topology
● How to enable software to take advantage of “special” hardware
● Consume the Device
  ○ KubeFlow Template deployment
[Diagram: Compute Nodes...]
Kubernetes Deployment for STAC-A2
● CUDA 9
● 8 x NVIDIA Tesla V100 (Volta) GPUs
● HPE Apollo 6500 w/XL270d Gen9
● Red Hat Enterprise Linux 7.4
● Kubernetes 1.8 (setup info)
● nvidia-smi --applications-clocks=877,1380
● All-in-One Kubernetes Installation (hack/local-up-cluster.sh)
● Node labeled
● Containers:
  ○ RHEL7+CUDA9
  ○ RHEL7+CUDA9+DEVICE-PLUGIN
  ○ RHEL7+CUDA9+STAC-A2
● https://rhelblog.redhat.com/2017/11/21/red-hat-and-partners-deliver-new-performance-records-on-prominent-risk-analytics-benchmark/
● https://news.developer.nvidia.com/a-new-stac-a2-record/
Kubernetes Deployment for STAC-A2
[Diagram: kubectl create submits the Benchmark (pod), which declares resources: limits: nvidia.com/gpu: 8. The Kube Scheduler places the pod, and the Kubelet, working with the Device Plugin (daemonset), exposes the node's eight Volta GPUs.]
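Written out as a manifest, the Benchmark pod on this slide might look like the sketch below. Only the resources: limits: nvidia.com/gpu: 8 stanza comes from the slide. The pod name, image, and restart policy are assumptions added to make the example complete.

```yaml
# Sketch only: everything except the nvidia.com/gpu limit is hypothetical.
apiVersion: v1
kind: Pod
metadata:
  name: benchmark
spec:
  restartPolicy: Never                # assumption: run the benchmark once
  containers:
  - name: benchmark
    image: registry.example.com/rhel7-cuda9-benchmark:latest   # placeholder image
    resources:
      limits:
        nvidia.com/gpu: 8             # all eight Volta GPUs on the node
```

Saved to a file (for example benchmark-pod.yaml, a hypothetical name), it would be submitted with kubectl create -f benchmark-pod.yaml. The scheduler only considers nodes that advertise at least eight free nvidia.com/gpu units.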
Recent GPU-related work on OpenShift
● Early KubeFlow involvement
● radanalytics templates for ML-workflow on OpenShift
● Machine-Learning OpenShift Commons
● Demo Repositories
  ○ https://github.com/zvonkok/nvidia-k8s
  ○ https://github.com/redhat-performance/openshift-psap
THANK YOU
● plus.google.com/+RedHat
● facebook.com/redhatinc
● linkedin.com/company/red-hat
● twitter.com/RedHatNews
● youtube.com/user/RedHatVideos
Commoditizing GPU-as-a-Service Providers with Red Hat OpenShift
Tuesday, Mar 27, 1:00 PM - 1:25 PM, Room 210E
Red Hat OpenShift Container Platform, with Kubernetes at its core, can play an important role in building flexible hybrid cloud infrastructure. By abstracting infrastructure away from developers, workloads become portable across any cloud. With NVIDIA Volta GPUs now available in every public cloud [1], as well as from every computer maker, an abstraction layer like OpenShift becomes even more valuable. Through demonstrations, this session will introduce you to declarative models for consuming GPUs via OpenShift, as well as the two-level scheduling decisions that provide fast placement and stability.