
The Challenge: IT Departments Required to Host a Variety of Workloads



  1. The Challenge ▪ IT departments are required to host a variety of different workloads: HPC; Data Science and Machine Learning; general applications supporting business processes ▪ Multiple ways to run workloads: containerized and non-containerized; virtualized and non-virtualized; on-premises and in the cloud ▪ Hard to find and retain skilled staff ▪ Hard to build, manage, and monitor the necessary computing infrastructure ▪ Hard to maximize resource utilization and scale up/down/out appropriately ▪ Hard to remain flexible/agile and up to date

  2. Transformation Trends ▪ Data-intensive and compute-intensive workloads: competing effectively and solving complex business problems is driving new types of workloads ▪ Cloud adoption: private and public cloud are both attractive options for IT organizations ▪ Clustered IT infrastructure provides the foundation: Linux-based clusters are the preferred infrastructure for running advanced workloads and private clouds

  3. Services are Converging ▪ HPC, Big Data, and Machine Learning are becoming mission-critical ▪ Cloud adoption is table stakes ▪ The services those resources support are also converging ▪ Smart operators use convergence to maximize innovation, insight, and agility ▪ Advanced clustered IT infrastructure enables the convergence

  4. Are you ready?

  5. Whatever the approach, the enterprise datacenter needs a trusted platform for deploying, managing, and monitoring its advanced IT infrastructure.

  6. What would it mean to your organization if you could… • Deploy a cluster in 5 minutes? • Extend your infrastructure into the cloud with a few clicks? • Automate the deployment and management of new infrastructure? • Free up specialized staff for higher-value activities? • Retain knowledge of infrastructure management and best practices? • Spin up and tear down clustered environments in minutes?

  7. Recommended Approach: (1) Host all workloads on clusters rather than individual servers (2) Run multiple workloads and execution paradigms on the same cluster (3) Fast, automated re-purposing of compute resources (4) Use manual or policy-driven control (5) Automatically extend on-premises infrastructure to public cloud

  8. Introducing: Bright Software. Empowering the adoption of advanced clustered infrastructure for HPC, Data Science, and Private Clouds

  9. Bright software automates deploying, managing, and monitoring clustered server infrastructure in the data center or in the cloud

  10. Ideal for managing converged IT with multiple cluster types deployed across both physical and virtual infrastructure, on premises or in the cloud.

  11. Here’s what you can do with Bright: (1) Deliver computing capacity fast (2) Provision 10 to 10,000+ nodes from bare metal in minutes (3) Repurpose servers to accommodate fluctuating workloads on the fly (4) Extend your on-premises environment to AWS and Azure dynamically (5) Automate provisioning, deployment, and management

  12. Bright for Data Science makes it easy to use a Bright cluster for AI

  13. Bright for Data Science [Diagram: Bright for Data Science and HPC on top of Bright Cluster Manager, above a row of ten Linux nodes with GPUs]

  14. Without Bright • Not installable from OS repositories • Time-consuming, manual installation of deep learning libraries and frameworks • 60+ dependencies must be satisfied • Versions must all work together

  15. With Bright “This [solution] will be a powerful productivity multiplier for customers because these software modules take days to download and install if using the open source repositories.” – a Bright user

  16. With Bright: two simple commands
      # yum install tensorflow cm-jupyterhub
      # yum --installroot=/cm/images/ai-image install cm-ml-distdeps
      • The 1st command installs frameworks into a shared directory on the head node, so they are immediately available on every node in the cluster. • Yum installs all dependencies for tensorflow and cm-jupyterhub, and all the Python dependencies • The 2nd command installs all library dependencies into ai-image

  17. Cloud Bursting ▪ On-premise cluster extended with resources from public cloud ▪ Uniformity: cloud nodes look & feel the same as on-premise nodes ▪ Single workload management system ▪ Same user authentication ▪ Same software images used for provisioning ▪ Same shared software environment (e.g. NFS applications tree, environment modules) ▪ Possible to do gradual transition to cloud ▪ Multi-cloud possible (e.g. some jobs to AWS, some to Azure) ▪ Applications will run in cloud as if they were running in an on-premise cluster

  18. Achieving Uniformity. PROVISIONING ▪ Node-installer loaded from cloud machine image (instead of loading through PXE) ▪ Cloud director serves as provisioning node for all nodes in particular cloud region ▪ Cloud director receives copy of all software images (kept up-to-date automatically) ▪ Same kernel version. AUTHENTICATION ▪ Head node runs LDAP server ▪ Cloud director runs LDAP replica server ▪ AD/external LDAP also possible. WORKLOAD MANAGEMENT ▪ Typical set-up: one job queue per cloud region ▪ User decides whether to run job on-premise or in cloud by submitting to queue ▪ Single queue containing all nodes also possible

  19. [Diagram: cloud regions X and Y]

  20. Scaling node count up/down ▪ Add/remove cloud nodes: manually by administrator, or automatically based on workload in queue using the cm-scale tool ▪ cm-scale node operations: power on/off; create new node (in cloud) / terminate; move to new node category (i.e. re-purpose node); subscribe to new configuration overlay (i.e. re-purpose node) ▪ Custom policies via Python module
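
As a rough illustration of what "automatically based on workload in queue" could mean, here is a minimal, self-contained sketch of a queue-driven scaling rule. It is not the cm-scale interface or its policy module API; `ScaleDecision`, `decide`, and every parameter name are hypothetical, and the rule (one node per `jobs_per_node` queued jobs, capped at `max_nodes`) is an assumption made only for the example.

```python
import math
from dataclasses import dataclass


@dataclass
class ScaleDecision:
    """Hypothetical result type: how many cloud nodes to create or terminate."""
    create: int = 0
    terminate: int = 0


def decide(pending_jobs: int, idle_nodes: int, active_nodes: int,
           max_nodes: int, jobs_per_node: int = 1) -> ScaleDecision:
    """Assumed policy: size the idle pool to match the queued work.

    active_nodes counts all cloud nodes that currently exist (busy or idle).
    """
    # Nodes needed to absorb the jobs waiting in the queue.
    needed = math.ceil(pending_jobs / jobs_per_node)
    if needed > idle_nodes:
        # Never grow beyond the configured ceiling.
        headroom = max(max_nodes - active_nodes, 0)
        return ScaleDecision(create=min(needed - idle_nodes, headroom))
    # More idle nodes than the queue needs: release the surplus.
    return ScaleDecision(terminate=idle_nodes - needed)
```

For example, `decide(pending_jobs=10, idle_nodes=2, active_nodes=4, max_nodes=8, jobs_per_node=2)` asks for three new nodes, while an empty queue with three idle nodes asks for three terminations. A real policy would likely also weigh node start-up time and cost, which this sketch ignores.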

  21. Moving data in/out of cloud ▪ Jobs depend on input data and produce output data ▪ cm-sub allows user to specify data dependencies for jobs ▪ Job input data will be moved into cloud before job resources are allocated ▪ Data staged on temporary storage node (dynamically spun up) ▪ Job output data will be moved back to on-premises cluster ▪ Data movement is transparent to user
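
The ordering described on this slide (input data staged before resources are allocated, output data moved back afterwards) can be sketched as a simple plan builder. This is only an illustration under stated assumptions: `staging_plan` and its step strings are hypothetical, it does not reflect how cm-sub is invoked or implemented, and releasing the temporary storage node at the end is an assumption the slide does not state.

```python
from typing import List


def staging_plan(job: str, inputs: List[str], outputs: List[str]) -> List[str]:
    """Return the ordered steps for one cloud job (hypothetical helper)."""
    steps: List[str] = []
    needs_storage = bool(inputs or outputs)
    if needs_storage:
        # The slide says staging uses a dynamically spun-up storage node.
        steps.append("start temporary storage node")
    # Input data moves into the cloud before job resources are allocated.
    steps += [f"stage in {path}" for path in inputs]
    steps.append(f"allocate cloud resources for {job}")
    steps.append(f"run {job}")
    # Output data moves back to the on-premises cluster after the job.
    steps += [f"stage out {path}" for path in outputs]
    if needs_storage:
        # Assumption: the temporary node is released once data is back.
        steps.append("release temporary storage node")
    return steps
```

Calling `staging_plan("job1", ["in.dat"], ["out.dat"])` yields a list in which the stage-in step precedes allocation and the stage-out step follows the run, which is the property the slide relies on to keep paid compute nodes from waiting on data transfer.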

  22. To manage advanced IT infrastructure for … Defense, Big Data Analytics, Big Data, OpenStack, Spark, Life Science, Virtual Machines, Deep Learning, Energy, HPC, Cassandra, Manufacturing, Academic Research, Data Science, Pharma, NoSQL, Government, Machine Learning, Education … choose

  23. What is Bright Edge? A new feature in Bright 8.2 that allows nodes of a single, centrally managed cluster to span geographic locations

  24. What is Bright Edge? Simplified deployment and management of edge compute Reduced admin time for distributed clusters Promotes standardization

  25. Customer Spotlight: Van Andel Institute. Van Andel Institute (VAI) hosts thirty individual research groups who use genomic sequencing analysis, molecular dynamics simulation, and modeling to investigate epigenetics, cancer, and neurodegenerative diseases. Bright OpenStack lets VAI manage high-performance computing (HPC) and cloud computing in the same infrastructure, greatly reducing the labor and effort needed for management and change control. “We know that cloud computing is the wave of the future. The hybrid approach we are getting with Bright is providing a path that helps us transition.” – Zack Ramjan, Research Computing Architect, VAI
