AI and Predictive Analytics in Data-Center Environments
Distributed Computing using Spark
Distributing Neural Networks using Spark and Intel BigDL
Josep Ll. Berral @BSC
Intel Academic Education Mindshare Initiative for AI
Introduction
“There are solutions for distributing Deep Learning, and some are optimized to leverage specific computing architectures.”
Deep Learning
• Multiple layers of neural networks
• Model abstract patterns in the data
  • … at different levels
  • … with several “stages”
[Figure: stages h1–h5 mapping an Input image to the Output label “cat”; early layers capture simpler patterns on the image, later layers capture more complex ones]
Deep Learning
• “How to distribute those stages?”
  • Do we distribute the layers?
  • How do the training adjustments get communicated between them?
[Figure: model parallelism: layers h1–h5 split across nodes, with Data entering at the Input and Results produced at the Output]
Deep Learning
• “How to distribute those stages?”
  • Do we distribute the data?
  • How do we distribute the models?
  • How do we aggregate the models?
  (a minimal sketch of this data-parallel idea follows below)
[Figure: data parallelism: three replicas of the network h1–h5, each processing its own data partition D1, D2, D3 from Input to Output]
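The data-parallel option above (replicate the model, split the data across workers, then aggregate the replicas) can be illustrated with a few lines of plain NumPy. This is only a sketch of the idea, not how BigDL implements it; every name in it (loss_grad, data_parallel_round, the synthetic data) is hypothetical.

    # Illustrative sketch only (not BigDL internals): data-parallel training
    # by averaging the parameters of worker replicas after each round.
    import numpy as np

    def loss_grad(w, X, y):
        # gradient of a simple least-squares loss, a stand-in for backpropagation
        return X.T @ (X @ w - y) / len(y)

    def data_parallel_round(w, partitions, lr=0.1):
        # 1) broadcast the current model w to every "worker" (one per partition)
        # 2) each worker computes an update on its own data shard
        local_models = [w - lr * loss_grad(w, X, y) for X, y in partitions]
        # 3) aggregate: average the replicas back into a single model
        return np.mean(local_models, axis=0)

    rng = np.random.default_rng(0)
    w_true = np.array([2.0, -1.0])
    partitions = []
    for _ in range(3):                      # three data partitions: D1, D2, D3
        X = rng.normal(size=(100, 2))
        y = X @ w_true + 0.01 * rng.normal(size=100)
        partitions.append((X, y))

    w = np.zeros(2)
    for _ in range(50):                     # training rounds
        w = data_parallel_round(w, partitions)
    print(w)                                # should approach w_true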
Deep Learning
• “How to distribute those stages?”
  • Do we distribute the work onto specialized hardware?
    • Tensor units
    • FPGAs
    • GPUs
    • Accelerators
[Figure: the network h1–h5, Input to Output, with the Data processed on specialized hardware]
Architecture Awareness
• Platforms aware of the architecture
  • A platform requires high performance
  • Machinery/hardware/component companies:
    • … create versions/libraries for such platforms
    • … optimize them for their own hardware
    • … offer them as an optimized alternative
• This is usual at Intel
  • E.g. compilers: GCC (GNU, multi-architecture) → ICC (optimized for Intel processors)
Intel BigDL
• Tensor processing platforms
  • E.g. Torch
    • … a popular platform for tensor processing
    • … has a common syntax for programming (+ Python)
    • … oriented to Neural Networks and Deep Learning
    • … can be mounted on Spark, offering DL functionalities
• Intel BigDL
  • Uses a syntax identical to Torch's
  • … is optimized for Intel technologies
    • Intel MKL (Math Kernel Library)
    • Multi-threading enabled
  • … is provided as Spark libraries
  • … aims to integrate better with Spark's distribution model
  (a setup sketch follows this slide)
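As a rough sketch of what “provided as Spark libraries” means in practice: a BigDL program typically creates its Spark context with BigDL's configuration helper and initializes the BigDL engine before building any model. This is a minimal sketch based on the BigDL 0.x Python API; the application name is a placeholder, and the job is normally launched through spark-submit together with the BigDL jar and Python package.

    # Minimal sketch: setting up BigDL on top of Spark (BigDL 0.x Python API)
    from pyspark import SparkContext
    from bigdl.util.common import create_spark_conf, init_engine

    # create_spark_conf() adds the BigDL-specific Spark settings (e.g. MKL threading)
    sc = SparkContext(appName = "bigdl-example", conf = create_spark_conf())

    # init_engine() starts BigDL's execution engine on the driver and executors
    init_engine()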
Intel BigDL
• Shuffling minimization
  • Beware of communication between workers → minimize it!
[Figure: left, a layout where the Master manages the shuffling of data blocks between workers; right, BigDL's layout where workers share data directly]
Programming a NN
• Spark + BigDL
1. Load the BigDL libraries and components
   Where we previously loaded spark.ml elements, we now load the bigdl elements:
     from bigdl.nn.layer import *
2. Define the layers
   E.g. a minimal network (logistic-regression style): one Linear layer mapping 5 input features to 2 output classes, followed by a LogSoftMax:
     lr_seq = Sequential()
     lr_seq.add(Linear(5, 2))
     lr_seq.add(LogSoftMax())
3. Define the Optimizer
   (a sketch of how train_rdd can be built follows this slide)
     from bigdl.nn.criterion import *
     from bigdl.optim.optimizer import *

     optimizer = Optimizer(
         model = lr_seq,
         training_rdd = train_rdd,
         criterion = ClassNLLCriterion(),
         end_trigger = MaxEpoch(20),
         optim_method = SGD(learningrate = 0.05),
         batch_size = 16)
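The Optimizer above assumes that train_rdd already exists as an RDD of BigDL Sample objects. Below is a hedged sketch of how such an RDD could be built from a Spark DataFrame of numeric features and integer class labels; the DataFrame df and its column names are hypothetical, and note that ClassNLLCriterion expects class labels starting at 1.

     # Hedged sketch: building an RDD of BigDL Samples for the Optimizer above.
     # The DataFrame df and its column names are hypothetical.
     import numpy as np
     from bigdl.util.common import Sample

     feature_cols = ["f1", "f2", "f3", "f4", "f5"]    # the 5 input features
     label_col = "label"                              # classes encoded as 0 / 1

     def row_to_sample(row):
         features = np.array([row[c] for c in feature_cols], dtype = "float64")
         label = np.array([float(row[label_col]) + 1.0])   # BigDL labels start at 1
         return Sample.from_ndarray(features, label)

     train_rdd = df.rdd.map(row_to_sample)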
Programming a NN
• Spark + BigDL
4. Set up validation
     optimizer.set_validation(
         batch_size = 16,
         val_rdd = validation_rdd,
         trigger = EveryEpoch(),
         val_method = [Loss()])
5. Then fit the model (optimize() also returns the trained model)
     trained_model = optimizer.optimize()
6. Finally, evaluate on the test set
     test_results = lr_seq.evaluate(test_rdd, 16, [Loss()])
   (a prediction sketch follows this slide)
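Beyond the Loss metric, the fitted network can also be queried directly. The following is a hedged sketch, assuming the trained_model returned by optimize() above and a test_rdd of Sample objects; Top1Accuracy is BigDL's standard classification accuracy metric.

     # Hedged sketch: accuracy and distributed prediction with the trained model
     from bigdl.optim.optimizer import Top1Accuracy

     # classification accuracy in addition to the negative log-likelihood loss
     accuracy = trained_model.evaluate(test_rdd, 16, [Top1Accuracy()])

     # distributed prediction: one vector of log-probabilities per test sample
     predictions = trained_model.predict(test_rdd)
     print(predictions.take(3))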
Programming a NN
• Architecture example: a feed-forward network with three hidden layers
  (a training sketch for this network follows this slide)
     from bigdl.nn.layer import *

     num_hidden = [10, 50, 100]
     num_classes = 3

     # num_features: the number of input columns, assumed defined earlier
     ff_seq = Sequential()
     ff_seq.add(Linear(num_features, num_hidden[0]))
     ff_seq.add(ReLU())
     ff_seq.add(Linear(num_hidden[0], num_hidden[1]))
     ff_seq.add(ReLU())
     ff_seq.add(Linear(num_hidden[1], num_hidden[2]))
     ff_seq.add(ReLU())
     ff_seq.add(Linear(num_hidden[2], num_classes))
     ff_seq.add(LogSoftMax())
• Corresponding to: Linear + ReLU blocks, followed by a logit + SoftMax output
[Figure: Input (num_features) → h1 (n = 10) → h2 (n = 50) → h3 (n = 100) → output classes Class 1 / Class 2 / Class 3]
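This deeper architecture plugs into exactly the same training pattern shown on the previous slides. Below is a hedged sketch, assuming train_rdd, validation_rdd and num_features as above; the hyperparameter values are arbitrary placeholders.

     # Hedged sketch: training ff_seq with the same Optimizer pattern as before.
     from bigdl.nn.criterion import *
     from bigdl.optim.optimizer import *

     ff_optimizer = Optimizer(
         model = ff_seq,
         training_rdd = train_rdd,
         criterion = ClassNLLCriterion(),      # matches the LogSoftMax output
         end_trigger = MaxEpoch(30),
         optim_method = SGD(learningrate = 0.01),
         batch_size = 32)

     ff_optimizer.set_validation(
         batch_size = 32,
         val_rdd = validation_rdd,
         trigger = EveryEpoch(),
         val_method = [Loss()])

     trained_ff = ff_optimizer.optimize()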
Hands-On
• Next, let's move to the hands-on session
  • Play with BigDL
  • See some NN examples
Summary
• Distributing Neural Networks
  • Tensor processing frameworks
  • The Intel BigDL framework
  • Architecture awareness and optimization
• Programming a Deep Neural Network
  • Training
  • Evaluation
  • Induction