Performance Modeling of High Performance Computing [HPC]

S. Amirhossein Abtahizadeh, March 18th, 2016


  1. Performance Modeling of High Performance Computing [HPC]. S. Amirhossein Abtahizadeh, March 18th, 2016

  2. Section 1: Failed Attempt

  3. Section 1: Ontological Performance Modeling

  4. Section 1: Ontology. Internet of Things (IoT), Semantic Web, Web of Data

  5. Section 1: Web of Data. Web 1.0: HTTP; Web 2.0: Social Networks; Web 3.0: Web of Data

  6. Section 1: Web of Data. Collections of data, distributed among machines. How to present data?

  7. Section 1:

          <?xml version="1.0"?>
          <!DOCTYPE Ontology [
            <!ENTITY xsd "http://www.w3.org/XMLSchema#" >
          ]>
          <owlx:Ontology owlx:name="http://www.example.org/wine"
                         xmlns:owlx="http://www.w3.org/2003/05/owl-xml">
          </owlx:Ontology>

  8. Section 1: Web of Data. Resource Description Framework (RDF). Query another dataset (OWL) to understand what “Wine” is.

  9. Section 1: Semantic Web. Interconnected ontologies. Languages: RDF, XML, Turtle. Query: SPARQL

  10. Section 1: Semantic Web. Google: “polymtl gigl professors”

  11. Section 1: Semantic Web. What about: “How many professors at polymtl are working on the research topic cloud computing now?”

  12. Section 1: SPARQL. For machines, not humans!

  13. Section 1:

          PREFIX ex: <http://example.com/exampleOntology#>
          SELECT ?capital ?country
          WHERE {
            ?x ex:cityname ?capital ;
               ex:isCapitalOf ?y .
            ?y ex:countryname ?country ;
               ex:isInContinent ex:Africa .
          }

  14. Section 1: Internet of Things (IoT). Semantic Web infrastructure. Interconnected machines. Raspberry Pi* (* https://www.raspberrypi.org)

  15. Section 1: Research Objective: Present “performance” with an ontology

  16. Section 1: Research Objective: Share a common understanding of application performance

  17. Section 1: Research Objective: Tackle the ambiguity of “What is performance?” with a clear description in terms of OWL/XML fields

  18. Section 1: Define quality attributes (e.g., what is the availability of the system?): QA 1, QA 2, QA 3, QA 4, QA 5

  19. Section 1: Research Methodology: Develop a cloud-based app. Define scenarios. Measure performance with OWL, using logical axioms

  20. Section 1: Axioms: Inference. Response time < 2 ms → P = 90%

  21. Section 1: Building the Ontology: 100+ axioms, cascaded classification, 10+ inferred rules

  22. Section 1: [slide without recoverable text]

  23. Section 1: [slide without recoverable text]

  24. Section 1: FAILED! OWL/XML is not efficiently designed. Performance is subjective

  25. Section 2: Performance and Energy Modeling of High Performance Computing (HPC)

  26. Section 2: What is HPC? Parallel processing, advanced applications, massive computations, scientific programs

  27. Section 2: What is HPC? Super-fast transactions, distributed algorithms, Message Passing Interface (MPI)

  28. Section 2: In the domain of HPC we deal with CPU-intensive apps. Data might be just an array! Computations might be exponential.

  29. Section 2: Aggregate computing power. Clustering at very large scale. “Computing at the speed of innovation!”* (* IBM, www.ibm.com)

  30. Section 2: Message Passing Interface (MPI)

  31. Section 2: MPI is a library for writing parallel programs. It provides collective functions
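
As a hedged illustration of a collective function, the sketch below splits a sum of squares across ranks and combines the partial results with an all-reduce. It assumes the mpi4py package (the binding listed as MPI4PY in the test architecture of this deck). The workload and the helper names `partial_sum` and `parallel_sum` are hypothetical and do not come from the slides. If mpi4py is not installed, the sketch falls back to a single process.

```python
def partial_sum(rank, size, n):
    """Sum of squares over the indices of range(n) assigned to `rank`.

    Indices are dealt out round-robin: rank r gets r, r + size, r + 2*size, ...
    """
    return sum(i * i for i in range(rank, n, size))


def parallel_sum(n=1000):
    """Combine per-rank partial sums with an all-reduce collective."""
    try:
        # Assumption: mpi4py and an MPI runtime are installed.
        from mpi4py import MPI
    except ImportError:
        # No MPI available: behave like a single-rank job.
        return partial_sum(0, 1, n)
    comm = MPI.COMM_WORLD
    local = partial_sum(comm.Get_rank(), comm.Get_size(), n)
    # allreduce is collective: every rank contributes its partial sum
    # and every rank receives the same total.
    return comm.allreduce(local, op=MPI.SUM)
```

Saved to a file and started under an MPI launcher (for example `mpiexec -n 4 python` followed by the file name), each rank would compute its own slice and `parallel_sum()` would return the same total on all of them.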

  32. Section 2: MPI is available in many languages: C, C++, Java, Python, R

  33. Section 2:
      • When we have networking libraries, why bother using MPI?
      • Optimized for performance
      • Uses the fastest network transport found
      • Within a computer: MPI will use shared memory (not the network!)
      • Fast cluster interconnects: MPI will use InfiniBand, …
      • Enforces guarantees (reliable, in-order messages)
      • Think about the problem, forget about the network

  34. Section 2: Research Objective: Given a set of input variables (network bandwidth, CPU power, throughput, disk speed, memory, …), which configuration gives the best performance and energy consumption?

  35. Section 2: Def (Performance && Energy): Return Multi-objective Optimization
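
One common way to read "multi-objective optimization" for performance and energy is Pareto dominance: a configuration is kept only if no other configuration is at least as good on both objectives and strictly better on one. The sketch below is a minimal illustration under that assumption. The configuration names and the (runtime, energy) numbers are invented for the example and are not measurements from this work.

```python
def dominates(a, b):
    """True if a is no worse than b on every objective and better on at least one.

    Both objectives (runtime, energy) are minimized.
    """
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(configs):
    """Keep only the configurations that no other configuration dominates."""
    return {
        name: objectives
        for name, objectives in configs.items()
        if not any(
            dominates(other, objectives)
            for other_name, other in configs.items()
            if other_name != name
        )
    }


# Hypothetical (runtime in seconds, energy in joules) per configuration.
configs = {
    "2 nodes": (120.0, 40.0),
    "4 nodes": (70.0, 55.0),
    "8 nodes": (45.0, 90.0),
    "8 nodes, slow disk": (60.0, 95.0),
}
front = pareto_front(configs)
```

Here "8 nodes, slow disk" drops out because "8 nodes" is both faster and cheaper in energy; the other three remain as trade-offs between the two objectives.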

  36. Section 2: Why?
      • Efficient resource provisioning (what to choose?)
      • Predict the changes in your system (what will happen if…?)
      • Performance becomes part of the design
      • Itemized scenarios (what is important?)
      • Avoid surprises with performance when deploying
      • Enterprise reputation (risk management!)

  37. Section 2: How?

  38. Section 2: [diagram] Scientific Dataset, Application, Model, Benchmark; MPI Master; Node 1, Node 2, Node 3, Node 100

  39. Section 2: Architecture (not yet accessible!): 100 nodes, ORACLE Solaris Cluster, ORACLE VirtualBox

  40. Section 2: Test Architecture: Digital Ocean Cloud Platform, 10 nodes, ORACLE Solaris Cluster, MPI4PY

  41. Section 2: Scientific Application: Schaffer problem
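
For reference, the Schaffer problem as commonly stated, and as encoded in the Platypus script in this deck, has one decision variable and two objectives to minimize:

```latex
\begin{aligned}
\min_{x \in [-10,\,10]} \quad & f_1(x) = x^2, \\
                              & f_2(x) = (x - 2)^2 .
\end{aligned}
```

The two objectives conflict only for $0 \le x \le 2$: moving $x$ toward 0 improves $f_1$ and worsens $f_2$, and moving toward 2 does the opposite, so that interval is the Pareto-optimal set.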

  42. Section 2: Run time. Your laptop: a lifetime. HPC XT-Cluster: less than a minute

  43. Section 2:

          from platypus.algorithms import NSGAII
          from platypus.core import Problem, evaluator
          from platypus.types import Real

          class Schaffer(Problem):

              def __init__(self):
                  # One decision variable, two objectives.
                  super(Schaffer, self).__init__(1, 2)
                  self.types[:] = Real(-10, 10)

              @evaluator
              def evaluate(self, solution):
                  x = solution.variables[:]
                  # Minimize x^2 and (x - 2)^2 at the same time.
                  solution.objectives[:] = [x[0]**2, (x[0]-2)**2]

          # Run NSGA-II for 10000 function evaluations.
          algorithm = NSGAII(Schaffer())
          algorithm.run(10000)

  44. Section 2: Methodology: Memetic Algorithm, Recurrent Neural Network, Prediction Model

  45. Section 2: Memetic Algorithm. Genomes: set of observed values of performance and energy

  46. Section 2: Memetic Algorithm. Genomes: NumPy array

  47. Section 2: Memetic Algorithm. Fitness function: Schaffer optimization

  48. Section 2: Memetic Algorithm. Crossover: alternating-position operator
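
The alternating-position crossover is usually described for permutation genomes: the child takes genes alternately from the first and second parent, skipping any gene it already contains. The sketch below shows that textbook form. How the operator was adapted to the NumPy genomes of observed values is not stated on the slides, and the function name and sample parents are illustrative only.

```python
def alternating_position_crossover(parent1, parent2):
    """Build a child by alternating genes from two permutation parents.

    Genes are visited in the order p1[0], p2[0], p1[1], p2[1], ...
    and a gene is appended only the first time it is seen.
    """
    if sorted(parent1) != sorted(parent2):
        raise ValueError("parents must be permutations of the same genes")
    child, seen = [], set()
    for gene1, gene2 in zip(parent1, parent2):
        for gene in (gene1, gene2):
            if gene not in seen:
                seen.add(gene)
                child.append(gene)
    return child


child = alternating_position_crossover(
    [1, 2, 3, 4, 5, 6, 7, 8],
    [3, 7, 5, 1, 6, 8, 2, 4],
)
# child == [1, 3, 2, 7, 5, 4, 6, 8]
```

Because every gene is kept exactly once, the child is always a valid permutation of the parents' genes.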

  49. Section 2: Recurrent Neural Network. Bi-directional data: both past and future

  50. Section 2: Prediction Model. Compare with benchmarks. Linear trend estimation. Least-squares error
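
Linear trend estimation by least squares fits a line y ≈ a + b·t that minimizes the sum of squared residuals. A minimal sketch of that fit follows. The function names and the sample series are made up for illustration and are not data or code from this work.

```python
def fit_linear_trend(t, y):
    """Return (intercept, slope) of the ordinary least-squares line through (t, y)."""
    n = len(t)
    if n != len(y) or n < 2:
        raise ValueError("need two equally long series with at least 2 points")
    t_mean = sum(t) / n
    y_mean = sum(y) / n
    s_tt = sum((ti - t_mean) ** 2 for ti in t)
    if s_tt == 0:
        raise ValueError("all t values are identical; slope is undefined")
    s_ty = sum((ti - t_mean) * (yi - y_mean) for ti, yi in zip(t, y))
    slope = s_ty / s_tt
    intercept = y_mean - slope * t_mean
    return intercept, slope


def squared_error(t, y, intercept, slope):
    """Sum of squared residuals of the fitted line."""
    return sum((yi - (intercept + slope * ti)) ** 2 for ti, yi in zip(t, y))


# Hypothetical run times (seconds) observed at four consecutive time slices.
t = [0, 1, 2, 3]
y = [1.0, 3.0, 5.0, 7.0]
intercept, slope = fit_linear_trend(t, y)
error = squared_error(t, y, intercept, slope)
```

Comparing `error` across candidate models, or against a benchmark series, is one way to carry out the comparison the slide mentions.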

  51. Section 2: Measuring Energy Consumption. Power-API. Physical devices

  52. Section 2: Correlation between performance and energy? Coefficient: +0.217
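
The slide does not say which coefficient was used. Assuming it is the Pearson correlation coefficient, it can be computed as in the sketch below. The two sample series are placeholders, not the measurements behind the +0.217 figure.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long numeric series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long series with at least 2 points")
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    cov = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    var_x = sum((x - x_mean) ** 2 for x in xs)
    var_y = sum((y - y_mean) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


# Placeholder series: performance score and energy use per run.
performance = [1.0, 2.0, 3.0, 4.0]
energy = [2.0, 4.0, 6.0, 8.0]
r = pearson(performance, energy)
```

The coefficient ranges from -1 to +1, so a value near +0.2 indicates only a weak positive linear relationship.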

  53. Section 2: [slide without recoverable text]

  54. Section 2: Trend estimation: 83.59%. Per slice of one separate run. Noise? I’m working on it…

  55. Conclusion: Multi-objective optimization of performance and energy in High Performance Computing

  56. Conclusion: Predict performance and energy. SAVE MONEY! Show that this approach is scalable

  57. Conclusion: Cloud resource selection
