
BenchCouncil: Present and Future. Prof. Dr. Jianfeng Zhan (presentation transcript)



1. BenchCouncil: Present and Future • Prof. Dr. Jianfeng Zhan, BenchCouncil • http://www.benchcouncil.org • 2019.11.14

  2. BenchCouncil • International non-profit benchmark organization • Executive Committee • Prof. D. K. Panda, the Ohio State University • Prof. Lizy Kurian John, the University of Texas at Austin • Prof. Geoffrey Fox, Indiana University • Prof. Vijay Janapa Reddi, Harvard University • Prof. Jianfeng Zhan, ICT, Chinese Academy of Sciences, University of Chinese Academy of Sciences (Chair)

3. Young but Fast-Growing • Founded in 2018 • 60+ full and associate international members • Several top Internet service providers and high-performance computing centers • http://www.benchcouncil.org/organization.html

4. Influential • Three conferences • International Symposium on Intelligent Computers • 6.27-29, Shenzhen, China • 1000+ attendees • International Symposium on Benchmarking, Measuring, and Optimization • 11.14-16, Denver, US • International Symposium on Chips • 12.18-20, Beijing, China • 40+ high-level policy makers in China will attend this symposium

5. AI Systems and Algorithms Challenges • http://www.benchcouncil.org/competitions.html • 500K RMB in prizes, 2019 • Using AIBench • Four tracks • Systems: Cambricon, RISC-V, X86 • 3D face recognition challenge • Competitors from top universities and companies • Chinese Academy of Sciences • Shanghai Jiaotong University • Google • Ohio State University

6. Awards • http://www.benchcouncil.org/html/awards.html • BenchCouncil Achievement Award • BenchCouncil Fellow • Best Paper Award

7. Testbed • http://www.benchcouncil.org/testbed.html • Hosts the 2019 BenchCouncil International AI System and Algorithm Challenges • Provides container-based benchmark images • Provides pre-trained AI models

  8. Numbers • Report big data and AI performance numbers. • http://www.benchcouncil.org/numbers.html

  9. Organization Evolution • Steering Committee • Executive Committee • Track Steering Committee • Big Data • Datacenter, HPC, IoT, and Edge AI • Track Executive Committee

10. Conference Changes • From the BenchCouncil International Symposium on Intelligent Computers to the BenchCouncil Intelligent Computing Federated Conferences • Intelligent Computers • Smart Health • Smart Finance and Blockchain Systems • Education Technologies

  11. Outline • Summary of BenchCouncil Work • BenchCouncil’s Viewpoints on Benchmarking AI and Other Emerging Workloads

12. A New Golden Age for Computer Architecture: Domain-Specific Co-design • Fundamental changes in technology • Ending of Moore's Law • End of Dennard scaling • ILP limitations and inefficiency • Amdahl's Law • The only path left is Domain-Specific Architectures • (Forrest Gump) Just do a few tasks, but extremely well • John Hennessy and David Patterson, A.M. Turing Award winners

13. Domain-Specific Co-design is Totally Not New! • The first computers were domain-specific, not general-purpose • A few specific tasks • Benchmarks were indeed used • Machine language • Even without an OS

14. HPC: The Domain-Specific Co-design Flagship • Metric: FLOPS • Benchmarks: HPCC (Linpack) • OS: eliminate OS noise • Communication: RDMA • Programming: MPI

15. Co-designing Everything is Brand-New! • A big application can afford the co-design cost • Google, Alibaba, Facebook, WeChat, …

16. The Landscape of Modern Workloads • Big data • Machine learning (AI) • Internet services • Different application scenarios: IoT, edge, datacenter, HPC • Ideal targets for co-design

17. Server-Side Big Data, ML, and Internet Services • HPC takes only 20% of the server market share

18. (Hardware) Bad News • Find a workload (from Google), then just build for it • Architecture conferences are becoming accelerator conferences • Engineers have to put many (1000) accelerators in one node

  19. Bad News! • Abstractions are abandoned • Ad-hoc solutions everywhere!

  20. Big Data Landscape

  21. AI Chips • AI Inference Chips • 100+ • AI Training Chips • 10+

22. Fundamental Challenges • We lack simple but elegant abstractions that help achieve both efficiency and generality! • Single-purpose designs are a structural obstacle to resource sharing

23. Looking Back at History!

24. Database: Relational Algebra • Five primitive and fundamental operators: Select, Project, Product, Union, Difference • Theoretical foundation of databases • Strong expressive power: compose complex queries • From E. F. Codd, "A Relational Model of Data for Large Shared Data Banks," Communications of the ACM, vol. 13, no. 6, 1970
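Codd's five primitives can be sketched in a few lines of Python; the `Relation` class and operator names below are illustrative, not from any database library, and a relation is modeled simply as a schema plus a set of rows.

```python
# A minimal sketch of Codd's five primitive relational operators.
# A relation is a schema (tuple of column names) plus a set of row tuples.
from itertools import product as cross


class Relation:
    def __init__(self, schema, rows):
        self.schema = tuple(schema)
        self.rows = {tuple(r) for r in rows}


def select(rel, predicate):
    """Keep rows for which predicate(row_as_dict) is true."""
    keep = {r for r in rel.rows if predicate(dict(zip(rel.schema, r)))}
    return Relation(rel.schema, keep)


def project(rel, columns):
    """Keep only the named columns."""
    idx = [rel.schema.index(c) for c in columns]
    return Relation(columns, {tuple(r[i] for i in idx) for r in rel.rows})


def product(a, b):
    """Cartesian product of two relations."""
    return Relation(a.schema + b.schema, {ra + rb for ra, rb in cross(a.rows, b.rows)})


def union(a, b):
    return Relation(a.schema, a.rows | b.rows)


def difference(a, b):
    return Relation(a.schema, a.rows - b.rows)


# Composing primitives expresses a complex query,
# e.g. "names of employees in department 10":
emp = Relation(("name", "dept"), [("ann", 10), ("bob", 20)])
names = project(select(emp, lambda r: r["dept"] == 10), ("name",))
print(names.rows)  # {('ann',)}
```

The point of the slide is exactly this composability: a handful of primitives, closed under composition, covers a very large query space.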

25. Numerical Methods: Seven Motifs • Phillip Colella proposed seven motifs that would be important for the next decade • Simulation in the physical sciences is carried out using various combinations of these core algorithms • The seven motifs: Structured Grids, Unstructured Grids, FFT, Dense Linear Algebra, Sparse Linear Algebra, Particles, Monte Carlo • From P. Colella, "Defining Software Requirements for Scientific Computing," 2004

26. Parallel Computing: 13 Dwarfs • The Berkeley research group's landscape of parallel computing research • Defines building blocks for creating libraries and frameworks • Each dwarf is a pattern of computation and communication • The 13 dwarfs: Dense Linear Algebra, Sparse Linear Algebra, Spectral Methods, N-Body Methods, Structured Grids, Unstructured Grids, Monte Carlo, Combinational Logic, Graph Traversal, Dynamic Programming, Backtrack and Branch-and-Bound, Graphical Models, Finite State Machines • From K. Asanovic, R. Bodik, B. C. Catanzaro, J. J. Gebis, P. Husbands, K. Keutzer, D. A. Patterson, et al., "The Landscape of Parallel Computing Research: A View from Berkeley," Technical Report UCB/EECS-2006-183, EECS Department, University of California, Berkeley, 2006

27. Other Challenges • Workloads are totally isolated • Workload churn • SaaS • Microservice-based architectures • ML models are updated frequently • Open-source components are not the best!

28. Understand the Essentials of Workloads • The common requirements are specified only algorithmically, in a paper-and-pencil approach (e.g., the NAS Parallel Benchmarks) • Reasonably divorced from individual implementations

29. Complexity of Modern Workloads • The common requirements are handled differently, or even collaboratively, by datacenters, edge, and devices

30. Essentials of Big Data, AI, and Internet Service Workloads • Treat big data, AI, and Internet service workloads as a pipeline of units of computation handling (input or intermediate) data • Target: find the main abstractions of the time-consuming units of computation (data motifs) • A combination of data motifs = a complex workload • Similar to relational algebra • Wanling Gao, Jianfeng Zhan, Lei Wang, et al. "Data Motif: A Lens towards Fully Understanding Big Data and AI Workloads." PACT 2018.
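The "pipeline of units of computation" view can be sketched concretely: each stage is tagged with the motif it instantiates, and the workload is just the composition of stages over input or intermediate data. The motif tags and stage functions below are made-up illustrations, not AIBench code.

```python
# A minimal sketch: a workload as a pipeline of (motif, unit-of-computation)
# pairs applied to data. Stages and tags are illustrative only.
import statistics

pipeline = [
    ("Sampling",   lambda xs: xs[::2]),                              # down-sample input
    ("Statistics", lambda xs: [x - statistics.mean(xs) for x in xs]),  # center values
    ("Sort",       sorted),                                          # order the result
]


def run(workload, data):
    """Feed each stage's output into the next, as in a data pipeline."""
    for motif, stage in workload:
        data = stage(data)
    return data


out = run(pipeline, [5, 1, 4, 2, 3, 9])
print(out)
```

A real component benchmark would combine many such units with different weights, but the shape is the same: motifs composed over data.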

  31. Basic Methodology

32. Algorithms with a Broad Spectrum • Internet services • Data mining / machine learning • Natural language processing / computer vision (recognition sciences) • Bioinformatics (medical sciences)

33. Our Observations: Eight Data Motifs • Matrix • Sampling • Transform • Graph • Logic • Set • Statistics • Sort • Gao, Wanling, et al. "Data Motifs: A Lens towards Fully Understanding Big Data and AI Workloads." Proceedings of the 27th International Conference on Parallel Architectures and Compilation Techniques (PACT). ACM, 2018.

34. Expression Power of the Eight Data Motifs • Use combinations of data motifs with different weights to represent a wide variety of big data and AI workloads • Coverage of the fundamental units of computation • Provides a methodology for choosing typical workloads • Reduces workload redundancy
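One way to make "choosing typical workloads / reducing redundancy" concrete is to represent each workload as an 8-dimensional vector of motif weights and drop workloads whose vectors are nearly parallel. The workload names, weights, and threshold below are invented for illustration; this is a sketch of the idea, not the paper's actual selection procedure.

```python
# A sketch of motif-weight-based redundancy reduction: workloads whose
# 8-dim motif-weight vectors have cosine similarity above a threshold
# are treated as redundant. All names and numbers are illustrative.
import math

MOTIFS = ["Matrix", "Sampling", "Transform", "Graph",
          "Logic", "Set", "Statistics", "Sort"]

workloads = {
    "w_pagerank": [0.1, 0.0, 0.0, 0.8, 0.0, 0.0, 0.1, 0.0],
    "w_cnn":      [0.7, 0.1, 0.1, 0.0, 0.0, 0.0, 0.1, 0.0],
    "w_cnn_like": [0.65, 0.1, 0.15, 0.0, 0.0, 0.0, 0.1, 0.0],
    "w_terasort": [0.0, 0.1, 0.0, 0.0, 0.0, 0.1, 0.0, 0.8],
}


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))


def pick_representatives(table, threshold=0.95):
    """Greedily keep a workload only if it differs enough from those already kept."""
    kept = {}
    for name, vec in table.items():
        if all(cosine(vec, v) < threshold for v in kept.values()):
            kept[name] = vec
    return sorted(kept)


# w_cnn_like is nearly parallel to w_cnn, so it is dropped as redundant:
print(pick_representatives(workloads))
```

The same vector view also supports the coverage claim: a benchmark suite is representative when its workloads' motif vectors span the motif space.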

35. Data Motifs' Differences from Kernels • Behaviors are affected by the sizes, patterns, types, and sources of different data inputs • Reflect not only computation patterns and memory access patterns, but also disk and network I/O patterns

36. Domain-Specific Hardware and Software Co-design • Ad-hoc solutions: case by case • Structural solution: tailoring the system and architecture to the characteristics of data motifs • New architecture/accelerator designs • Data motif-based libraries • Bottleneck identification and optimization

37. Scalable Benchmark Methodology • Traditional: create a benchmark or proxy for every possible workload • Ours: data motif-based (scalable) • Micro benchmark: a single data motif • Component benchmark: a combination of data motifs with different weights • Application benchmark: an end-to-end application

38. Data Motif-Based Proxy Benchmarks • A DAG-like combination of data motifs • An auto-tuning tool using a machine learning model • Mimics system and micro-architectural behaviors
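The DAG-like combination can be sketched with the standard library's topological sorter: nodes are motif instances, edges feed one node's output into its successors, and the graph is executed in dependency order. The node names and functions are illustrative placeholders, not the actual proxy-benchmark generator.

```python
# A minimal sketch of a DAG-like proxy benchmark: motif nodes executed
# in topological order, each consuming its predecessors' outputs.
from graphlib import TopologicalSorter

# Node name -> unit of computation (motif instance); all illustrative.
nodes = {
    "load":      lambda: list(range(8)),
    "transform": lambda xs: [x * x for x in xs],       # Transform motif
    "sample":    lambda xs: xs[::2],                   # Sampling motif
    "merge":     lambda a, b: sorted(set(a) | set(b)),  # Set + Sort motifs
}
# Node -> its predecessors, the graph form graphlib expects.
deps = {"transform": ["load"], "sample": ["load"], "merge": ["transform", "sample"]}


def run_dag(nodes, deps):
    """Execute every node after its inputs, caching intermediate results."""
    results = {}
    for name in TopologicalSorter(deps).static_order():
        inputs = [results[p] for p in deps.get(name, [])]
        results[name] = nodes[name](*inputs)
    return results

out = run_dag(nodes, deps)
print(out["merge"])
```

An auto-tuner in this picture would adjust the node functions and edge structure until the proxy's measured behavior matches the target workload's.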

  39. BenchCouncil Benchmarks • http://www.benchcouncil.org/benchmarks.html • BigDataBench • AIBench • HPC AI500 • Edge AIBench • AIoT Bench • BENCHCPU
