BenchCouncil: Present and Future Prof. Dr. Jianfeng Zhan BenchCouncil http://www.benchcouncil.org 2019.11.14
BenchCouncil • International non-profit benchmark organization • Executive Committee • Prof. D. K. Panda, the Ohio State University • Prof. Lizy Kurian John, the University of Texas at Austin • Prof. Geoffrey Fox, Indiana University • Prof. Vijay Janapa Reddi, Harvard University • Prof. Jianfeng Zhan, ICT, Chinese Academy of Sciences, University of Chinese Academy of Sciences (Chair)
Young but Fast-growing • Founded in 2018 • 60+ full and associate international members • Several top Internet service providers and high performance computing centers • http://www.benchcouncil.org/organization.html
Influential • Three conferences • International Symposium on Intelligent Computers • 6.27-29, Shenzhen, China • 1000+ attendees • International Symposium on Benchmarking, Measuring, and Optimization • 11.14-16, Denver, US • International Symposium on Chips • 12.18-20, Beijing, China • 40+ high-level policy makers in China will attend this symposium
AI Systems and Algorithms Challenges • http://www.benchcouncil.org/competitions.html • 500K RMB, 2019 • Using AIBench • Four tracks • Systems • Cambricon • RISC-V • X86 • 3D face recognition challenge • Competitors from top universities and companies • Chinese Academy of Sciences • Shanghai Jiaotong University • Google • Ohio State University
Award • http://www.benchcouncil.org/html/awards.html • BenchCouncil achievement award • BenchCouncil Fellow • Best paper award
Testbed • http://www.benchcouncil.org/testbed.html • Hosts the 2019 BenchCouncil International AI System and Algorithm Challenges • Provides container-based benchmark images • Provides pre-trained AI models
Numbers • Report big data and AI performance numbers. • http://www.benchcouncil.org/numbers.html
Organization Evolution • Steering Committee • Executive Committee • Track Steering Committee • Big Data • Datacenter, HPC, IoT, and Edge AI • Track Executive Committee
Conference Changes • BenchCouncil International Symposium on Intelligent Computers • BenchCouncil Intelligent Computing Federated Conferences • Intelligent Computers • Smart Health • Smart Finance and Blockchain Systems • Education Technologies
Outline • Summary of BenchCouncil Work • BenchCouncil’s Viewpoints on Benchmarking AI and Other Emerging Workloads
A New Golden Age for Computer Architecture: Domain-specific Co-design • Only path left is Domain Specific Architectures • (Forrest Gump) Just do a few tasks, but extremely well • Fundamental Changes in Technology • Ending of Moore’s Law • End of Dennard Scaling • ILP limitation and inefficiency • Amdahl’s Law • John Hennessy and David Patterson, A.M. Turing Award winners
Domain-specific Co-design is Not New at All! • The first computer was domain-specific • Not general-purpose • Few specific tasks • Indeed used benchmarks • Machine language • Even without an OS
HPC: Domain-specific Co-design Flagship • FLOPS • Benchmarks • HPCC (Linpack) • OS • Eliminate OS noise • Communication • RDMA • Programming: MPI
Co-designing Everything is Brand-new! • A big application can afford the co-design cost • Google, Alibaba, Facebook, WeChat ……
The Landscape of Modern Workloads • Big Data • Machine learning (AI) • Internet services • Different application scenarios • IoT, Edge, Datacenter, HPC • Ideal target for co-design
Server-side: Big Data, ML, Internet Service • HPC takes only a 20% market share
(Hardware) Bad News • Find a workload (from Google), just do it. • Architecture conferences become accelerator ones. • Engineers have to put more (1000) accelerators in one node.
Bad News! • Abstractions are abandoned • Ad-hoc solutions everywhere!
Big Data Landscape
AI Chips • AI Inference Chips • 100+ • AI Training Chips • 10+
Fundamental Challenges • Lack of simple but elegant abstractions that help achieve both efficiency and generality! • Single-purpose design is a structural obstacle to resource sharing
• Looking back at History!
Database - Relational Algebra • Relational Algebra • Five primitive and fundamental operators: Select, Project, Product, Union, Difference • Theoretical foundation of database • Strong expression power • Compose complex queries • From E. F. Codd, “A Relational Model of Data for Large Shared Data Banks,” Communications of the ACM, vol. 13, no. 6, 1970.
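To illustrate how the five primitive operators compose into a larger query, here is a minimal Python sketch. The representation (a relation as a frozenset of rows, each row a tuple of attribute/value pairs) and the sample `emp` and `dept` relations are assumptions made for this example only; they do not come from Codd's paper or from any database system.

```python
# A relation is a frozenset of rows; a row is a tuple of (attribute, value)
# pairs sorted by attribute name. This encoding is an illustrative assumption.

def relation(*rows):
    return frozenset(tuple(sorted(r.items())) for r in rows)

def select(rel, predicate):
    # Keep the rows for which the predicate holds.
    return frozenset(r for r in rel if predicate(dict(r)))

def project(rel, attrs):
    # Keep only the named attributes; duplicate rows collapse in the set.
    return frozenset(tuple((k, v) for k, v in r if k in attrs) for r in rel)

def product(a, b):
    # Cartesian product; attribute names of a and b are assumed disjoint.
    return frozenset(tuple(sorted(ra + rb)) for ra in a for rb in b)

def union(a, b):
    return a | b

def difference(a, b):
    return a - b

# Hypothetical sample data.
emp = relation({"name": "Ada", "dept_id": 1}, {"name": "Bob", "dept_id": 2})
dept = relation({"id": 1, "dept": "AI"}, {"id": 2, "dept": "HPC"})

# "Names of employees in the AI department", built only from the primitives.
ai_names = project(
    select(product(emp, dept),
           lambda r: r["dept_id"] == r["id"] and r["dept"] == "AI"),
    {"name"},
)
```

The join is not a primitive here: it falls out of `product` followed by `select`, which is the sense in which a few operators can express complex queries.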
Numerical Method • Seven motifs would be important for the next decade • Proposed by Phillip Colella • Simulation in the physical sciences is done using various combinations of the following core algorithms • 7 “Motifs”: Structured Grids, Unstructured Grids, FFT, Dense linear algebra, Sparse linear algebra, Particles, Monte Carlo • From P. Colella, “Defining software requirements for scientific computing,” 2004.
Parallel Computing • Landscape of Parallel Computing Research • Berkeley research group • Define building blocks for creating libraries & frameworks • A pattern of computation and communication • 13 dwarfs: Dense linear algebra, Sparse linear algebra, Spectral method, N-Body method, Structured Grids, Unstructured Grids, Monte Carlo, Combinational logic, Graph traversal, Dynamic programming, Backtrack and branch-and-bound, Graphical models, Finite state machine • From K. Asanovic, R. Bodik, B. C. Catanzaro, J. J. Gebis, P. Husbands, K. Keutzer, D. A. Patterson, et al., “The landscape of parallel computing research: A view from Berkeley,” Technical Report UCB/EECS-2006-183, EECS Department, University of California, Berkeley, 2006.
Other Challenges • Totally isolated • Workload churns • SaaS • Microservice-based architecture • ML models updated frequently • Open-source components are not the best!
Understand Essentials of Workloads • The common requirements are specified only algorithmically in a paper-and-pencil approach (NAS Parallel Benchmarks) • Reasonably divorced from individual implementations
Complexity of Modern Workloads • The common requirements are handled differently or even collaboratively by datacenter, edge, and devices.
Essentials of Big Data, AI and Internet Services Workloads • Treat big data, AI and Internet service workloads as a pipeline of units of computation handling (input or intermediate) data • Target: find the main abstractions of time-consuming units of computation (data motifs) • The combination of data motifs = complex workloads • Similar to Relational Algebra • Wanling Gao, Jianfeng Zhan, Lei Wang, et al. Data Motif: A Lens towards Fully Understanding Big Data and AI Workloads. PACT 2018.
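The pipeline view can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the stage functions, the toy word-count workload, and the `run_pipeline` helper are hypothetical and are not taken from the PACT 2018 paper or from any BenchCouncil tool. Each stage is labeled with the data motif it stands for, and the time spent per motif is recorded so the time-consuming units can be identified.

```python
import time
from collections import defaultdict

def run_pipeline(stages, data):
    """Run (motif_name, function) stages in order on the data.

    Returns the final result and the seconds spent per motif.
    """
    seconds = defaultdict(float)
    for motif, fn in stages:
        start = time.perf_counter()
        data = fn(data)  # the output of one unit feeds the next one
        seconds[motif] += time.perf_counter() - start
    return data, dict(seconds)

# Hypothetical word-count-like workload expressed as three motif stages.
stages = [
    ("transform", lambda text: text.lower().split()),
    ("sort", sorted),
    ("statistics", lambda words: {w: words.count(w) for w in set(words)}),
]

result, seconds = run_pipeline(stages, "b a B a")
```

On a real workload, the `seconds` dictionary is where one would look for the dominant motifs; on this toy input the timings are too small to be meaningful.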
Basic Methodology
Algorithms with a Broad Spectrum • Internet services • Data mining/Machine learning • Natural language processing/Computer vision (Recognition Sciences) • Bioinformatics (Medical Sciences)
Our Observations: Eight Data Motifs • Matrix • Sampling • Transform • Graph • Logic • Set • Statistics • Sort • Gao, Wanling, et al. "Data motifs: a lens towards fully understanding big data and AI workloads." Proceedings of the 27th International Conference on Parallel Architectures and Compilation Techniques. ACM, 2018.
Expression Power of Eight Data Motifs • Using the combination of data motifs to represent a wide variety of big data and AI workloads • Big Data and AI motifs (Matrix, Sampling, Transform, Graph, Logic, Set, Statistics, Sort), combined with different weights, form diverse big data and AI workloads • Coverage of fundamental units of computation • Provide the methodology of choosing typical workloads • Reduce workload redundancy
Data Motifs' Differences from Kernels • Behaviors are affected by the sizes, patterns, types, and sources of different data inputs • Reflect not only computation patterns and memory access patterns, but also disk and network I/O patterns
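A small, deterministic way to see that the same motif behaves differently on different input patterns is to count operations rather than measure time. The sketch below uses insertion sort as a stand-in for the sort motif; that algorithm choice and the two input patterns are assumptions made for illustration, not something the slides prescribe.

```python
def insertion_sort_comparisons(values):
    """Sort a copy with insertion sort and count element comparisons."""
    items = list(values)
    comparisons = 0
    for i in range(1, len(items)):
        key = items[i]
        j = i - 1
        while j >= 0:
            comparisons += 1
            if items[j] <= key:
                break  # key is already in place relative to items[j]
            items[j + 1] = items[j]  # shift the larger element right
            j -= 1
        items[j + 1] = key
    return items, comparisons

n = 200
_, already_sorted = insertion_sort_comparisons(range(n))
_, reverse_sorted = insertion_sort_comparisons(range(n, 0, -1))
```

With the same input size and the same code, the already-sorted input needs n - 1 comparisons while the reverse-sorted input needs n(n - 1)/2, so the data pattern alone changes the work done by two orders of magnitude here.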
Domain-specific Hardware and Software Co-design • Ad-hoc solution • Case by case • Structured solution • Tailoring the system and architecture to characteristics of data motifs • New architecture/accelerator design • Data motif-based libraries • Bottleneck identification and optimization
Scalable Benchmark Methodology • Traditional: create a benchmark or proxy for every possible workload • Ours: data motif-based (scalable) • Micro Benchmark: single data motif • Component Benchmark: data motif combination with different weights • Application Benchmark: end-to-end application
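The step from micro benchmarks to a component benchmark can be sketched as follows. Everything named here is a hypothetical stand-in: the `MICRO` table, the three motif functions, the weights, and the rule of turning weights into run counts are assumptions for illustration and do not describe how BigDataBench or AIBench are actually built.

```python
import statistics

# Micro benchmarks: one function per data motif (illustrative stand-ins).
MICRO = {
    "sort": lambda xs: sorted(xs),
    "statistics": lambda xs: statistics.mean(xs),
    "set": lambda xs: set(xs) & set(xs[: len(xs) // 2]),
}

def component_benchmark(weights, data, total_runs=20):
    """Run each motif a number of times proportional to its weight.

    Returns the number of runs executed per motif.
    """
    total = sum(weights.values())
    runs = {}
    for motif, weight in weights.items():
        count = round(total_runs * weight / total)
        for _ in range(count):
            MICRO[motif](data)
        runs[motif] = count
    return runs

# A hypothetical component benchmark dominated by sorting.
runs = component_benchmark({"sort": 6, "statistics": 3, "set": 1},
                           data=list(range(1000, 0, -1)))
```

Changing only the weight dictionary yields a different component benchmark from the same small set of micro benchmarks, which is the scalability argument of the slide.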
Data Motif-based Proxy Benchmarks • A DAG-like combination of data motifs • An auto-tuning tool using a machine learning model • Mimic system and micro-architectural behaviors
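A DAG-like combination of motifs can be sketched with the standard-library `graphlib` module (Python 3.9+). The node functions, the edge set, and the `run_dag` helper below are hypothetical examples; the auto-tuning tool and the matching of micro-architectural behavior mentioned above are not modeled here.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

def run_dag(nodes, deps, source):
    """Execute motif nodes in dependency order.

    nodes: name -> function taking a list of inputs
    deps:  name -> set of predecessor names
    Nodes without predecessors receive [source] as their input list;
    other nodes receive their predecessors' results, ordered by name.
    """
    results = {}
    for name in TopologicalSorter(deps).static_order():
        inputs = [results[d] for d in sorted(deps.get(name, ()))] or [source]
        results[name] = nodes[name](inputs)
    return results

# Hypothetical proxy: sample the data, then sort and average it in
# parallel branches, then check that the mean lies within the sorted range.
nodes = {
    "sampling": lambda ins: ins[0][::2],
    "sort": lambda ins: sorted(ins[0]),
    "statistics": lambda ins: sum(ins[0]) / len(ins[0]),
    "logic": lambda ins: ins[0][0] <= ins[1] <= ins[0][-1],
}
deps = {
    "sampling": set(),
    "sort": {"sampling"},
    "statistics": {"sampling"},
    "logic": {"sort", "statistics"},
}

results = run_dag(nodes, deps, [5, 3, 8, 1, 9, 2])
```

In this sketch the graph structure is fixed by hand; in a tuned proxy the structure, motif parameters, and weights would be the knobs an auto-tuner adjusts.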
BenchCouncil Benchmarks • http://www.benchcouncil.org/benchmarks.html • BigDataBench • AIBench • HPC AI500 • Edge AIBench • AIoT Bench • BENCHCPU