HPC-SIG Ecosystem Validation
Jan. 14, 2019
Baptiste Gerondeau, Renato Golin
HPC-SIG Lab and Validation Matrix
Aggregate machines in the same infrastructure, and validate their performance using a Validation Matrix.
The Validation Matrix must be applicable to every machine.
● Validation Matrix dimensions are software configurations
● To generate as few tests as possible, we need to simplify the matrix without losing information
For more info visit linaro.org/hpc
HPC-SIG Lab’s Infrastructure
The infrastructure needs to:
● Dispatch jobs (tests, provisioning, benchmarks)
● Provide DHCP/TFTP services
● Provide Package Cache services
● Provide a secure file/results storage service
● Be low maintenance
● Be replicable anywhere else
Simplifying Infrastructure
Identifying the different dimensions
A Vertical Slice of the Stack
Principal dimensions:
➔ Application
➔ HPC environment stack
➔ Machine provisioning
● HPC Stack: OpenHPC
● Validation Application: OpenHPC’s test suite
Simplifying Infrastructure
Identifying the different dimensions
The Stack from the Lab’s point of view
Machine provisioning:
➔ Network configuration
➔ Kernel
➔ OS
➔ HPC Stack
● Multiple ways to do the provisioning
Simplifying Infrastructure
Identifying the different dimensions
Provisioning Method Variations
Multiple ways to provision:
➔ Warewulf Stateless (VNFS)
➔ Warewulf Stateful (OS image)
➔ Ansible
Simplifying Infrastructure
Identifying the different dimensions
Different Network Layouts
● Flat: Machines reachable from anywhere
● Tree: Machines reachable from cluster head node only
● Root: Master with DHCP/TFTP server
Simplifying Infrastructure
Identifying the different dimensions
Different Kernels
● Upstream from OS
● ERP: Enterprise Reference Platform
● Contains support for platforms in the process of being upstreamed
Simplifying Infrastructure
Identifying the different dimensions
Different Operating Systems
● 3 OSes available to the user
● No Debian support in OpenHPC
Simplifying Infrastructure
Abstractions, and the user’s environment
Abstracting Network Variations
● Invisible to the user
● Handled by the lab installer
● Dependent on hardware
Simplifying Infrastructure
Abstractions, and the user’s environment
Abstracting Provisioning Variations
● Multi-staged provisioning
● Coexistence
● Dependent on hardware
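The slides treat network layout and provisioning method as dependent on hardware, which suggests one way to shrink the matrix: pin those dimensions per machine and only vary what remains. The following is a minimal sketch under that assumption, not the lab's actual tooling. The OS names and machine names are placeholders (the slides only say "3 OSes"), and the pinned values per machine are invented for illustration.

```python
from itertools import product

# Dimension values named on the slides. The three OS names are placeholders,
# because the slides only state that 3 OSes are available.
DIMENSIONS = {
    "provisioning": ["warewulf-stateless", "warewulf-stateful", "ansible"],
    "network": ["flat", "tree", "root"],
    "kernel": ["upstream", "erp"],
    "os": ["os-a", "os-b", "os-c"],
}

# Hypothetical machines: each one fixes the dimensions that depend on hardware.
HARDWARE_FIXED = {
    "machine-1": {"provisioning": "warewulf-stateless", "network": "tree"},
    "machine-2": {"provisioning": "ansible", "network": "flat"},
}


def full_matrix(dimensions):
    """Return every combination of dimension values as a list of dicts."""
    names = list(dimensions)
    return [dict(zip(names, combo)) for combo in product(*dimensions.values())]


def matrix_for(machine, dimensions, fixed):
    """Return the cells for one machine, varying only the dimensions it does not pin."""
    pinned = fixed[machine]
    free = {name: values for name, values in dimensions.items() if name not in pinned}
    return [{**pinned, **cell} for cell in full_matrix(free)]


if __name__ == "__main__":
    print(len(full_matrix(DIMENSIONS)))                             # 54
    print(len(matrix_for("machine-1", DIMENSIONS, HARDWARE_FIXED)))  # 6
```

With these example values the naive cross product has 3 x 3 x 2 x 3 = 54 cells, while each machine only needs the 2 x 3 = 6 kernel and OS combinations.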
Simplifying Infrastructure
Abstractions, and the user’s environment
Abstracting Environment Variations
● Control over HPC Stack
● Common OS configuration
● Idempotency
● Package Caches
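Idempotency here can be read as: re-running a configuration step on an already configured machine changes nothing. The sketch below illustrates that property in plain Python with a hypothetical helper, `ensure_line`, and a made-up package-cache setting. It is an illustration of the idea only, not how the lab's Ansible recipes are written.

```python
import tempfile
from pathlib import Path


def ensure_line(path, line):
    """Append `line` to the file unless it is already present.

    Returns True if the file was changed, False if it was already in the
    desired state, so a second run is a no-op.
    """
    path = Path(path)
    existing = path.read_text().splitlines() if path.exists() else []
    if line in existing:
        return False
    existing.append(line)
    path.write_text("\n".join(existing) + "\n")
    return True


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        conf = Path(tmp) / "pkg.conf"
        # Hypothetical cache setting, used only to have something to write.
        setting = "proxy=http://cache.example:3142"
        print(ensure_line(conf, setting))  # True: first run changes the file
        print(ensure_line(conf, setting))  # False: second run changes nothing
```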
Simplifying Infrastructure
Abstractions, and the user’s environment
Accounting for extra HPC services
● InfiniBand support
● Lustre server support
● Future additional features (additional hardware)
Simplifying Infrastructure
What the user sees and configures
The Lab’s Interface
➔ Choose Application
❖ Lab picks default configuration
❖ User fine-tunes configuration
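The interface described above (choose an application, take the lab's defaults, then fine-tune) can be sketched as a defaults table merged with user overrides. Everything named here is an assumption made for illustration: the application keys, the configuration fields, the default values, and the `resolve` function are not taken from the lab's code.

```python
# Hypothetical per-application defaults; keys and values are illustrative only.
DEFAULTS = {
    "cluster-deployment": {"os": "os-a", "kernel": "upstream", "nodes": 2},
    "toolchain-benchmarking": {"os": "os-a", "kernel": "erp", "nodes": 1},
}


def resolve(application, overrides=None):
    """Return the lab's default configuration with the user's overrides applied."""
    if application not in DEFAULTS:
        raise KeyError(f"unknown application: {application}")
    defaults = DEFAULTS[application]
    overrides = overrides or {}
    unknown = set(overrides) - set(defaults)
    if unknown:
        # Reject settings the lab does not expose rather than silently accepting them.
        raise ValueError(f"unsupported settings: {sorted(unknown)}")
    return {**defaults, **overrides}


if __name__ == "__main__":
    print(resolve("cluster-deployment"))
    print(resolve("cluster-deployment", {"kernel": "erp"}))
```

Rejecting unknown keys keeps the user's choices inside the set of configurations the lab can validate.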
Validation matrix
Cluster Deployment
Validation matrix
Distributed Applications Enablement
Validation matrix
Toolchain Benchmarking
Validation matrix
Library Enablement and Enhancement
Future
● Vendors to rely on Linaro for base OSS validation
○ We have multiple vendors available
○ On a standardised infrastructure
● Share our work
○ OpenHPC Ansible recipes (with the OpenHPC community)
○ SDI (MrP, Jenkins, Ansible) helping members to replicate our work
○ Community CI (OpenHPC test-suite, MPI MTT, OpenMP tests, OpenBLAS CI)
● Allow our engineers to develop the ecosystem
○ Internal tests and benchmarks (via Jenkins, no infrastructure knowledge needed)
○ Testing new packages, libraries, compilers (comparison jobs, CI results, statistical analysis)
HPC Lab Setup: https://github.com/Linaro/hpc_lab_setup
Ansible OpenHPC installation recipe: https://github.com/Linaro/ansible-playbook-for-ohpc
Thanks!