Jenkins + CVMFS: Distributed Development, Centralised Delivery
Bruce Becker | bbecker@csir.co.za
Coordinator: SAGrid SANREN, Meraka Institute, CSIR
Outline
● What users want
● SAGrid VO – a catch-all VO with many applications
● Problem statements:
  ● Problem 1: “the usual problem” – maintaining applications in a distributed computing environment
  ● Problem 2: “another usual problem” – maintaining a complex application inventory
● General solution: CVMFS + Jenkins
● Some specifics of SAGrid CI platform
● Outlook
Bruce Becker: Coordinator, SAGrid | bbecker@csir.co.za | http://www.sagrid.ac.za
SAGrid as a catch-all VO
● The South African National Grid operates a catch-all VO which all South African researchers can use to access computing and data resources.
● SAGrid VO is not a domain-specific VO, so
  ● the applications supported by this VO have several widely varying uses
  ● applications are requested by users or communities themselves
What users want
Some users want a highly varied, modular application selection
[Figure labels: Amazing infrastructure; Highly trained support; Vertically integrated; Highly specialised applications]
What users get sometimes
The problem (1) – “the usual problem”
● Software distribution was done mostly “by hand”:
  ● Someone from the ops team develops a script to install the application
  ● Apps installed via job submission
  ● Tags applied via script or by the job itself
● Issues:
  ● Major overhead of work
  ● Inconsistent installation procedures between applications and sites
  ● Bottleneck in porting applications (has to be done by someone in the VO)
  ● Duplication of effort, especially in dependencies of applications
  ● Difficult to manage application lifecycles
The problem (2) – what about the community?
● Managing the inventory in a catch-all VO can be complex when there are many applications
● Prioritising porting requests depends on the knowledge of the expert porting the application
● Can lead to major delays in porting and deploying applications
● However, a user or community usually has an expert who knows how to tune, port and configure the application properly, as well as its dependencies
● Usually, “they” have to conform to “us”: learn grid tools and terminology, etc.
Problem (3): Changes to the playing field
● New middleware stacks
● New architectures – GPGPU, ARM
Questions to answer
● How do we lower the barrier to entry to the grid or cloud infrastructure?
● How can the application expert prove to the resource provider that the application will actually run on the execution environment of the site?
● How can we manage the lifecycle of applications across multiple versions, architectures and configurations?
● How can we ensure that once applications are “certified”, they are actually available on as many sites as possible?
General Solution: Jenkins + CVMFS
● The issues outlined are “typical” in a large software project
● Usually solved by judicious use of a Continuous Integration system
● Once applications have been “ported”, put them into a trusted repository
  ● Previously – built RPMs, but these required site-admin intervention
  ● One-time configuration with CVMFS
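As a rough illustration of what "one-time configuration" might look like on a worker node, a CVMFS client is typically pointed at a repository through a handful of settings. The repository name, proxy, server URL and key path below are placeholders chosen for this sketch, not SAGrid's actual values.

```shell
# /etc/cvmfs/default.local (placeholder values, not the real SAGrid settings)
CVMFS_REPOSITORIES="apps.example.org"
CVMFS_HTTP_PROXY="http://squid.example.org:3128"

# /etc/cvmfs/config.d/apps.example.org.conf (placeholder values)
CVMFS_SERVER_URL="http://stratum1.example.org/cvmfs/@fqrn@"
CVMFS_PUBLIC_KEY="/etc/cvmfs/keys/apps.example.org.pub"
```

Once a site has settings like these in place, newly promoted applications should appear under the repository mount point without further site-admin action.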
First, some changes
● Distribute the effort, centralise the tools
● Move repository from “closed” SVN repo – https://ops.sagrid.ac.za/trac/svn/repo
● to git – https://github.com/SAGridOps/SoftwareInstallation
● Don't have to give write access to a single repo, instead accept pull requests
● Take advantage of all the GitHub infrastructure
● Expand possible contributors to those “outside” the infrastructure
● Recognise individuals' contribution
Recognise individuals...
Decentralise the team
Collaborate with code
Let the robots do the work
● Define what we want to deploy – let the experts take care of how to deploy
● DevOps paradigm – same review/tag/release mechanisms on operations code as we have for scientific applications
● Teach a marketable skill
● Allow specialisation
● Enable remote management of complex services
● Ensure that published methodology is adopted methodology
Quality Control and feedback
● Ensure that requested applications are included in the repo
● Provide testing and QA infrastructure
● Self-serve to users
The CI environment
● Jenkins is extremely flexible... can do almost anything
● AuthN/AuthZ
  ● Currently using GitHub OAuth
  ● Take advantage of future Identity Federation
● We wanted to simulate different execution environments
  ● Already in production
  ● Planned for future
● Track and re-use dependencies
Matrix-based builds
● Independent builds and build statuses for different configurations:
  ● Application name
  ● Version
  ● OS
  ● Architecture
  ● … can add specific tuning configurations...
● We can see exactly what's broken where – build more resilient integration code.
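To make the idea of a build matrix concrete, the sketch below enumerates every combination of version, OS and architecture for one application, which is roughly the set of cells a matrix job would build and report on independently. The axis values and the `matrix_cells` helper are hypothetical; the slides do not list the real axes' values.

```shell
# Hypothetical axis values; the slides do not list the real ones.
VERSIONS="2.1.5 3.3.4"
OSES="sl6 u1404"
ARCHES="x86_64"

# Print one line per matrix cell for application $1: name/version/os/arch
matrix_cells() {
  for version in $VERSIONS; do
    for os in $OSES; do
      for arch in $ARCHES; do
        echo "$1/$version/$os/$arch"
      done
    done
  done
}

matrix_cells fftw
```

Each printed line corresponds to one cell with its own build status, so a failure can be traced to a specific version, OS and architecture.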
Typical workflow
[Diagram] Actors: Application developer, Infrastructure expert. Stages: Dev/Stage env., Testing matrix. Steps: reads description of execution environment tests; writes code to pass required tests; defines relevant tests in Jenkins; promotes a build to CVMFS.
Dependency management: simple case
● Common problem with applications: need a specific version of a compiler
● Compiling the compiler can itself be tricky...
● Jenkins tests the full dependency chain necessary
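One way to picture "the full dependency chain" is as a build order in which every package comes after the things it depends on. The sketch below computes such an order with a depth-first walk. The dependency table is an assumption made for illustration (in particular, which packages need `gcc` or `zlib` is a guess, not taken from the slides), and `deps_of` and `resolve` are hypothetical helper names. It assumes the graph has no cycles.

```shell
# Hypothetical dependency table: prints the direct dependencies of package $1.
deps_of() {
  case "$1" in
    gadget) echo "hdf5 fftw openmpi gsl" ;;
    hdf5)   echo "zlib" ;;
    gcc)    echo "" ;;
    *)      echo "gcc" ;;   # assume everything else only needs the compiler
  esac
}

ORDER=""

# Append $1 to ORDER after all of its dependencies (depth-first, post-order).
# Assumes an acyclic graph; a cycle would recurse forever.
resolve() {
  case " $ORDER " in *" $1 "*) return 0 ;; esac   # already scheduled
  for dep in $(deps_of "$1"); do
    resolve "$dep"
  done
  ORDER="${ORDER:+$ORDER }$1"
}

resolve gadget
echo "$ORDER"
```

Building in the printed order means the compiler is built first and the application last, so a broken link anywhere in the chain shows up before the application build starts.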
Real-world application
● GADGET – astrophysics hydrodynamic simulations
● Many (levels of) dependencies
Public Application Dashboard
Authenticated view
Generic build script
# GADGET requires HDF5 FFTW2 ZLIB and openmpi
# Set up the environment
module add ci
module add fftw/2.1.5
module add hdf5
module add openmpi
module add gsl
# Clean build, retrieve dependency artifacts
rm -rf $FFTW_DIR
tar xvfz /repo/$SITE/$OS/$ARCH/fftw/$FFTW_VERSION/build.tar.gz -C /
rm -rf $HDF5_DIR
tar xvfz /repo/$SITE/$OS/$ARCH/hdf5/$HDF5_VERSION/build.tar.gz -C /
rm -rf $OPENMPI_DIR
tar xvfz /repo/$SITE/$OS/$ARCH/openmpi/$OPENMPI_VERSION/build.tar.gz -C /
rm -rf $GSL_DIR
tar xvfz /repo/$SITE/$OS/$ARCH/gsl/$GSL_VERSION/build.tar.gz -C /
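The script above repeats the same two steps for each dependency, so it could plausibly be factored into a small helper. The following is only a sketch: `artifact_path` and `restore_dep` are hypothetical names, the default values for `SITE`, `OS` and `ARCH` are placeholders, and the empty-directory guard is an addition that the original script does not have.

```shell
# Placeholder defaults for illustration; on a build slave these would come from the job.
SITE="${SITE:-generic}"
OS="${OS:-sl6}"
ARCH="${ARCH:-x86_64}"

# Print the artifact path for dependency $1 at version $2,
# following the /repo layout used in the script above.
artifact_path() {
  echo "/repo/$SITE/$OS/$ARCH/$1/$2/build.tar.gz"
}

# Remove any previous install of a dependency and unpack its build artifact.
# Usage: restore_dep <name> <version> <install_dir>
restore_dep() {
  # Guard (not in the original script): refuse to continue with an empty install dir.
  [ -n "$3" ] || { echo "empty install dir for $1" >&2; return 1; }
  rm -rf "$3"
  tar xzf "$(artifact_path "$1" "$2")" -C /
}
```

With these helpers, each pair of lines in the original script would reduce to a call such as `restore_dep fftw "$FFTW_VERSION" "$FFTW_DIR"`.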