
PyMCT and PyCPL: Refactoring CCSM Using Python


  1. PyMCT and PyCPL: Refactoring CCSM Using Python. Michael Tobis (1), Michael Steder (1), Robert L. Jacob (2,3), Raymond T. Pierrehumbert (1), Everest T. Ong (4), and J. Walter Larson (2,3,5,6). Affiliations: (1) Department of Geosciences, University of Chicago; (2) Mathematics and Computer Science Division, Argonne National Laboratory; (3) Computation Institute, University of Chicago; (4) Department of Atmospheric and Oceanic Sciences, University of Wisconsin; (5) ANU Supercomputer Facility, The Australian National University; (6) Australian Partnership for Advanced Computing (APAC). Presented at the Python Workshop at the Biennial Computational Techniques and Applications Conference (CTAC ’06), Townsville, Queensland, Australia, 6 July 2006.

  2. Overview • Climate Models and the Parallel Coupling Problem • More specifically, CCSM, MCT, and CPL6 • Enter Python • More than you probably want to know about MCT • Python Bindings for MCT, a.k.a. pyMCT (plus example) • Re-implementation of CPL6, a.k.a. pyCPL • Conclusions + Future Work CTAC ’06 Python Workshop 2

  3. ACT I: Parallel Coupled Models and CCSM

  4. Schematic (Directed Graph) for the Climate System. [Figure: directed graph with nodes ATM, SEA-ICE, OCN, and LAND.] Nodes represent subsystem models. Arcs represent data to be delivered from source to target (states and fluxes).

  5. Complexity Barriers. The traditional approach was: • Model the individual subsystems (atmosphere, ocean, sea-ice, and land) in isolation • Idealize interactions with the outside world through prescription (e.g., climatological or time-series data) or simplified physics (e.g., bucket hydrology, swamp or mixed-layer ocean). Why? Three complexity barriers to overcome: • Knowledge, a consequence of specialization; overcome through interdisciplinary teams • Computational, i.e., getting all the math done; overcome by faster processors, better algorithms, and parallel computing • Software: build system, language barriers, interactions between physics packages (very important here)

  6. One of Life’s Little Ironies... • The solution to one problem can create a new problem • In this case, parallel computing in the form of message-passing parallelism (let’s face it, MPI is pretty much the standard) helps surmount the computational complexity barrier • But distributed-memory parallelism complicates interaction between coupled models, because the data they exchange are distributed across processes • This is the Parallel Coupling Problem

  7. Parallel Coupling Problem. Given: $N$ mutually interacting models $(C_1, C_2, \ldots, C_N)$, each of which may employ message-passing parallelism. Goal: Build an efficient parallel coupled model. Aspects of the problem: • Architecture • Parallel data processing • Environment • Language barriers • Build issues. Often viewed simplistically as the “M x N” data transfer problem.
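
To make the “M x N” view concrete, here is a minimal sketch (not MCT or CPL6 code) of the bookkeeping behind such a transfer: given which process owns each global point on the sending side (M processes) and on the receiving side (N processes), it works out which points each sender/receiver pair must exchange. The function name `build_transfer_schedule` and the ownership lists are hypothetical, and no actual message passing is performed.

```python
def build_transfer_schedule(src_owner, dst_owner):
    """Group global point indices by (sending rank, receiving rank).

    src_owner[g] is the rank that owns global point g in the source model;
    dst_owner[g] is the rank that owns the same point in the target model.
    """
    if len(src_owner) != len(dst_owner):
        raise ValueError("both decompositions must cover the same points")
    schedule = {}
    for g, (s, d) in enumerate(zip(src_owner, dst_owner)):
        schedule.setdefault((s, d), []).append(g)
    return schedule


# Eight points: a block decomposition over M = 2 source ranks and a
# cyclic decomposition over N = 3 target ranks (illustrative values only).
src_owner = [0, 0, 0, 0, 1, 1, 1, 1]
dst_owner = [0, 1, 2, 0, 1, 2, 0, 1]
schedule = build_transfer_schedule(src_owner, dst_owner)
for (s, d), points in sorted(schedule.items()):
    print(f"source rank {s} -> target rank {d}: points {points}")
```

In a real coupled model each entry of the schedule would become one message between two processes, which is why the slide calls the pure data-transfer view simplistic: it says nothing about scheduling, interpolation, or build issues.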

  8. Architectural Aspects. Two sources that shape parallel coupled models: • Science of the system under study: connectivity (who talks to whom?); coupling event scheduling (e.g., periodic?); domain overlap (lower-dimensional vs. colocation); timescale separation/interaction & domain overlap; tightness • Implementation choices: resource allocation; scheduling of model execution; number of executable images; mechanism. This presents a large and difficult-to-analyze decision space (see Larson’s talk from CTAC ’06).

  9. Parallel Data Processing • Description of data to be exchanged during coupling • Physical fields/variables • Mesh or representation associated with the data • Domain decomposition • Transfer of data, a.k.a. the MxN problem • Transformation of data • Intermesh interpolation/transformation between representations (and associated conservation issues) • Time transformation • Diagnostic/variable transformations • Merging of data from multiple sources
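
One common way to express intermesh interpolation is as a sparse matrix applied to the source field. The sketch below assumes that formulation and stores the matrix as (row, column, weight) triples. The function name `apply_sparse_matrix` and the weights are hypothetical, chosen only to illustrate the idea. This is not the MCT SparseMatrix interface.

```python
def apply_sparse_matrix(rows, cols, weights, src_field, n_dst):
    """Interpolate src_field onto n_dst target points.

    Each triple (rows[k], cols[k], weights[k]) contributes
    weights[k] * src_field[cols[k]] to target point rows[k].
    """
    dst_field = [0.0] * n_dst
    for r, c, w in zip(rows, cols, weights):
        dst_field[r] += w * src_field[c]
    return dst_field


# Four fine-mesh points averaged pairwise onto two coarse-mesh points.
rows = [0, 0, 1, 1]
cols = [0, 1, 2, 3]
weights = [0.5, 0.5, 0.5, 0.5]
coarse = apply_sparse_matrix(rows, cols, weights, [1.0, 3.0, 5.0, 7.0], n_dst=2)
print(coarse)
```

The conservation issues mentioned on the slide show up here as constraints on the weights: whether a quantity such as a flux is conserved depends on how the weights are built, not on the multiply itself.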

  10. CCSM • CCSM := Community Climate System Model • Four physical components: Atmosphere, Ocean, Sea-Ice, and Land-Surface models • One coupler component • Together, they comprise a hub-and-spokes architecture (see box at right) • As implemented, CCSM is an application-specific software framework

  11. CCSM’s Coupler CPL6. CPL6 enables one to do, with aplomb, some neat things to CCSM: • Modification of fields under exchange via “Configurable Model Substitution Plugs”

  12. This is Great...But... • Alteration of CCSM’s architecture is more difficult, and one must re-code the coupler MAIN in terms of the bits and pieces from the CPL6 toolkit+library, in Fortran :-( • Number of components • Changes in scientific details of the couplings beyond the simplest alterations • e.g., the current take-it-or-leave-it temporal advance coupling, nothing nearly as sophisticated as predictor/corrector. But this is exactly what many scientists want to do!

  13. CCSM’s Fortran Software Stack

  14. Attempted Object-Oriented Programming in Fortran...or, Don’t Try This at Home...

  15. OO Programming in Fortran: Where’s the Swindle? • The introduction of Derived Types and Explicit Interfaces in the Fortran90 standard has allowed one to implement some OO concepts (see Decyk et al., 1996 and 1997) • Encapsulation and Information Hiding • Inheritance • Polymorphism • But one must implement them oneself: they are not available as part and parcel of the language, and the technique used is a design pattern called delegation • The 2003 Fortran Standard introduces the notion of a Class, but when this feature will be ubiquitous in compilers is unclear

  16. MCT is a Collection of Fortran Datatypes... [Figure: the MCT datatypes SparseMatrixPlus, Rearranger, GeneralGrid, Accumulator, SparseMatrix, Router, AttrVect, GlobalSegMap, and MCTWorld, keyed by role: Data Description, Data Transfer, Data Transformation.] ...and a comprehensive set of support routines (a.k.a. methods)

  17. Seven the Hard Way • MCT has seven classes that are linked via a class hierarchy • MCT is an attempt to follow OO discipline as closely as possible in a Fortran context, i.e., to provide a comprehensive set of support methods for each class • Inheritance was implemented the hard way, by hard-coding these relationships (i.e., delegation) • This is the primary reason that MCT is such a small set of classes, and why the datatypes we will encounter in CPL6 are not in MCT
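
The contrast between delegation and language-level inheritance can be sketched in Python. This is an illustration under stated assumptions, not MCT or pyCPL code: the class names `FieldStore`, `DelegatingAccumulator`, and `InheritingAccumulator` are hypothetical. The delegating version has to forward every method of the wrapped object by hand, which is the cost the slide refers to.

```python
class FieldStore:
    """Holds named fields defined over the same set of points."""

    def __init__(self, names, npoints):
        self._data = {name: [0.0] * npoints for name in names}

    def field_names(self):
        return list(self._data)

    def get(self, name):
        return self._data[name]


class DelegatingAccumulator:
    """Delegation: owns a FieldStore and forwards each call explicitly."""

    def __init__(self, names, npoints):
        self._store = FieldStore(names, npoints)

    def field_names(self):  # hand-written forwarding method
        return self._store.field_names()

    def get(self, name):  # hand-written forwarding method
        return self._store.get(name)

    def accumulate(self, name, values):
        target = self._store.get(name)
        for p, v in enumerate(values):
            target[p] += v


class InheritingAccumulator(FieldStore):
    """Inheritance: field_names() and get() come for free."""

    def accumulate(self, name, values):
        target = self.get(name)
        for p, v in enumerate(values):
            target[p] += v


for cls in (DelegatingAccumulator, InheritingAccumulator):
    acc = cls(["heat_flux"], npoints=2)
    acc.accumulate("heat_flux", [1.0, 2.0])
    acc.accumulate("heat_flux", [1.0, 2.0])
    print(cls.__name__, acc.get("heat_flux"))
```

Both classes behave identically from the caller's side. The difference is that every new method on `FieldStore` needs a matching forwarding method in the delegating class, whereas the inheriting class picks it up automatically.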

  18. CPL6 Fortran Datatypes. Looks Like a Job for Python!

  19. The Case for Python • Python offers true OO programming, so the CPL6 classes are far easier to implement • Most of the complexity of the coupler is at initialization time, so the scripting penalty is irrelevant • Most of the coupler loop is idle; calculations can be farmed out to a compiled language ('Numeric' or 'numpy' modules) • The coupler itself is approximately 2 kLOC of Fortran, so re-implementation in Python should be doable • Realization of the long-term goal of roll-your-own/disposable couplers

  20. PyCCSM/PyCPL/PyMCT Parts List • Python bindings for MCT, a.k.a. pyMCT • Python-based reimplementation of Multi-Process Handshaking utilities (MPH) • Implementation of CPL6 datatypes in Python (as true classes), a.k.a. pyCPL • Python implementation of a specific coupler + legacy component models = pyCCSM

  21. ACT II: MCT

  22. Q. Why go into so much detail about MCT? A. Because PyMCT is the clear non-climate spin-off product from this project

  23. MCT’s Universe of Discourse • Support coupling of MPI-based MPP models • Data transfer using a peer communication model • Description of physical meshes and associated field data through linearization • Data transfer and transformation are viewed as multi-field, pointwise operations • We leave numerous high-level operations to the user’s discretion (e.g., choice of linearization and interpolation schemes), while concentrating on automation of complex (but important!) low-level operations
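
As an illustration of what “multi-field, pointwise” can mean in practice, the sketch below merges several fields from two sources using per-location weights, handling every location independently and every field with the same weights. It is an assumption-laden example, not MCT code: the function name `merge_pointwise`, the field names, and the area fractions are all hypothetical.

```python
def merge_pointwise(sources, fractions):
    """Merge several fields from several sources, one location at a time.

    sources:   {source_name: {field_name: [value per location]}}
    fractions: {source_name: [weight per location]}
    """
    source_names = list(sources)
    field_names = list(sources[source_names[0]])
    npoints = len(fractions[source_names[0]])
    merged = {field: [0.0] * npoints for field in field_names}
    for p in range(npoints):        # pointwise: each location on its own
        for field in field_names:   # multi-field: same weights for all fields
            merged[field][p] = sum(
                fractions[s][p] * sources[s][field][p] for s in source_names
            )
    return merged


ocean = {"heat_flux": [10.0, 20.0], "stress": [1.0, 2.0]}
ice = {"heat_flux": [30.0, 40.0], "stress": [3.0, 4.0]}
fractions = {"ocean": [0.75, 0.5], "ice": [0.25, 0.5]}
merged = merge_pointwise({"ocean": ocean, "ice": ice}, fractions)
print(merged)
```

Because nothing in the loop depends on a point's neighbours or on the mesh dimension, the same routine works for any grid once its points have been linearized.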

  24. Coupling as Peer Communication • MCT’s organizing principle is the component model, or component (not the same as in CORBA, CCA, ESMF, or JavaBeans) • An MCT component is merely a model that is part of the larger system and participates in coupling • In MCT, components interact directly as peers • The user codes these connections into the model source

  25. Linearization of Multi-Dimensional Space • Linearization (first used in MxN schemes by UMD’s METACHAOS) is the mapping from an n-tuple index space to a single global location index • This approach allows for a single representation of grids/arrays of arbitrary dimension
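
A minimal sketch of one possible linearization follows, assuming a Cartesian grid and row-major ordering. The function names `linearize` and `delinearize` are hypothetical. MCT leaves the choice of linearization to the user, so this is only one valid scheme.

```python
def linearize(index, shape):
    """Map an n-tuple index to a single global location index (row-major)."""
    if len(index) != len(shape):
        raise ValueError("index and shape must have the same length")
    location = 0
    for i, extent in zip(index, shape):
        if not 0 <= i < extent:
            raise IndexError(f"index {i} out of range for extent {extent}")
        location = location * extent + i
    return location


def delinearize(location, shape):
    """Inverse of linearize: recover the n-tuple from the location index."""
    index = []
    for extent in reversed(shape):
        location, i = divmod(location, extent)
        index.append(i)
    return tuple(reversed(index))


shape = (2, 3, 4)  # a small 3-D grid, chosen only for illustration
loc = linearize((1, 2, 3), shape)
print(loc, delinearize(loc, shape))
```

The same two functions work unchanged for 1-D, 2-D, or higher-dimensional grids, which is the point of the slide: after linearization, every grid looks like a one-dimensional list of locations.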

  26. Index Space • Definition: An $M$-dimensional index space is a subset of $\mathbb{Z}^M$, each element of which can be uniquely identified by an $M$-tuple of integers $(i_1, i_2, \ldots, i_k, \ldots, i_M)$ • This is what we normally use to describe Cartesian meshes and associated field data stored in multidimensional arrays
