SSI: Overview of Simulation Software Infrastructure for Large Scale Scientific Applications
Akira Nishida
Department of Computer Science, University of Tokyo
JST CREST
98th IPSJ SIGHPC Meeting
Motivation
- Emergence of large scale scientific simulations in various fields
- Development of numerical libraries in Japan
  - Mainly developed in supercomputing centers (on mainframes and vector supercomputers) in the 1980s
  - Cooperation with vendors (e.g. Fujitsu SSL II)
- Development in the US
  - ScaLAPACK (with BLAS and LAPACK), PETSc, Aztec, etc.
  - Developed and used in national laboratories
  - Standardized and modularized
  - Run on parallel computing environments
  - Distributed via WWW (netlib etc.) since the 1990s (also mirrored in Japan)
- Demand for reliable and portable parallel numerical libraries as social infrastructure
Brief History of Basic Numerical Libraries
- Projects in US and Europe
  - NATS (National Activity to Test Software) Project by NSF started in 1970
  - EISPACK (1972) and LINPACK (1978)
  - Standardization of level 1 BLAS (Basic Linear Algebra Subprograms) in 1979
  - Development of LAPACK, LAPACK2, and ScaLAPACK by NSF and DARPA during 1987-1995
  - PARASOL (An Integrated Programming Environment for Parallel Sparse Matrix Solvers) since 1996
  - SciDAC (Scientific Discovery through Advanced Computing) Program started in 2001 by DoE (development of hardware/software infrastructure for terascale computing)
Brief History of Basic Numerical Libraries (2)
- Projects in Japan
  - Basic numerical libraries
    - Internal use in national supercomputing centers
      - Program System for Statistical Analysis with Least-Squares Fitting (T. Nakagawa and Y. Oyanagi et al., 1976-1982)
    - Offline distribution
      - A series of books by K. Murata, T. Oguni, H. Hasegawa published by Maruzen Co., Ltd. with floppy disks
    - No major national projects for development of parallel numerical libraries
  - Parallel processing
    - Real World Computing (RWC) Project by MITI (M. Sato, Y. Ishikawa, T. Kudo et al.)
      - OBPLib: Object-oriented library for scientific computing on distributed memory architectures
      - Omni OpenMP Compiler: Free OpenMP compiler for shared memory parallel architectures
Features of the Project
- Started in Nov. 2002 as a $2M, 5-year national project
- Complete survey of domestic and overseas research projects
  - Cooperation with other projects
  - Investigate problems with existing libraries
  - Refinement of software specification
- Development
  - Select and evaluate target architectures (need to predict the mainstream in 2007)
  - Fast prototyping of core components (need feedback)
  - Start with replacement of original libraries used in real applications
  - Primary targets:
    - Parallel eigensolvers
      - QR algorithms (general purpose, real/complex, symmetric/non-symmetric)
      - Lanczos/Arnoldi, Davidson methods (selected eigenpairs for physical applications)
    - Parallel linear solvers
      - Direct solvers (general purpose, real/complex, symmetric/non-symmetric, dense/band/sparse)
      - Iterative solvers (for FDM and FEM)
    - Parallel fast integral transforms
      - Fast Fourier transforms (general purpose)
      - Fast Legendre transform (climate studies) etc.
  - Portable object-oriented implementation
- Distribution
  - Distribution via network
  - Publication of manuals from major publishers
Core Research Fields
- Eigensolvers
  - Akira Nishida (Tokyo Univ.)
    - Eigensolvers for large sparse eigenproblems and their parallelization.
- Linear solvers
  - Hidehiko Hasegawa (Tsukuba Univ.)
    - Development of direct/iterative linear solvers
  - Shao-Liang Zhang (Tokyo Univ.)
    - Studies on iterative solvers. Proposed GPBiCG (product type iterative solver).
  - Kengo Nakajima (RIST)
    - General purpose solver for finite element problems
  - Kuniyoshi Abe (Gifu Shotoku Gakuen Univ.)
    - Joint researcher with S. L. Zhang on product type iterative solvers
  - Shoji Ito (Tsukuba Univ.)
    - Development of direct solvers
  - Koh Hashimoto (Tokyo Univ.)
    - Joint research with S. L. Zhang. Studies on mechanical systems.
  - Akihiro Fujii (Tokyo Univ. Doctoral candidate)
    - Parallel and vector implementation of AMG preconditioned CG method
  - Tomohiro Sogabe (Tokyo Univ. Doctoral candidate)
    - Studies on iterative solvers. Proposed BiCR type method.
Core Research Fields (2)
- Fast integral transforms
  - Reiji Suda (Tokyo Univ.)
    - Fast Legendre transform for spherical climate analysis
  - Daisuke Takahashi (Tsukuba Univ.)
    - Development of optimized parallel FFT
  - Akira Nukada (Tokyo Univ. Doctoral candidate)
    - Development of optimized parallel FFT
- Parallel and distributed portable implementation
  - Akira Nishida
  - Reiji Suda
  - Hidehiko Hasegawa
  - Kengo Nakajima
  - Akira Nukada
  - Akihiro Fujii
  - Yuichiro Hourai (Tokyo Univ. Doctoral candidate)
    - Parallel distributed computation, optimization of broadcast communications on tree-structured networks
Organization
[Organization chart]
- Research groups: Eigensolvers, Linear solvers, Fast integral transforms, Implementation methods
- Computing and networking environment
- Organizations shown in the chart:
  - Tsukuba Univ., Institute of Industrial Science, Japan Meteorological Agency, RIST, Earth Simulator Center, Advancesoft Corp., AIST Grid Research Center (MEXT IT Program), etc.
  - Institute for Solid State Physics, Institute of Medical Science, etc.
Schedule
[Gantt chart: the items below plotted against fiscal years]
Fiscal years: 2002 (5 months), 2003, 2004, 2005, 2006, 2007 (7 months)
- Facilities
- Survey of applications
- Survey of software engineering
- Survey of hardware technologies
- Algorithms
- Programming model
- Implementation and verification
- Tutorials
Target (1): Architectures and Systems
- Survey of trends and direction of hardware technologies
  - Trends of computer architectures
    - Higher density and lower power
      - E.g. IBM Blue Gene/L: 130 thousand CPUs, 180 TFLOPS
      - E.g. Fujitsu BioServer
    - Symmetric multithreading
      - IBM Power, Sun UltraSPARC, Intel Pentium & Itanium, etc.
    - Higher parallelism at every level of architecture
  - Optimizing the performance of the libraries is becoming more important, while designing them is growing more complex
Current Status: Architectures and Systems
- Predict the computing environment to be available in 5 years
  - Up-to-date facilities to be updated every year
- Current facilities of the SSI Project
  - Shared memory programming environment: SGI Altix 3700 (Intel Madison 1.3GHz × 32, Linux OS, 32GB main memory)
  - Vector processing environment: NEC SX-6i
  - Cluster computing environment: Dual Intel Xeon 2.8GHz server × 16, GbE interconnect
  - 10GbE enabled networking environment (Cisco C6509)
  - Most of the major architectures have been covered
- Portability
  - Portability can be tested easily on the SSI environment by the developers
Current Status: Architectures and Systems (2)
[Network diagram]
- Cisco Router C6509, connected to the GbE (→ 10GbE) WAN
- GbE or 10GbE LAN, connected to desktops
- Sun Fire 3800 with Sun StorEdge T3
- SGI Altix 3700
- NEC SX-6i
- HyperTransport and InfiniBand interconnected clusters: Itanium3 Cluster, Opteron Cluster
Current Status: Architectures and Systems (3)
- Shared memory computer SGI Altix 3700
  - Memory bandwidth performance compared with Sun Fire 15k (UltraSPARC III 900MHz × 72, Solaris 8), using the STREAM benchmark with 1.8GB of data
[Figure: two plots of memory bandwidth (MB/s, 0 to 60000) versus number of threads, one panel ranging up to 70 threads and the other up to 30, each with curves for the Copy, Scale, Add, and Triad kernels]
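For readers unfamiliar with the four curve labels, the following is a minimal sketch of what the STREAM kernels compute and how a bandwidth figure in MB/s is derived from bytes moved and elapsed time. It is an illustration only, not the benchmark used for the slide: the function name `stream_kernels`, the helper `_mbps`, the array length, and the initial values are assumptions chosen here, and pure Python lists will not give numbers comparable to the compiled, multithreaded STREAM code.

```python
import time


def _mbps(bytes_moved, seconds):
    """Convert bytes moved and elapsed seconds to MB/s (1 MB = 10**6 bytes)."""
    return bytes_moved / max(seconds, 1e-9) / 1e6


def stream_kernels(n=1_000_000, scalar=3.0):
    """Run each STREAM kernel once (hypothetical helper, for illustration).

    Returns the three arrays and a dict mapping kernel name to MB/s.
    """
    word = 8  # bytes per double-precision element
    a = [1.0] * n
    b = [2.0] * n
    c = [0.0] * n
    bandwidth = {}

    # Copy: c[i] = a[i]  (one read + one write per element)
    t0 = time.perf_counter()
    c = [x for x in a]
    bandwidth["Copy"] = _mbps(2 * word * n, time.perf_counter() - t0)

    # Scale: b[i] = scalar * c[i]  (one read + one write)
    t0 = time.perf_counter()
    b = [scalar * x for x in c]
    bandwidth["Scale"] = _mbps(2 * word * n, time.perf_counter() - t0)

    # Add: c[i] = a[i] + b[i]  (two reads + one write)
    t0 = time.perf_counter()
    c = [x + y for x, y in zip(a, b)]
    bandwidth["Add"] = _mbps(3 * word * n, time.perf_counter() - t0)

    # Triad: a[i] = b[i] + scalar * c[i]  (two reads + one write)
    t0 = time.perf_counter()
    a = [y + scalar * z for y, z in zip(b, c)]
    bandwidth["Triad"] = _mbps(3 * word * n, time.perf_counter() - t0)

    return a, b, c, bandwidth


if __name__ == "__main__":
    _, _, _, bw = stream_kernels()
    for name, value in bw.items():
        print(f"{name:6s} {value:10.1f} MB/s")
```

The real benchmark repeats each kernel several times, runs the loops in parallel across threads, and reports the best time, which is what makes curves like those on the slide a function of the thread count.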
Target (2): Algorithms
- Promotion of fundamental studies
  - Promotion of fundamental studies by the members (research meetings)
  - Provide an up-to-date computing environment for joint researchers
  - Support porting of existing libraries written by the members to the new computing environment
  - Planning to develop new libraries based on the book "Numerical Libraries in Fortran 77" by Hasegawa et al., published by Maruzen Co., Ltd.
  - The NEDO APC automatic parallelizer has been implemented on our environment. It automatically adds OpenMP directives.
- Fast release. Get feedback from beta users
  - A home page http://ssi.is.s.u-tokyo.ac.jp/ has been opened
  - Cooperation with AIST PHASE project http://phase.hpcc.jp/, etc.
- Lightweight libraries with minimum functions for large scale problems
  - Keep a balance between OO overheads and performance
  - OO interface + primitive APIs
- Publish detailed documents
  - Easy to use