(Towards) Jungle Computing with Ibis
Frank J. Seinstra, Jason Maassen, Niels Drost
Computer Systems Group, Department of Computer Science
VU University, Amsterdam, The Netherlands
Jungle Computing (ComplexHPC)
● ‘Worst case’ computing as required by end-users
● Distributed
● Heterogeneous
● Hierarchical (incl. multi-/many-cores)
ComplexHPC Spring School 2011
Why Jungle Computing?
● Scientists are often forced to use a wide variety of resources to solve computational problems
● Prominent causes:
  ● Desire for scalability
  ● Software heterogeneity (e.g. a mix of C/MPI and CUDA)
  ● Distributed nature of (input) data
  ● Ad hoc hardware availability
  ● …
Example Application Domains
● Computational Astrophysics
  ● Example: AMUSE
  ● “Simulating the Universe on an Intercontinental Grid” - Portegies Zwart et al. (IEEE Computer, Aug 2010)
● Climate Modeling
  ● Mixed / multi-model simulations
  ● Atmosphere, ocean, source rock formation, …
  - hardware: (potentially) very diverse
  - high resolution => speed & scalability
  - …
Image / Multimedia Analysis
● Aim:
  ● Automatic extraction of ‘semantic concepts’ from image sets and video streams
● Depending on specific problem & size of data set:
  ● May take hours, days, weeks, months, years…
Image / Multimedia Analysis (2)
● Applications in (among others):
  ● Medical Imaging
  ● Security / Surveillance
  ● Multimedia Systems
  ● Astronomy
  ● Remote Sensing
● Application types:
  ● Real-time vs. off-line
  ● Fine-grained vs. coarse-grained
  ● Data-intensive / compute-intensive / information-intensive
Multimedia Content Analysis
● Need for user-friendly programming tools
  ● Shield domain experts from all complexities of parallel, distributed, heterogeneous, and hierarchical computing
  ● Familiar (sequential) programming model(s)
Solution: a tool that makes parallel & distributed computing transparent to the user
[Diagram: User and Jungle Computing Systems; familiar programming, easy execution]
Example: Color-based Object Recognition by a Grid-connected Robot Dog
Seinstra et al. (IEEE Multimedia, Oct-Dec 2007)
Seinstra et al. (AAAI’07: Most Visionary Research Award)
Successful…
● …but many fundamental problems remain unsolved!
  ● Scaling up to very large systems
  ● Platform independence
  ● Middleware independence
  ● Connectivity (among others: firewalls, …)
  ● Fault-tolerance
  ● …
● Software support tool(s) urgently needed!
  ● Jungle-aware + transparent + efficient
● No progress until ‘discovery’ of Ibis
The Ibis Project
● Offers all functionality to efficiently & transparently implement & run Jungle Computing applications
● Designed for dynamic / hostile environments
● Modular and flexible
  ● Allows replacement of Ibis components by external ones, including native code
● Open source
  ● Download: http://www.cs.vu.nl/ibis/
General Requirements
● Resource independence
● Transparent / easy deployment
● Middleware independence & interoperability
● Jungle-aware middleware
● Jungle-aware communication
● Robust connectivity
● System-support for malleability and fault-tolerance
● Globally unique naming
● Transparent parallelism & application-level fault-tolerance
● Easy integration with external software (legacy codes)
  ● MPI, OpenCL, CUDA, C, C++, scripts, …
Ibis Software Stack
[Diagram: the general requirements mapped onto layers of the Ibis software stack]
● Resource Independence: Java
● Transparent / Easy Deployment
● Transp. Parallelism, Application-level FT, External software: Constellation
● Jungle-aware Communication, Unique naming, Malleability & FT
● Middleware Independence & Interoperability
● Jungle-aware Middleware
● Robust Connectivity
JavaGAT
● Java Grid Application Toolkit
  ● High-level API for developing (Grid) applications independent of the underlying (Grid) infrastructure
  ● Use (Grid) services: file copy, resource discovery, job submission, …
● Overcomes problems, incl.:
  ● Functionality may not work on all sites, or for all users, …
  ● Middleware version differences & complex codes…
● API standardized by OGF
  ● SAGA: Simple API for Grid Applications (among others, with LSU)
  ● SAGA on top of JavaGAT (and vice versa)
● Tutorial by Thilo Kielmann on Thursday, May 12
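One way to picture an API that stays independent of the underlying middleware is a front end that tries several middleware-specific back ends in turn and reports which one worked. The sketch below is only an illustration of that idea in plain Java. `FileCopyAdaptor`, `FixedAdaptor`, and `MiddlewareIndependentCopier` are hypothetical names invented here, and none of this is the JavaGAT API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical interface: one implementation per middleware (not JavaGAT's API).
interface FileCopyAdaptor {
    String name();
    void copy(String source, String destination) throws Exception;
}

// Stand-in adaptor that either succeeds or fails, used only for the demo.
class FixedAdaptor implements FileCopyAdaptor {
    private final String name;
    private final boolean works;

    FixedAdaptor(String name, boolean works) {
        this.name = name;
        this.works = works;
    }

    public String name() { return name; }

    public void copy(String source, String destination) throws Exception {
        if (!works) {
            throw new Exception("not supported on this site");
        }
        // A real adaptor would call the middleware here.
    }
}

// Front end: the application calls copy() and never names a middleware.
class MiddlewareIndependentCopier {
    private final List<FileCopyAdaptor> adaptors;

    MiddlewareIndependentCopier(List<FileCopyAdaptor> adaptors) {
        this.adaptors = adaptors;
    }

    // Tries each adaptor in order; returns the name of the first that succeeds.
    String copy(String source, String destination) {
        List<String> failures = new ArrayList<>();
        for (FileCopyAdaptor adaptor : adaptors) {
            try {
                adaptor.copy(source, destination);
                return adaptor.name();
            } catch (Exception e) {
                failures.add(adaptor.name() + ": " + e.getMessage());
            }
        }
        throw new IllegalStateException("no adaptor could copy " + source + " " + failures);
    }
}
```

With a failing adaptor listed before a working one, `copy` falls through to the second and returns its name; the application code is the same in both cases.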
Zorilla
● A prototype P2P middleware
  ● A Zorilla system consists of a collection of nodes, connected by a P2P network
  ● Each node is independent & implements all middleware functionality
  ● No central components
  ● Supports fault-tolerance and malleability
  ● Easily combines resources in multiple administrative domains
IbisDeploy
Ibis Portability Layer (IPL)
● Java-centric ‘run-anywhere’ communication library
  ● Sent along with your application
  ● “MPI for the Grid” (quote 2005)
● Supports fault-tolerance and malleability
  ● Resource tracking (JEL model)
  ● Open-world / closed-world
● Efficient
  ● Highly optimized object serialization
  ● Can use optimized native libraries (e.g. MPI, MX)
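Resource tracking and the open-world / closed-world distinction can be pictured with a small membership registry. The sketch below rests on two assumptions that the slide does not spell out: that JEL refers to join, elect, and leave operations, and that a closed world means a fixed number of participants while an open world lets nodes join at any time. `MembershipRegistry` and all of its methods are hypothetical names, and this is not the IPL registry API.

```java
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical in-memory registry for illustration; not the IPL API.
class MembershipRegistry {
    private final Set<String> members = new LinkedHashSet<>();
    private final Map<String, String> elections = new HashMap<>();
    private final int maxJoins; // Integer.MAX_VALUE stands for an open world (assumption)
    private int joins = 0;

    private MembershipRegistry(int maxJoins) {
        this.maxJoins = maxJoins;
    }

    static MembershipRegistry openWorld() {
        return new MembershipRegistry(Integer.MAX_VALUE);
    }

    static MembershipRegistry closedWorld(int poolSize) {
        return new MembershipRegistry(poolSize);
    }

    // Join: a node announces itself; a closed world rejects joins beyond the pool size.
    synchronized void join(String id) {
        if (joins >= maxJoins) {
            throw new IllegalStateException("closed-world pool is full");
        }
        if (members.add(id)) {
            joins++;
        }
    }

    // Leave: the node is removed and any election it had won is reopened.
    synchronized void leave(String id) {
        members.remove(id);
        elections.values().removeIf(winner -> winner.equals(id));
    }

    // Elect: the first member to stand for a named election wins; later callers learn the winner.
    synchronized String elect(String election, String candidate) {
        if (!members.contains(candidate)) {
            throw new IllegalArgumentException(candidate + " is not a member");
        }
        return elections.computeIfAbsent(election, key -> candidate);
    }

    synchronized int size() {
        return members.size();
    }
}
```

Malleability shows up here as joins and leaves during a run, and fault-tolerance as the reopening of an election when its winner leaves.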
SmartSockets
● Robust connectivity
Problems: firewalls, Network Address Translation (NAT), non-routed networks, multi-homing, …
● Always connects in 30 different scenarios
Ibis Programming Models (1)
● Some IPL-based programming models:
  ● Satin:
    ● A divide-and-conquer model (see paper)
  ● MPJ:
    ● The MPI binding for Java
  ● RMI:
    ● Object-oriented Remote Procedure Call
  ● Jorus:
    ● A ‘user transparent’ model for multimedia applications
  ● …
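As a rough illustration of what a divide-and-conquer programming model looks like in Java, the sketch below computes Fibonacci numbers with the JDK's own fork/join framework. This is a stand-in and not Satin code: it runs on the cores of a single machine, and the class names and the sequential cutoff are choices made for this example.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

// Divide-and-conquer Fibonacci on the JDK fork/join framework (not Satin).
class FibTask extends RecursiveTask<Long> {
    private static final int SEQUENTIAL_THRESHOLD = 15; // assumed cutoff for this example
    private final int n;

    FibTask(int n) {
        this.n = n;
    }

    // Plain recursive version used for small subproblems.
    static long sequential(int n) {
        return n < 2 ? n : sequential(n - 1) + sequential(n - 2);
    }

    @Override
    protected Long compute() {
        if (n <= SEQUENTIAL_THRESHOLD) {
            return sequential(n);
        }
        FibTask left = new FibTask(n - 1);
        FibTask right = new FibTask(n - 2);
        left.fork();                          // divide: run one half asynchronously
        long rightResult = right.compute();   // work on the other half in this thread
        return left.join() + rightResult;     // conquer: wait for the result and combine
    }
}

class FibDemo {
    public static void main(String[] args) {
        long result = new ForkJoinPool().invoke(new FibTask(30));
        System.out.println("fib(30) = " + result);
    }
}
```

The programmer writes what is essentially the sequential recursion and marks which subproblems may run in parallel; distributing them over available workers is left to the runtime.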
Ibis Programming Models (2)
● Constellation (“the future of Ibis”):
  ● Generalized programming framework for ‘all’ Jungle Computing applications
  ● Automatically maps any application activity (task) onto any appropriate executor (HW)
  ● By way of ‘contexts’:
    ● Example:
      ● Activity’s context: “I need to run on a GPU”
      ● Executor’s context: “I represent a GPU”
  ● Note:
    ● Activities may represent any type of task:
    ● Even incl. legacy codes, scripts, 3rd-party software, …
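The context idea can be sketched as a simple matching step: each activity states what it needs, each executor states what it represents, and an activity is handed to the first executor whose context matches. The code below is a minimal sketch under that reading. `Activity`, `ContextExecutor`, and `ContextScheduler` are hypothetical names, contexts are reduced to plain strings, and this is not the Constellation API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

// Hypothetical types for illustration; not the Constellation API.
class Activity {
    final String name;
    final String context; // what the activity needs, e.g. "GPU"

    Activity(String name, String context) {
        this.name = name;
        this.context = context;
    }
}

class ContextExecutor {
    final String name;
    final String context; // what the executor represents, e.g. "GPU"
    final List<String> executed = new ArrayList<>();

    ContextExecutor(String name, String context) {
        this.name = name;
        this.context = context;
    }

    boolean accepts(Activity activity) {
        return context.equals(activity.context);
    }

    void run(Activity activity) {
        // A real executor would start the task; here we only record it.
        executed.add(activity.name);
    }
}

class ContextScheduler {
    private final List<ContextExecutor> executors;

    ContextScheduler(List<ContextExecutor> executors) {
        this.executors = executors;
    }

    // Returns the executor that ran the activity, or empty if no context matched.
    Optional<ContextExecutor> submit(Activity activity) {
        for (ContextExecutor executor : executors) {
            if (executor.accepts(activity)) {
                executor.run(activity);
                return Optional.of(executor);
            }
        }
        return Optional.empty();
    }
}
```

An activity with context "GPU" submitted to a scheduler holding a "CPU" and a "GPU" executor ends up on the GPU executor, while an activity whose context matches nothing is returned as unplaced rather than run on unsuitable hardware.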
Ibis Results: Awards
WebPie: A Web-Scale Parallel Inference Engine
J. Urbani, S. Kotoulas, J. Maassen, N. Drost, F.J. Seinstra, F. van Harmelen, and H.E. Bal
Conclusions
● Jungle Computing is hard
● High-Performance Jungle Computing is even harder
● While research into efficient & transparent Jungle-aware programming models has only just begun…
● …Ibis provides the basic functionality to efficiently & transparently overcome most Jungle Computing complexities
Conclusions (2)
● Successful, also because:
  ● Availability of dedicated hardware infrastructure
    ● 4 generations of DAS systems
  ● Focus on building solid software infrastructure
    ● No need to start from scratch over and over
  ● Focus on real-world problems on real-world systems
    ● e.g. participation in international competitions
Download: www.cs.vu.nl/ibis/