


  1. GridSolve: A Seamless Bridge Between the Standard Programming Interfaces and Remote Resources
     Jack Dongarra, University of Tennessee and Oak Ridge National Laboratory, 2/25/2006

     Overview
     ♦ Grid/NetSolve
       - Grid enabled software
       - Allows easy access to remote resources
     ♦ No magic
       - Someone has to write a program
       - The program may run on a parallel computer
     ♦ NetSolve is a tool for distributed computing

  2. The Grid

     What is NetSolve?
     ♦ Client-server RPC-like system
       - Designed for ease-of-use
     ♦ Interactions mediated by an agent
       - e.g. scheduling, tracking, fault tolerance
     ♦ Dynamic service bindings
       - Client does not need to have stubs for the services that it wishes to use
     ♦ Multiple clients
       - C, Fortran, Matlab, Java, Mathematica, Octave
     ♦ Extended to support GridRPC API
       - Part of GGF working group defining a standard API

  3. University of Tennessee’s NetSolve Grid Enabled Server
     ♦ NetSolve is an example of a Grid based hardware/software/data server.
     ♦ Based on a Remote Procedure Call model but with …
       - resource discovery, dynamic problem solving capabilities, load balancing, fault tolerance, asynchronicity, security, …
     ♦ Ease of use is paramount.
     ♦ It’s about providing transparent access to resources.
     ♦ Legacy codes easily wrapped into services

     GridSolve Architecture
     [Figure: client, agent, and four server clusters. Agent roles: resource discovery, scheduling, load balancing, fault tolerance. Labelled arrows: request, server list, data, result.]
     Client call: [x,y,z,info] = gridsolve('dgesv', A, B)
     Can be from Matlab, C, Fortran, Python, Java, Mathematica, Excel, …

  4. NetSolve Client
     ♦ Function Based Interface.
     ♦ Client program embeds call from NetSolve’s API to access additional resources.
     ♦ Interface available to C, Fortran, Matlab, and Mathematica.
     ♦ Opaque networking interactions.
     ♦ NetSolve can be invoked using a variety of methods: blocking, non-blocking, task farms, …

     NetSolve Client
     ♦ Intuitive and easy to use.
     ♦ Matlab matrix multiply, e.g.:
       - A = matmul(B, C);
       - A = netsolve('matmul', B, C);
     ♦ Possible parallelisms hidden.

  5. NetSolve Client
     i. Client makes request to agent.
     ii. Agent returns list of servers.
     iii. Client tries first one to solve problem.

     NetSolve Agent
     ♦ Name server for the NetSolve system.
     ♦ Information Service
       - client users and administrators can query the hardware and software services available.
     ♦ Resource scheduler
       - maintains both static and dynamic information regarding the NetSolve server components to use for the allocation of resources
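The three client steps can be illustrated with a small simulation. This is only a sketch, not NetSolve code: `solve_via_agent`, `ServerError`, the two toy servers, and the fixed agent ranking are all hypothetical names, and falling through to the next server after a failure is an assumption suggested by the deck's mention of fault tolerance.

```python
class ServerError(Exception):
    """Raised when a simulated server cannot complete a request."""


def solve_via_agent(problem, args, agent, servers):
    """Ask the agent for candidate servers, then try them in order.

    agent   -- callable: problem name -> ordered list of server names
    servers -- dict: server name -> callable(problem, *args) returning a result
    """
    candidates = agent(problem)          # steps i and ii: request, server list
    failures = []
    for name in candidates:              # step iii: try the first server
        try:
            return name, servers[name](problem, *args)
        except ServerError as exc:       # assumed behaviour: move on to the next one
            failures.append(f"{name}: {exc}")
    raise RuntimeError(f"no server solved {problem!r}: {failures}")


def busy_server(problem, *args):
    """Toy server that always refuses work."""
    raise ServerError("overloaded")


def matmul_server(problem, a, b):
    """Toy server offering one service: a plain-Python matrix product."""
    if problem != "matmul":
        raise ServerError(f"unknown problem {problem}")
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


servers = {"s1": busy_server, "s2": matmul_server}


def agent(problem):
    """Hypothetical agent with a fixed ranking of servers."""
    return ["s1", "s2"]


used, product = solve_via_agent(
    "matmul", ([[1, 2], [3, 4]], [[5, 6], [7, 8]]), agent, servers
)
print(used, product)
```

The client code never names a server itself; it only sees the problem name and the list the agent hands back.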

  6. NetSolve Agent
     ♦ Resource Scheduling (cont’d):
       - CPU Performance.
       - Network bandwidth, latency.
       - Server workload.
       - Problem size/algorithm complexity.
       - Calculates a “Time to Compute” for each appropriate server.
       - Notifies client of most appropriate server.

     Basic Usage Scenarios
     ♦ Grid based numerical library routines
       - User doesn’t have to have software library on their machine: LAPACK, SuperLU, ScaLAPACK, PETSc, ARPACK, …
     ♦ Task farming applications
       - “Pleasantly parallel” execution, e.g. parameter studies
       - Scavenge cycles
     ♦ Remote application execution
       - Complete applications with user specifying input parameters and receiving output
     ♦ “Blue Collar” Grid Based Computing
       - Does not require deep knowledge of network programming
       - Level of expressiveness right for many users
       - User can set things up, no “su” required
       - In use today, up to 130 servers on the experimental grid
     ♦ Can plug into Globus, Condor, NINF, …
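One way to picture the "Time to Compute" estimate is as transfer time plus compute time, with the CPU rate scaled down by the current workload. The slides list the inputs but not the formula, so the model below is an assumption, and `Server`, `time_to_compute`, and `rank_servers` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Server:
    name: str
    mflops: float          # CPU performance, millions of flop/s
    bandwidth_mb_s: float  # client-to-server bandwidth, megabytes/s
    latency_s: float       # network latency, seconds
    workload: float        # fraction of the CPU already busy, 0.0 <= workload < 1.0


def time_to_compute(server, data_mb, mflop):
    """Assumed model: network transfer time plus compute time on the free CPU share."""
    transfer = server.latency_s + data_mb / server.bandwidth_mb_s
    compute = mflop / (server.mflops * (1.0 - server.workload))
    return transfer + compute


def rank_servers(servers, data_mb, mflop):
    """Return servers ordered from smallest to largest estimated time."""
    return sorted(servers, key=lambda s: time_to_compute(s, data_mb, mflop))


# Dense solve of a 1000 x 1000 double-precision system:
# about 8 MB of matrix data and roughly (2/3) * n^3 floating-point operations.
n = 1000
data_mb = n * n * 8 / 1e6
mflop = (2 / 3) * n ** 3 / 1e6

candidates = [
    Server("fast_but_far", mflops=2000, bandwidth_mb_s=1, latency_s=0.1, workload=0.0),
    Server("near_but_busy", mflops=500, bandwidth_mb_s=10, latency_s=0.01, workload=0.5),
]
ranking = rank_servers(candidates, data_mb, mflop)
print([s.name for s in ranking])
```

With these made-up numbers the slower, busier server ranks first because moving 8 MB over the slow link dominates the total, which is why the agent weighs network figures alongside CPU performance.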

  7. Task Farming - Multiple Requests To Single Problem
     ♦ A Solution:
       - Many calls to netslnb( ); /* non-blocking */
     ♦ Farming Solution:
       - Single call to netsolve_farm( );
     ♦ Request iterates over an “array of input parameters.”
     ♦ Adaptive scheduling algorithm.
     ♦ Useful for parameter sweeping, and independently parallel applications.

     Server Proxies – Hide Parallelism
     [Figure: NetSolve clients, the agent, and several servers in the NetSolve system, with servers fronting LFC (LAPACK for Clusters), Condor, ScaLAPACK, etc.]
     User may be unaware of parallel processing
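The farming idea, one call that iterates over an array of input parameters instead of many hand-written non-blocking calls, can be sketched locally with a thread pool. This does not use NetSolve and does not reproduce its adaptive scheduling; `farm` and `simulate` are hypothetical names chosen for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def farm(solve, parameter_sets, max_workers=4):
    """Run solve(*params) for every entry and return results in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Each submit returns immediately, like a non-blocking request.
        futures = [pool.submit(solve, *params) for params in parameter_sets]
        # Collecting the results blocks until every request has finished.
        return [future.result() for future in futures]


def simulate(a, b):
    """Stand-in for one independent run of a parameter study."""
    return a * a + b


sweep = [(1, 0), (2, 1), (3, 2), (4, 3)]
results = farm(simulate, sweep)
print(results)
```

Because the runs are independent, the caller only supplies the parameter array; how requests are spread over workers stays inside `farm`.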

  8. GridSolve Usage with VGrADS
     ♦ Simple-to-use access to complicated software libraries, with no knowledge of grid based computing.
     ♦ Selection of best machines in your grid to service user request
     ♦ Portability
       - Non-portable calls can be run from a client using RPC-like mechanisms as long as there is a server provisioned with the code
     ♦ Legacy codes easily wrapped into services
     ♦ Plug into VGrADS Framework
     ♦ Using the vgES for resource selection and launching of application:
       - Integrated performance information
       - Integrated monitoring
       - Fault prediction
       - Integrating the software and resource information repositories

     Virtual Grid Execution System (vgES)
     ♦ A Virtual Grid (VG) takes
       - Shared heterogeneous resources
       - Scalable information service
     ♦ and provides
       - A hierarchy of application-defined aggregations (e.g. ClusterOf) with constraints (e.g. processor type) and rankings
     ♦ Virtual Grid Execution System (vgES) implements VG
       - VG Definition Language (vgDL)
       - VG Find And Bind (vgFAB)
       - VG Monitor (vgMON)
       - VG Application Launch (vgLAUNCH+DVCW)
       - VG Resource Info (vgAgent)
     [Figure: vgES components (vgES APIs, vgFAB, vgAgent, vgMON, vgLAUNCH, DVCW) sitting between the application's vgDL description and the grid's information services, resource managers, and virtual grid resources. Labels: VG Resource Universe, VG Candidates, Successfully Bound VG.]

  9. VGrADS/GridSolve Architecture
     [Figure: client, agent, service catalog, software repository, and servers on a virtual grid. Labelled steps: request, query, software location, vgDL, info, start server, register, data sent & app started, transfer, result. Also labelled: process killed, process restarted.]
     Client call: [x,y,z,info] = gridsolve('solver', A, B)

     Data Persistence
     ♦ Chain together a sequence of NetSolve requests.
     ♦ Analyze parameters to determine data dependencies. Essentially a DAG is created where nodes represent computational modules and arcs represent data flow.
     ♦ Transmit superset of all input/output parameters and make persistent near server(s) for duration of sequence execution.
     ♦ Schedule individual request modules for execution.
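The parameter analysis can be sketched as follows: a request depends on whichever earlier request produced one of its inputs, and anything produced by one request and read by a later one is an intermediate that can stay near the servers. This is an illustrative reconstruction, not NetSolve's implementation; `build_dag` and `intermediates` are hypothetical names, only read-after-write dependencies are tracked, and the three-command chain mirrors the shape used on the deck's follow-up slide.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def build_dag(requests):
    """requests: list of (name, inputs, outputs).

    Returns {request name: set of request names it depends on}.
    """
    producer = {}  # parameter -> request that most recently wrote it
    deps = {}
    for name, inputs, outputs in requests:
        deps[name] = {producer[p] for p in inputs if p in producer}
        for p in outputs:
            producer[p] = name
    return deps


def intermediates(requests):
    """Parameters written by one request and read by a later one."""
    produced, found = set(), set()
    for _, inputs, outputs in requests:
        found |= produced & set(inputs)
        produced |= set(outputs)
    return found


requests = [
    ("command1", ["A", "B"], ["C"]),
    ("command2", ["A", "C"], ["D"]),
    ("command3", ["D", "E"], ["F"]),
]
deps = build_dag(requests)
order = list(TopologicalSorter(deps).static_order())  # a valid execution order
print(deps)
print(order)
print(sorted(intermediates(requests)))
```

Here `C` and `D` never need to travel back to the client between steps, which is the saving that data persistence is after.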

  10. Data Persistence (cont’d)
      Without sequencing, each request is a separate client/server round trip:
        command1(A, B): result C
        command2(A, C): result D
        command3(D, E): result F
      With sequencing, sequence(A, B, E) sends inputs A, B, and E, keeps intermediate outputs C and D near the server, and returns result F:
        netsl_begin_sequence( );
        netsl("command1", A, B, C);
        netsl("command2", A, C, D);
        netsl("command3", D, E, F);
        netsl_end_sequence(C, D);

      Current SInRG Infrastructure
      ♦ Federated Ownership: CS, Chem Eng., Medical School, Computational Ecology, El. Eng.
      ♦ Real applications, middleware development, logistical networking
      Industry Partners: Microsoft, Sun, Dell, Cisco, Foundry, Dolphin, Myricom

  11. NetSolve - Things Not Touched On
      ♦ Integration with other NMI tools
        - Globus, Condor, Network Weather Service
      ♦ Security
        - Using Kerberos V5 for authentication.
      ♦ Monitor NetSolve Network
        - Track and monitor usage
      ♦ Fault Tolerance
      ♦ Local / Global Configurations
      ♦ Dynamic Nature of Servers
      ♦ Automated Adaptive Algorithm Selection
        - Dynamically determine the best algorithm based on system status and nature of user problem
      ♦ NetSolve evolving into GridRPC
        - Being worked on under GGF jointly with NINF
      Software at: http://icl.cs.utk.edu/netsolve/

      NetSolve Team
      ♦ Sudesh Agrawal
      ♦ Don Fike
      ♦ Eric Meek
      ♦ Keith Seymour
      ♦ Zhiao Shi
      ♦ Asim YarKhan
