High Performance Computing on P2P Platforms: Recent Innovations
Franck Cappello, CNRS
Head of the Cluster and GRID group, INRIA Grand-Large, LRI, Université Paris sud
fci@lri.fr, www.lri.fr/~fci
May 20, 2003, Terena Conference
Outline
• Introduction (GRID versus P2P)
• System issues in HPC P2P infrastructure
  – Internals of P2P systems for computing
  – Case studies: XtremWeb / BOINC
• Programming HPC P2P infrastructures
  – RPC-V
  – MPICH-V (a message passing library for XtremWeb)
• Open issue: merging Grid & P2P
• Concluding remarks
Several types of GRID
Two kinds of distributed systems:
• Computing « GRID »: large sites, computing centers, clusters
  – Node features: fewer than 100 nodes, stable, individual credentials, confidence
• Large scale distributed systems: « Desktop GRID » or « Internet Computing », and Peer-to-Peer systems (PCs running Windows or Linux)
  – Node features: about 100 000 nodes, volatile, no authentication, no confidence
Large Scale Distributed Computing
• Principle
  – Millions of PCs
  – Cycle stealing
• Examples
  – SETI@HOME
    • Search for Extra-Terrestrial Intelligence
    • 33.79 Teraflop/s (12.3 Teraflop/s for the ASCI White!)
  – DECRYPTHON
    • Protein sequence comparison
  – RSA-155
    • Breaking encryption keys
Large Scale P2P File Sharing
• Direct file transfer after index consultation
  – Client and server establish direct connections
  – Consulting the index gives the client the address of the server
• File storage
  – All servers store entire files
  – For fairness, clients work as servers too
• Data sharing
  – Non-mutable data
  – Several copies, no consistency check
• Interest of the approach
  – Proven to scale up to millions of users
  – Resilience of file access
• Drawback of the approach
  – Centralized index
  – Privacy violated
[Figure: the Napster index holds file-to-IP-address associations; Napster user A and Napster user B each act as client and server]
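The index consultation step on this slide can be sketched as a small in-memory directory. This is an illustration only, not Napster's actual protocol: the class name `CentralIndex`, its methods, the file name and the addresses are all hypothetical.

```python
from collections import defaultdict


class CentralIndex:
    """Hypothetical central index: file name -> addresses of peers holding a copy."""

    def __init__(self):
        self._holders = defaultdict(set)

    def publish(self, file_name, peer_address):
        # A peer announces that it stores an entire copy of the file.
        self._holders[file_name].add(peer_address)

    def unpublish_peer(self, peer_address):
        # Drop a departed peer from every entry; other copies stay reachable.
        for peers in self._holders.values():
            peers.discard(peer_address)

    def lookup(self, file_name):
        # Addresses the client may contact directly for the file transfer.
        return sorted(self._holders.get(file_name, ()))


index = CentralIndex()
index.publish("song.mp3", "10.0.0.1:6699")
index.publish("song.mp3", "10.0.0.2:6699")
index.unpublish_peer("10.0.0.1:6699")   # one holder leaves
remaining = index.lookup("song.mp3")    # the second copy is still listed
```

The sketch also makes the drawback visible: every lookup goes through the single `CentralIndex` object, which therefore sees who asks for which file.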
Distributed Computing
A central coordinator schedules tasks on volunteer computers (master-worker paradigm, cycle stealing).
• Dedicated applications
  – SETI@Home, distributed.net
  – Décrypthon (France)
• Production applications
  – Folding@home, Genome@home
  – Xpulsar@home, Folderol
  – Exodus, Peer review
• Research platforms
  – Javelin, Bayanihan, JET
  – Charlotte (based on Java)
• Commercial platforms
  – Entropia, Parabon
  – United Devices, Platform (AC)
[Figure: a client application sends parameters to the coordinator and receives results; the coordinator sends parameters over the Internet to volunteer PCs, each of which downloads and executes the application]
Peer to Peer systems (P2P)
All system resources may play the roles of client and server, and may communicate directly. The infrastructure is distributed and self-organizing.
• User applications
  – Instant messaging
  – Managing and sharing information
  – Collaboration
  – Distributed storage
• Middleware
  – Napster, Gnutella, Freenet
  – KaZaA, Music-city
  – Jabber, Groove
• Research projects
  – Globe (Tann.), Cx (Javelin), Farsite
  – OceanStore (USA)
  – Pastry, Tapestry/Plaxton, CAN, Chord
• Other projects
  – Cosm, Wos, peer2peer.org
  – JXTA (Sun), PtPTL (Intel)
[Figure: volunteer PCs connected through the Internet take part in resource discovery and coordination; a client sends a request to a volunteer acting as service provider]
Merging Internet & P2P Systems: P2P Distributed Computing
Allows any node to play different roles (client, server, system infrastructure).
• A request may concern computations or data
• An accept concerns computation or data
• Potential communications between PCs for parallel applications
A very simple problem statement, but one that leads to many research issues: scheduling, security, message passing, data storage.
Large scale widens the problem: volatility, confidence, etc.
[Figure: client PCs send requests to the P2P system and receive results; server PCs accept requests and provide results]
“Three Obstacles to Making P2P Distributed Computing Routine”
1) New approaches to problem solving
   – Data Grids, distributed computing, peer-to-peer, collaboration grids, …
2) Structuring and writing programs (Programming Problem)
   – Abstractions, tools
3) Enabling resource sharing across distinct institutions (Systems Problem)
   – Resource discovery, access, reservation, allocation; authentication, authorization, policy; communication; fault detection and notification; …
Credit: Ian Foster
Outline
• Introduction (large scale distributed systems)
• System issues in HPC P2P infrastructure
  – Internals of P2P systems for computing
  – Case studies: XtremWeb / BOINC
• Programming HPC P2P infrastructures
  – RPC-V
  – MPICH-V (a message passing library for XtremWeb)
• Open issue: merging Grid & P2P
• Concluding remarks
Basic components of P2P systems
1) Gateway (IP address, Web pages, etc.)
   – Gives the address of other nodes
   – Choose a community
   – Contact a community manager
2) Connection/transport protocol for requests, results and control
   – Bypass firewalls
   – Build a virtual address space (naming the participants: NAT)
   – (Tunnel, push-pull protocols)
[Figure: a PC asks the gateway for the IP address of a node of the P2P system; resources behind firewalls on the Internet, an intranet or a LAN are reached through a tunnel]
Basic components of P2P systems
3) Publishing services (or resources)
   Allows the user to specify
   – what resources could be shared
   – what roles could be played
   – what protocol to use (WSDL, etc.)
4) Resource discovery (establishes the connection between client and service providers)
   – (Centralized directory, hierarchical directory, flooding, search in topology)
[Figure: PCs on the Internet, an intranet or a LAN publish resources (file, CPU, disc space); requests are passed from PC to PC until they reach a resource, which may be a file or a service]
Resource Discovery in P2P Systems
• 1st generation: central server (Napster)
  – Peers send their peer ID and search queries to a central index
• 2nd generation: no central server, flooding (Gnutella)
  – Search queries are flooded from peer to peer, then the file is fetched (GET file) from the peer that holds it
• 3rd generation: Distributed Hash Table (CAN, Chord, Pastry, etc.)
  – Self-organizing overlay network: topology, routing
[Figure: identifier ring numbered 0 to 7 with three routing tables; each entry reads Start, Interval, Successor]
  – 1, [1,2), 1;  2, [2,4), 3;  4, [4,0), 0
  – 2, [2,3), 3;  3, [3,5), 3;  5, [5,1), 0
  – 4, [4,5), 0;  5, [5,7), 0;  7, [7,3), 0
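The Start / Interval / Successor values on this slide are consistent with Chord-style finger tables on a 3-bit identifier ring. The sketch below recomputes such tables under the assumption that the live nodes are 0, 1 and 3 (the slide text does not name them explicitly); the function names are hypothetical and this is not code from any Chord implementation.

```python
M = 3                      # identifier bits: the ring holds 2**M = 8 identifiers
RING = 2 ** M
NODES = sorted([0, 1, 3])  # assumed live nodes


def successor(key):
    """First live node met when walking clockwise from key (key included)."""
    for node in NODES:
        if node >= key:
            return node
    return NODES[0]  # wrapped past the largest identifier


def finger_table(node):
    """Rows of (start, (interval_low, interval_high), successor) for one node."""
    rows = []
    for i in range(M):
        start = (node + 2 ** i) % RING       # i-th finger starts 2**i away
        end = (node + 2 ** (i + 1)) % RING   # the next finger's start closes the interval
        rows.append((start, (start, end), successor(start)))
    return rows


tables = {node: finger_table(node) for node in NODES}
```

Each node keeps only `M` entries, which is what lets a lookup proceed in a logarithmic number of hops instead of consulting a central index or flooding the network.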
Additional component of P2P systems for Computing
The role of the 4 previous components was A) to set up the system and B) to discover a set of resources for a client.
5) Coordination system (virtual cluster manager), centralized or distributed
   – Receives client computing requests
   – Configures / manages a platform (collects service proposals and assigns roles)
   – Schedules tasks and data distribution / transfers
   – Detects faults and recovers from them
[Figure: PCs on the Internet, an intranet or a LAN send requests to the coordination system, which directs them to resources]
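A minimal sketch of the scheduling and fault-detection duties listed on this slide, assuming a centralized coordinator that hands out tasks on request and re-queues a task when its worker stays silent for too long. The class `Coordinator`, its methods and the timeout policy are hypothetical and are not taken from XtremWeb or BOINC. Time is passed in explicitly so the example is deterministic.

```python
class Coordinator:
    """Hypothetical centralized coordinator: schedules tasks, re-queues lost ones."""

    def __init__(self, tasks, timeout):
        self.pending = list(tasks)  # tasks not yet assigned
        self.running = {}           # task -> (worker, time of last sign of life)
        self.results = {}
        self.timeout = timeout

    def request_work(self, worker, now):
        if not self.pending:
            return None
        task = self.pending.pop(0)
        self.running[task] = (worker, now)
        return task

    def heartbeat(self, worker, task, now):
        # Refresh the timestamp only if this worker still owns the task.
        if self.running.get(task, (None, None))[0] == worker:
            self.running[task] = (worker, now)

    def submit_result(self, worker, task, result):
        # Ignore late results from a worker whose task was already reassigned.
        if self.running.get(task, (None, None))[0] == worker:
            del self.running[task]
            self.results[task] = result

    def detect_faults(self, now):
        lost = [t for t, (_, seen) in self.running.items()
                if now - seen > self.timeout]
        for task in lost:
            del self.running[task]
            self.pending.append(task)  # recovery: schedule the task again
        return lost


coord = Coordinator(["t1", "t2"], timeout=10)
first = coord.request_work("w1", now=0)
second = coord.request_work("w2", now=0)
coord.heartbeat("w1", "t1", now=8)         # w1 is alive, w2 stays silent
lost = coord.detect_faults(now=12)         # only w2's task has timed out
retry = coord.request_work("w3", now=12)   # the lost task goes to another worker
coord.submit_result("w2", "t2", "stale")   # late answer from w2 is ignored
coord.submit_result("w1", "t1", 42)
```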
XtremWeb: General Architecture
• XtremWeb 1 implements a subset of the 5 P2P components
• 3 entities: client / coordinator / worker (different protection domains)
• Current implementation: centralized coordinator
[Figure: XW coordinator configurations labelled hierarchical, Peer to Peer and Global Computing (client); PCs on the Internet or a LAN act as worker or as client/worker]
XW: Worker Architecture
• Applications
  – Binary (legacy HPC codes in Fortran or C)
  – Java (recent codes, object codes)
• OS
  – Linux, SunOS, Mac OSX
  – Windows
• Auto-monitoring
  – Trace collection
• Protocol: firewall bypass
  – Messages exchanged between worker and coordinator: hostRegister, WorkRequest, workResult, workAlive
  – XML RPC, with SSL authentication and encryption
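One way to read the firewall-bypass protocol on this slide is as a pull loop in which the worker initiates every exchange. The sketch below illustrates that reading with an in-process stand-in for the coordinator. The method names echo the four messages named on the slide, but their arguments, return values and ordering are assumptions, and the XML RPC / SSL transport is left out entirely.

```python
class StubCoordinator:
    """In-process stand-in for the coordinator; argument shapes are assumptions."""

    def __init__(self, jobs):
        self.jobs = list(jobs)   # (job_id, params) pairs waiting to be served
        self.registered = []
        self.alive = []
        self.results = {}

    def hostRegister(self, host):
        self.registered.append(host)

    def workRequest(self, host):
        return self.jobs.pop(0) if self.jobs else None

    def workAlive(self, host, job_id):
        self.alive.append((host, job_id))

    def workResult(self, host, job_id, value):
        self.results[job_id] = value


def run_worker(coordinator, host, compute):
    """Every call starts at the worker, so no inbound connection is required."""
    coordinator.hostRegister(host)
    while True:
        job = coordinator.workRequest(host)
        if job is None:
            break                                # nothing left to do
        job_id, params = job
        coordinator.workAlive(host, job_id)      # sign of life for this job
        coordinator.workResult(host, job_id, compute(params))


coordinator = StubCoordinator([(1, 3), (2, 4)])
run_worker(coordinator, "pc-17", lambda x: x * x)
```

Because the worker only makes outbound calls, it can sit behind a firewall or NAT that blocks incoming connections, which appears to be the point of the "firewall bypass" label.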
XW: Client architecture
• A Java API
  – XWRPC
  – Task submission
  – Result collection
  – Monitoring/control
• Bindings
  – OmniRPC, GridRPC
• Applications
  – Multi-parameter, bag of tasks
  – Master-Worker (iterative), EP
[Figure: the client configures and launches an experiment on the coordinator and collects results; the worker gets work from the coordinator and puts results back]