Global Computing, Desktop Grids and XtremWeb
Franck Cappello
PCRI / INRIA Grand-Large, LRI, Université Paris Sud
fci@lri.fr, www.lri.fr/~fci
Ecole GridUse, 23 June 2004

Outline
• Introduction: Desktop Grids, foundations for a Large Scale Operating System
• Some architecture issues
• User interfaces (XtremWeb)
• Fault tolerance (XtremWeb)
• Security (trust)
• Final remarks (what we have learned so far)
Several types of GRID
Two kinds of large-scale distributed systems:
• "GRID" computing: large sites (computing centers, individual clusters); fewer than 100 nodes; stable resources; per-user credentials; mutual confidence.
• "Desktop GRID" or "Internet computing" (peer-to-peer systems): large-scale distributed PCs (Windows, Linux); around 100,000 nodes; volatile resources; no authentication; no confidence.

DGrid for Large Scale Distributed Computing
• Principle
  – Millions of PCs
  – Cycle stealing
• Examples
  – SETI@HOME: Search for Extra-Terrestrial Intelligence; 33.79 Teraflop/s (versus 12.3 Teraflop/s for ASCI White!)
  – DECRYPTHON: protein sequence comparison
  – RSA-155: breaking encryption keys
DGrid for Large Scale P2P File Sharing
• Direct file transfer after index consultation
  – Client and server establish direct connections
  – Consulting the index gives the client the IP address of the server
• File storage
  – All servers store entire files
  – For fairness, clients work as servers too
• Data sharing
  – Non-mutable data
  – Several copies, no index consistency check
• Interest of the approach
  – Proven to scale up to millions of users
  – Resilience of file access
• Drawbacks of the approach
  – Centralized index
  – Privacy violated
(Figure: a central file-to-IP-address index; user A and user B each act as both client and server.)

DGrid for Large Scale Data Storage/Access
• Principle
  – Millions of PCs
  – "Disk space" stealing
• Storing and accessing files on participant nodes: files are stored as segments, and segments are replicated for availability
• Distributed data integration: collecting and integrating data coming from numerous devices (ubiquitous storage)
• Examples: Freenet, Intermemory (providing global-scale persistent data), the Us project (Projet Us)
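A minimal sketch of the centralized-index scheme described on the file-sharing slide above. The class, method names and addresses are invented for illustration (the real Napster protocol differs); the point is that the index only maps file names to peer addresses, while the data travels directly between peers.

    # Toy centralized index: maps file names to the IP addresses of peers
    # that store the whole file. Illustrative only.
    class CentralIndex:
        def __init__(self):
            self.files = {}                       # file name -> set of peer IPs

        def publish(self, filename, peer_ip):
            """A peer acting as server declares that it stores the file."""
            self.files.setdefault(filename, set()).add(peer_ip)

        def lookup(self, filename):
            """A peer acting as client gets the addresses of servers holding the file."""
            return sorted(self.files.get(filename, ()))

    index = CentralIndex()
    index.publish("song.mp3", "192.0.2.10")       # user A (client + server)
    print(index.lookup("song.mp3"))               # user B gets ['192.0.2.10']
    # User B then opens a direct connection to 192.0.2.10 to download the file;
    # the index never sees the data, which is why the centralized index and
    # privacy are the weak points listed above.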
DGrid for Networking
• NETI@home: collects network performance statistics from end-systems
• A set of PCs on the Internet (lab networks) coordinated for networking experiments
• A set of PCs on the Internet (ADSL) coordinated for measuring communication performance

Computational/Networking DGrids
For which application domains are DGrids used?
• Computational applications
  – SETI@Home, distributed.net
  – Décrypthon (France)
  – Folding@home, Genome@home
  – ClimatePrediction, etc.
• Networking applications
  – PlanetLab (protocol design, etc.)
  – "La grenouille" (DSL performance evaluation)
  – Porivo (Web server performance testing)
• Research projects
  – Javelin, Bayanihan, JET
  – Charlotte (based on Java)
  – Condor, XtremWeb, P3, BOINC
• Commercial platforms
  – Datasynapse, GridSystems
  – United Devices, Platform (AC)
  – Cosm
(Figure: a central coordinator/resource-discovery service schedules tasks and coordinates actions on a set of PCs; the client application sends parameters to the coordinator, which dispatches them to the PCs over the network and returns the results.)
Communication/Storage DGrids (P2P)
For which application domains are DGrids used?
• Communication applications
  – Jabber, etc. (instant messaging)
  – Napster, Gnutella, Freenet, Kazaa (information sharing)
  – Skype (phone over IP)
• Storage applications
  – OceanStore, Us, etc. (distributed storage)
  – Napster, Gnutella, Freenet, Kazaa (information sharing)
• Research projects
  – Globe (Tann.), Cx (Javelin), Farsite
  – Pastry, Tapestry/Plaxton, CAN, Chord
  – XtremWeb
• Other projects
  – Cosm, WebOS, Wos, peer2peer.org
  – JXTA (Sun), PtPTL (Intel)
(Figure: a resource discovery/lookup engine establishes the relation between a client and one or more servers, i.e. a communication between two participants; the client sends a request to the resource discovery/coordinator service, which puts it in contact with a service provider over the network.)

Historical perspective: XtremWeb research
(Timeline figure, 1995 to 1999 to today:)
• Clusters: NOW, Beowulf, clusters of clusters
• Meta-computing, then GRID: I-WAY, Globus, Ninf, Legion, Netsolve, DataGrid, GGF, OGSA, WSRF
• Cycle stealing, then Global Computing / Internet computing: Condor, distributed.net, SETI@Home, COSM, BOINC, Javelin, CX, Atlas, Charlotte, P3, XtremWeb, XW2
• Distributed and P2P systems: DNS, mail, Napster, Gnutella, Freenet, DHTs (Pastry, Tapestry, CAN, Chord)
Outline
• Introduction: Desktop Grid, foundations for a Large Scale Operating System
• Some architecture issues
• User interfaces
• Fault tolerance
• Security (Trust)
• Final remarks (what we have learned so far)

Computational Desktop Grids
Allow any node to play different roles (client, server, system infrastructure):
• A client (PC) submits a request, concerning computation or data, and accepts the result.
• A server (PC) accepts the request and provides the result.
• A coordination service performs match-making, scheduling, and fault tolerance.
• Potential communications between servers for parallel applications.
A very simple problem statement, but one leading to a lot of research issues: scheduling, security, fairness, race conditions, message passing, data storage. Large scale enlarges the problem: volatility, confidence, etc.
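A toy sketch of the coordination role described above (match-making, FIFO scheduling, result collection and re-queuing of tasks lost to volatility). The class and method names are invented for illustration and are not the XtremWeb API.

    from collections import deque

    class Coordinator:
        """Toy coordinator: FIFO scheduling, pull-style work distribution."""
        def __init__(self):
            self.pending = deque()    # tasks submitted by clients, served FIFO
            self.running = {}         # task_id -> (worker_id, payload)
            self.results = {}         # task_id -> result

        def submit(self, task_id, payload):
            """Client role: submit a computation (or data) request."""
            self.pending.append((task_id, payload))

        def request_work(self, worker_id):
            """Server role: a worker asks the coordinator for a task."""
            if not self.pending:
                return None
            task_id, payload = self.pending.popleft()
            self.running[task_id] = (worker_id, payload)
            return task_id, payload

        def put_result(self, task_id, result):
            """Server role: a worker returns a result the client can collect."""
            self.running.pop(task_id, None)
            self.results[task_id] = result

        def worker_failed(self, worker_id):
            """Fault tolerance against volatility: re-queue the lost tasks."""
            lost = [t for t, (w, _) in self.running.items() if w == worker_id]
            for t in lost:
                _, payload = self.running.pop(t)
                self.pending.appendleft((t, payload))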
Some facts about heterogeneity (a study made by IMAG; credit: O. Richard, G. Da Costa)
(Figure: performance of participant teams according to their ranking.)
• Performance follows a Zipf law: Performance(rank) = C / rank, i.e. the "90% / 10%" law.
• Up to 4 orders of magnitude between the extremes.

Some facts about the users (a study made by IMAG; credit: O. Richard, G. Da Costa)
User characteristics of French ADSL: one week of the ADSL population as seen by La grenouille (4373 users).
(Figures: number of connected users over the week, peaking between noon and 2 PM; accumulated number of users per date; classification of Sunday connection vectors using a Hamming distance of 1 between user connection vectors, yielding 508 classes.)
• Connections increase during daytime up to a maximum.
• Users behave differently: considering a vector of hours (1 if connected, 0 if not), the 4373 users of January 7 belong to 1710 different classes!
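A small numeric illustration of the Zipf law quoted above, Performance(rank) = C / rank. The constant C and the number of teams are made-up values, not figures from the IMAG study.

    # Zipf-distributed team performance: Performance(rank) = C / rank.
    C = 1.0e3            # performance of the top-ranked team (arbitrary units)
    n_teams = 10_000     # assumed number of participant teams

    perf = [C / rank for rank in range(1, n_teams + 1)]

    # Ratio between the extremes: C/1 versus C/n_teams, i.e. 4 orders of
    # magnitude for 10,000 teams, consistent with the observation above.
    print(perf[0] / perf[-1])                          # 10000.0

    # Share of the aggregate performance contributed by the top 10% of ranks:
    # about 0.76 with a pure 1/rank law, so the top decile dominates.
    print(round(sum(perf[: n_teams // 10]) / sum(perf), 2))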
Some facts about dynamicity (a study made by IMAG; credit: O. Richard, G. Da Costa)
One week of the ADSL population as seen by La grenouille:
• Number of different users per day: mean 4175, standard deviation 203
• Number of different users per hour: mean 1402, standard deviation 432
• Number of connected days per user: mean 4.3, standard deviation 2.2
• Number of connected hours per user: mean 34, standard deviation 40 (!)
Observations:
• The number of users is quite stable across week days.
• In one hour, up to half of the users may change.
• Connection frequency to ADSL is quite disparate among users.

Architecture: fundamental mechanisms and scheduling modes
• Transport layer: crossing firewalls, NAT and proxies between PCs over the Internet (tunnels)
  – XtremWeb, BOINC: ad hoc protocols
  – P3: JXTA
• Fundamental components of the agent: resource discovery, client, worker, coordination engine; the infrastructure can be centralized or distributed.
• Job interaction mode: PUSH or PULL (non-permanent versus permanent connection), as sketched below
  – Pull jobs: XtremWeb, Datasynapse
  – Push jobs: Condor
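A minimal sketch of the PULL interaction mode listed above, in which the worker always initiates the connection and therefore crosses firewalls and NAT without any inbound port. The coordinator URL and message fields are hypothetical; the real XtremWeb and BOINC protocols differ.

    import time
    import requests   # third-party HTTP client, used here only for brevity

    COORDINATOR = "http://coordinator.example.org"   # hypothetical endpoint

    def run_task(task):
        # Placeholder: download the application, run it on the parameters.
        return {"task_id": task["id"], "output": "..."}

    def worker_loop(poll_interval=60):
        while True:
            # 1. Pull: ask the coordinator for work (outbound connection only).
            reply = requests.get(COORDINATOR + "/getwork", timeout=30)
            task = reply.json() if reply.ok else None
            if task:
                result = run_task(task)
                # 2. Push the result back, again over an outbound connection.
                requests.post(COORDINATOR + "/putresult", json=result, timeout=30)
            else:
                time.sleep(poll_interval)   # no work available: back off, poll later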
Architecture: data transfers and resource types
• Data transfer mode: P2P, data server, or via the coordinator.
(Figure: data transfer patterns for Datasynapse, BOINC, XtremWeb and P2P systems; jobs and data are submitted to the coordinator, put/got from a data server, or fetched peer-to-peer.)
• What kinds of resources can be harnessed?
  – PC: 1 agent, 1 thread
  – Dual-processor PC: 1 agent, X threads
  – Cluster: an agent compliant with a third-party scheduler

Resource Discovery/Coordination: resource discovery and orchestration
• Resource discovery:
  – 1st generation: centralized / hierarchical
  – 2nd generation: fully distributed (search queries, e.g. "GET file", flooded from peer to peer)
  – 3rd generation: DHT, as sketched below
(Figure: example Chord finger tables on a 3-bit identifier ring with peers 0, 1 and 3. Peer 1: start 2, interval [2,3), successor 3; start 3, [3,5), successor 3; start 5, [5,1), successor 0. Peer 3: start 4, [4,5), successor 0; start 5, [5,7), successor 0; start 7, [7,3), successor 0.)
• Action coordination: centralized (a client submits jobs to the resource discovery service, which issues action orders to the nodes) or fully distributed (action orders propagate from peer to peer).
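A minimal, static sketch of the 3rd-generation (DHT) lookup shown in the Chord figure above: 3-bit identifiers, peers 0, 1 and 3, and finger[i] = successor(n + 2^i). It ignores joins, departures and failures, so it illustrates the routing idea only, not the full Chord protocol.

    M = 3                      # identifier bits: ring of size 2^3 = 8
    NODES = [0, 1, 3]          # peer identifiers, as in the figure above

    def in_interval(x, a, b, inclusive_right=False):
        """True if x lies in the circular interval (a, b), or (a, b] if asked."""
        if a < b:
            return a < x < b or (inclusive_right and x == b)
        return x > a or x < b or (inclusive_right and x == b)   # interval wraps

    def successor(ident):
        """First peer clockwise from ident (consistent-hashing successor)."""
        for n in sorted(NODES):
            if n >= ident:
                return n
        return min(NODES)

    def finger_table(n):
        """finger[i] = successor((n + 2^i) mod 2^M)."""
        return [successor((n + 2**i) % (2**M)) for i in range(M)]

    def lookup(start_node, key):
        """Iterative Chord-style routing to the peer responsible for key."""
        n = start_node
        while True:
            succ = finger_table(n)[0]                 # immediate successor of n
            if in_interval(key, n, succ, inclusive_right=True):
                return succ                           # succ stores the key
            for f in reversed(finger_table(n)):       # closest preceding finger
                if in_interval(f, n, key):
                    n = f
                    break
            else:
                return succ

    print(finger_table(1))     # [3, 3, 0], matching the figure's table for peer 1
    print(finger_table(3))     # [0, 0, 0], matching the figure's table for peer 3
    print(lookup(3, 1))        # key 1 is resolved to peer 1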