Plug-in Scheduler Design for a Distributed Environment

Plug-in Scheduler Design for a Distributed Environment - Eddy Caron - PDF document



  1. Plug-in Scheduler Design for a Distributed Environment
     Eddy Caron, Andreea Chis, Frederic Desprez, Alan Su
     ANR-05-CIGC-11

     Outline
     - 1 Grid and Grid-RPC
     - 2 Related Work
     - 3 DIET Overview
     - 4 DIET Scheduling
     - 5 Experimentation
     - 6 Conclusion and future work

  2. Grid and Grid-RPC
     - Computational Grids: sharing, selection, and aggregation of geographically distributed computational resources, presenting them as a single unified resource for solving large-scale compute- and data-intensive applications
     - Grid Platforms
       - Heterogeneous computational resources
       - Irregular network topologies
       - Dynamic resource performance

  3. Grid and Grid-RPC
     - Resource management: a crucial aspect of efficient Grid environments
     - Application Service Provider (ASP) model
       - Semi-transparent access to computing servers
       - Coarse granularity
       - Easy to use even for non-experts
     - Close to the Remote Procedure Call (RPC) paradigm
     - Middleware to facilitate clients' access to remote resources
     - Network Enabled Servers (NES): Ninf, NetSolve, OmniRPC, DIET

     The Grid-RPC Paradigm
     - Developed by the Global Grid Forum (now OGF)
     - Standardizes the API of the RPC used by NES
     - Based on 5 entities:
       - Client: user's interface and request submission to servers
       - Server: receives clients' requests and executes software modules on their behalf
       - Database: static and dynamic information about hardware and software resources
       - Scheduler: receives requests and makes mapping decisions based on information from the database
       - Monitor: observes the status of resources and stores that information in the database

  4. The Grid-RPC Paradigm
     [Figure: interaction between Client, Agent, and Server, with arrows labeled Registering, Request, Identifier, Call, and Results]

     The Grid-RPC Paradigm
     - AGENT (Registry or Resource Management System)
       - Central component of Grid-RPC systems
       - Chooses servers able to solve a request on behalf of clients
       - Main task: load-balancing between servers
       - Gets information about available servers
       - Asks the performance database for information
     - Some scalability problems may occur
       - Unique scheduler
       - Unique resource management system
       - Centralized (or duplicated) in NetSolve or Ninf
       - Distributed in DIET

  5. Related Work
     - Few middleware systems allow scheduling internals to be tuned for a specific application
     - APST: allows choosing the scheduling heuristic (Max-min, Min-min, X-sufferage)
     - GrADS & AppLeS: scheduling heuristics for different application classes
     - GrADS: application-specific performance models
       - Program Preparation System
       - Program Execution System
       - Binder
     - Recent work towards coping with dynamic platform performance at runtime
     - Condor: ClassAds language

  6. DIET Overview
     - DIET stands for Distributed Interactive Engineering Toolbox
     - Hierarchical architecture for improved scalability
     - Implemented in CORBA, thus benefiting from the standardized, stable services provided by freely available, high-performance CORBA implementations

  7. DIET Overview
     DIET Platform Architecture
     - Client: an application that uses the DIET infrastructure to solve problems
     - Servers (SeDs): perform computations on data sent by a client
     - Agents: facilitate the service location and invocation interactions of clients and SeDs
     [Figure: DIET hierarchy with a Client, Master Agents (MA), Local Agents (LA), and SeDs]

     DIET Overview
     Progress of a DIET call
     - The client requests a service from a Master Agent (MA)
     - The MA propagates the client request through its subtrees
     - Each SeD responds with a profile and performance estimation
     - LAs sort their children's responses and forward them up the hierarchy
     - The MA returns a list of candidate SeDs to the client
     - The client sends input data and the SeD launches the service
     [Figure: a Client, an MA, an LA, and three SeDs]
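The call sequence above can be illustrated with a small Python sketch. It is not DIET code: the `SeD` and `Agent` classes, the `submit` helper, the service names, and the predicted times are all hypothetical, and a single number stands in for the profile and performance estimation that each SeD returns.

```python
from dataclasses import dataclass, field
from typing import List, Set, Tuple

Response = Tuple[float, str]  # (predicted time in seconds, SeD name)


@dataclass
class SeD:
    name: str
    services: Set[str]
    predicted_time: float  # hypothetical performance estimate

    def handle(self, service: str) -> List[Response]:
        # A SeD answers only for services it actually offers.
        if service in self.services:
            return [(self.predicted_time, self.name)]
        return []


@dataclass
class Agent:
    name: str
    children: list = field(default_factory=list)

    def handle(self, service: str) -> List[Response]:
        # Propagate the request to every child (agent or SeD),
        # then sort the merged responses before forwarding them up.
        responses: List[Response] = []
        for child in self.children:
            responses.extend(child.handle(service))
        return sorted(responses)


def submit(master: Agent, service: str) -> List[str]:
    # The client receives an ordered list of candidate SeD names.
    return [name for _, name in master.handle(service)]


la1 = Agent("LA1", [SeD("sed1", {"dgemm"}, 4.0), SeD("sed2", {"dgemm"}, 2.5)])
la2 = Agent("LA2", [SeD("sed3", {"fft"}, 1.0), SeD("sed4", {"dgemm"}, 3.0)])
ma = Agent("MA", [la1, la2])

candidates = submit(ma, "dgemm")
print(candidates)
```

In this sketch the client would then contact the first candidate directly and send its input data; that final step is omitted.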

  8. DIET Scheduling
     - First version: FIFO principle
     - Mono-criteria scheduling based on application-specific performance predictions
     - Round-robin scheduling scheme
     - Plug-in schedulers

  9. DIET Scheduling
     SeD level
     - Performance estimation vector: dynamic collection of performance estimation values (modular design)
       - Performance measures available through DIET
         - FAST-NWS performance metrics
         - Time elapsed since the last execution
         - CoRI-Easy
       - Developer-defined values
     - Performance Estimation Function
     - Standard estimation tags for accessing the fields of a performance estimation vector: EST_FREEMEM, EST_TCOMP, EST_TIMESINCELASTSOLVE, EST_FREECPU

     DIET Scheduling
     Standard Estimation Tags

     | Information tag (starts with EST_) | Multi-value | Explanation |
     |---|---|---|
     | TCOMP | | the predicted time to solve a problem |
     | TIMESINCELASTSOLVE | | time since last execution start (sec) |
     | FREECPU | | amount of free CPU (between 0 and 1) |
     | LOADAVG | | CPU load average |
     | FREEMEM | | amount of free memory |
     | NBCPU | | number of available processors |
     | CPUSPEED | x | frequency of CPUs (MHz) |
     | TOTALMEM | | total memory size (Mb) |
     | BOGOMIPS | x | the BogoMips |
     | CACHECPU | x | cache size of CPUs (Kb) |
     | TOTALSIZEDISK | | size of the partition (Mb) |
     | FREESIZEDISK | | amount of free space on the partition (Mb) |
     | DISKACCESSREAD | | average time to read from disk (Mb/s) |
     | DISKACCESSWRITE | | average time to write to disk (Mb/s) |
     | ALLINFOS | x | [empty] fill all possible fields |
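As a rough illustration of a performance estimation vector and a performance estimation function, the Python sketch below stores values under the standard tags listed above. The `EstimationVector` class, the `estimate` function, and the developer-defined tag `USER_QUEUE_LENGTH` are hypothetical and do not reproduce the DIET API; the sketch assumes the multi-value tags hold one number per CPU.

```python
from typing import Dict, List, Optional, Union

Value = Union[float, List[float]]

# Tags marked "x" (multi-value) in the standard tag table.
# EST_ALLINFOS is left out: it asks for all fields rather than storing data.
MULTI_VALUE_TAGS = {"EST_CPUSPEED", "EST_BOGOMIPS", "EST_CACHECPU"}


class EstimationVector:
    """Hypothetical container mapping estimation tags to values."""

    def __init__(self) -> None:
        self._fields: Dict[str, Value] = {}

    def set(self, tag: str, value: Value) -> None:
        # Multi-value tags must hold a list, all other tags a scalar.
        if (tag in MULTI_VALUE_TAGS) != isinstance(value, list):
            raise TypeError(f"wrong value shape for {tag}")
        self._fields[tag] = value

    def get(self, tag: str, default: Optional[Value] = None) -> Optional[Value]:
        return self._fields.get(tag, default)


def estimate(load_avg: float, bogomips_per_cpu: List[float],
             seconds_since_last_solve: float) -> EstimationVector:
    """Hypothetical estimation function run on the SeD for each request."""
    vec = EstimationVector()
    vec.set("EST_LOADAVG", load_avg)
    vec.set("EST_BOGOMIPS", list(bogomips_per_cpu))
    vec.set("EST_NBCPU", float(len(bogomips_per_cpu)))
    vec.set("EST_TIMESINCELASTSOLVE", seconds_since_last_solve)
    # Developer-defined value stored alongside the standard tags.
    vec.set("USER_QUEUE_LENGTH", 3.0)
    return vec


vec = estimate(0.5, [4000.0, 4000.0], 42.0)
print(vec.get("EST_NBCPU"), vec.get("EST_BOGOMIPS"))
```

Keeping the vector as an open mapping is what lets developer-defined values sit next to the standard measurements without changing the scheduler.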

  10. DIET Scheduling
     Aggregation Methods
     - Mechanism defining how to sort SeD responses: associated with the service and defined at the SeD level
     - Tunable comparison/aggregation routines for scheduling
     - Priority Scheduler
       - Performs pairwise comparisons of server estimations and returns a sorted list of server responses
       - Can minimize or maximize SeD estimation values, taking into account the order in which those performance estimations were requested at the SeD level

     DIET Scheduling
     Collector of Resource Information (CoRI)
     - Platform performance subsystem
     - Enables easy interfacing with third-party performance monitoring and prediction tools
     - Aims
       - Provide basic measurements that are available regardless of the state of the system
       - Manage the simultaneous use of different performance prediction systems within a single heterogeneous platform
     [Figure: CoRI Manager with the CoRI-Easy module]
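One way to read the priority scheduler is as a lexicographic sort driven by a pairwise comparator. The Python sketch below is an illustration under that assumption, not DIET's implementation: the `make_comparator` helper, the response layout, and the sample values are hypothetical, and every response is assumed to carry all requested tags.

```python
from functools import cmp_to_key
from typing import Callable, Dict, List, Tuple

MAXIMIZE, MINIMIZE = "max", "min"
Response = Dict[str, object]


def make_comparator(priorities: List[Tuple[str, str]]) -> Callable[[Response, Response], int]:
    """Build a pairwise comparator from an ordered list of (direction, tag)."""

    def compare(a: Response, b: Response) -> int:
        # Criteria are tried in the order they were declared;
        # the first one on which the two servers differ decides.
        for direction, tag in priorities:
            va, vb = a["est"][tag], b["est"][tag]
            if va == vb:
                continue
            a_better = va > vb if direction == MAXIMIZE else va < vb
            return -1 if a_better else 1
        return 0  # equal on every criterion

    return compare


responses = [
    {"sed": "sed1", "est": {"EST_FREECPU": 0.9, "EST_TCOMP": 12.0}},
    {"sed": "sed2", "est": {"EST_FREECPU": 0.9, "EST_TCOMP": 8.0}},
    {"sed": "sed3", "est": {"EST_FREECPU": 0.4, "EST_TCOMP": 5.0}},
]

# Maximize free CPU first, then minimize predicted computation time.
priorities = [(MAXIMIZE, "EST_FREECPU"), (MINIMIZE, "EST_TCOMP")]
ranked = sorted(responses, key=cmp_to_key(make_comparator(priorities)))
print([r["sed"] for r in ranked])
```

Here `sed2` is ranked ahead of `sed1` only because the second criterion breaks the tie on free CPU, which is the role the declaration order plays.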

  11. DIET Scheduling: CoRI Manager
     - Access to different collectors
     - Modular design
     - Great deal of extensibility
     [Figure: CoRI Manager connected to the FAST Collector, the CoRI-Easy Collector, and other collectors such as Ganglia]

     CoRI collector: FAST
     [Figure from Martin Quinson, PhD thesis, 2003]
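The modular manager/collector arrangement can be sketched in Python as a registry that queries collectors in turn. All class names here (`CollectorManager`, `StaticCollector`, `BasicCollector`) and the "first collector with an answer wins" policy are assumptions made for illustration; the slides do not specify how the CoRI Manager arbitrates between collectors.

```python
import os
from typing import Dict, List, Optional, Tuple


class BasicCollector:
    """Stand-in for a CoRI-Easy-like collector: standard library only."""

    name = "easy"

    def collect(self, tag: str) -> Optional[float]:
        if tag == "EST_NBCPU":
            return float(os.cpu_count() or 1)
        return None  # this collector has no value for the tag


class StaticCollector:
    """Stand-in for a third-party tool; returns canned values."""

    def __init__(self, name: str, values: Dict[str, float]) -> None:
        self.name = name
        self.values = values

    def collect(self, tag: str) -> Optional[float]:
        return self.values.get(tag)


class CollectorManager:
    """Hypothetical manager that gives uniform access to several collectors."""

    def __init__(self) -> None:
        self._collectors: List[object] = []

    def register(self, collector: object) -> None:
        self._collectors.append(collector)

    def collect(self, tag: str) -> Optional[Tuple[str, float]]:
        # Ask collectors in registration order; the first answer wins.
        for collector in self._collectors:
            value = collector.collect(tag)
            if value is not None:
                return collector.name, value
        return None


manager = CollectorManager()
manager.register(StaticCollector("ganglia-like", {"EST_LOADAVG": 0.7}))
manager.register(BasicCollector())  # basic fallback, always available

print(manager.collect("EST_LOADAVG"))
print(manager.collect("EST_NBCPU"))
```

Registering the basic collector last mirrors the stated aim of always having basic measurements available when more specialised tools cannot answer.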

  12. CoRI collector: CoRI-Easy
     - Resource collector that provides basic performance measurements of the SeD
     - Extensible like the CoRI Manager
     - Information available
       - CPU evaluation (number, frequency, cache size, BogoMips, load average, utilization)
       - Memory capacity (total size, available)
       - Disk performance and capacity (read-write speed, capacity, free capacity)

  13. Experimentation
     [Figure: experimental hierarchy with an MA, two LAs, and SeDs]
     Toy platform (ENS-Lyon, France):
     - 7 servers: P4 2.4 GHz, 256 MB memory
     - 2 servers: Intel P4 Xeon 2.4 GHz, 1 GB memory

     CPU Experimentation
     - CPU Scheduler: priority scheduler that maximizes the ratio BOGOMIPS / (1 + load average)
     - RR Scheduler (Round Robin): priority scheduler that maximizes the time elapsed since the last execution start
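The two ranking criteria can be compared on a few made-up servers. In the Python sketch below the server names and all numbers are invented for illustration and are not measurements from the experiment; `EST_BOGOMIPS` is treated as a single aggregate number per server, which is a simplifying assumption.

```python
from typing import Callable, Dict, List

Estimation = Dict[str, float]


def cpu_score(est: Estimation) -> float:
    # CPU scheduler criterion: BOGOMIPS / (1 + load average), to be maximized.
    return est["EST_BOGOMIPS"] / (1.0 + est["EST_LOADAVG"])


def rr_score(est: Estimation) -> float:
    # Round-robin criterion: time since the last execution start, to be maximized.
    return est["EST_TIMESINCELASTSOLVE"]


def rank(servers: Dict[str, Estimation],
         score: Callable[[Estimation], float]) -> List[str]:
    # Highest score first.
    return sorted(servers, key=lambda name: score(servers[name]), reverse=True)


# Invented sample values, not taken from the slides.
servers = {
    "p4-a": {"EST_BOGOMIPS": 4000.0, "EST_LOADAVG": 0.0, "EST_TIMESINCELASTSOLVE": 30.0},
    "p4-b": {"EST_BOGOMIPS": 4000.0, "EST_LOADAVG": 1.0, "EST_TIMESINCELASTSOLVE": 120.0},
    "xeon": {"EST_BOGOMIPS": 4800.0, "EST_LOADAVG": 2.0, "EST_TIMESINCELASTSOLVE": 5.0},
}

cpu_order = rank(servers, cpu_score)
rr_order = rank(servers, rr_score)
print(cpu_order)
print(rr_order)
```

With these values the CPU criterion prefers the idle machine even though another has a higher raw BogoMips rating, while round robin picks whichever server has waited longest regardless of load.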

  14. The CPU Scheduler
     - Request interleave time: 5 seconds
     - CPU scheduler

     The CPU Scheduler
     - Request interleave time: 5 seconds
     - Round Robin scheduler

  15. The CPU Scheduler
     - Request interleave time: 10 seconds
     - CPU scheduler

     The CPU Scheduler
     - Request interleave time: 10 seconds
     - Round Robin scheduler

  16. The CPU Scheduler
     - Request interleave time: 1 minute
     - CPU scheduler

     The CPU Scheduler
     - Request interleave time: 1 minute
     - RR scheduler

  17. The CPU Scheduler
     - CPU vs RR scheduler: total computation time
     - Request inter-arrival time: 1 minute

     The CPU Scheduler
     - Request interleave time: 1 minute
     - CPU scheduler: average task time ~ 2 minutes

  18. The CPU Scheduler
     - Request interleave time: 1 minute
     - CPU scheduler: average task time ~ 3 minutes

     I/O Experimentation
     - I/O Scheduler: priority scheduler that maximizes the disk write speed
     - RR Scheduler (Round Robin): priority scheduler that maximizes the time elapsed since the last execution start

  19. I/O Experimentation
     - Request interleave time: 25 seconds
     - I/O scheduler

     I/O Experimentation
     - Request interleave time: 25 seconds
     - RR scheduler

  20. I/O Experimentation
     - Request interleave time: 25 seconds
     - RR scheduler: 100 requests

     I/O Experimentation
     - Request interleave time: 35 seconds
     - I/O scheduler: 100 requests
