1
UHD Grid Computing Team Department of Computer and Mathematical Sciences University of Houston-Downtown
Outline: Application Projects, Grid Laboratory, Architecture, Interface, Clusters, Historical Review, An Integrated
2
3
Java Application
Hierarchical Design
Labs -> Tasks -> Activities
Explorer
Plug-ins to run applications
4
Solution: a lab explorer
Solution: an array of servers that run on a
5
Interface of functional units Client/server model Lab modules Module language Architecture specification
Design theme Application theme
Multi-agent system
6
7
The Main Menu Frame End Screens Cards Tree Panel Upper Toolbar Lower Toolbar Save Open Print Next
Previous
Last Help First
8
Frame title & icon Screen menu
Tree
Upper ToolBar Screen panel ScrollPane Lower ToolBar Button
9
Open or Save lab Message
The file *.mla
10
Print layout frame Print Dialog Data Table Print Table Print Dialog Scroll Pane
11
Help Frame
12
Visual C++ MPICH JumpShot
13
Services menu Scroll pane Radio Button Text area Text field Scroll Menu
Node Calling Menu
14
15
16
http://www.netlib.org/utk/papers/mpi-book/node2.html
17
18
19
Quicksort: time by number of processors (values as labeled on the charts)
Elements      1p      2p      4p      8p      16p
10,000        0.042   0.0145  0.0129  0.0158  0.0191
100,000       0.162   0.181   0.198   0.189   0.224
1,000,000     1.49    1.62    1.51    1.34    1.34
10,000,000    16.085  14.68   14.58   14.15   12.49
20
Merge Sort: time by number of processors (values as labeled on the charts)
Elements      1p      2p      4p      8p      16p
10,000        0.0491  0.0434  0.0302  0.031   0.029
100,000       0.248   0.223   0.179   0.147   0.155
10,000,000    31.75   25.72   17.89   13.85   12.14
Mergesort Performance Increase for 1,000,000 elements: percent labels 16.59%, 51.70%, 99.25%, 118.85% (apparently for 2p, 4p, 8p, 16p), with a value of 2.67 at 1p
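The timings on these charts can be turned into speedup and efficiency figures. The sketch below uses the standard definitions (speedup is the 1p time divided by the p-processor time; efficiency is speedup divided by p). The function names are illustrative and do not come from the slides.

```cpp
#include <cassert>
#include <vector>

// Speedup of a parallel run relative to the one-processor run: S(p) = T(1) / T(p).
double speedup(double serialTime, double parallelTime) {
    assert(serialTime > 0.0 && parallelTime > 0.0);
    return serialTime / parallelTime;
}

// Parallel efficiency: speedup divided by the number of processors used.
double efficiency(double serialTime, double parallelTime, int processors) {
    assert(processors > 0);
    return speedup(serialTime, parallelTime) / processors;
}

// Speedups for a whole series of timings; times.front() is taken as the 1p baseline.
std::vector<double> speedups(const std::vector<double>& times) {
    std::vector<double> result;
    for (double t : times) {
        result.push_back(speedup(times.front(), t));
    }
    return result;
}
```

Applied to the 10,000,000-element merge sort timings (31.75 at 1p down to 12.14 at 16p), this gives a speedup of about 2.6 on 16 processors, or an efficiency of roughly 16%.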
21
22
23
24
25
26
27
ABSTRACT
The construction and performance of computer clusters running different operating systems are studied. A Windows XP cluster and a Linux ‘Beowulf’ cluster were constructed to conduct a time-based analysis. Details of the construction, configuration, and relative performance of the clusters are discussed.
INTRODUCTION
The typical von Neumann architecture has directed us to increase processing power through more transistors, larger address spaces, and more physical memory. However, a more efficient way is message passing between multiple processors. Message passing achieves parallelism through functions that explicitly transmit data from one process to another. The Message Passing Interface (MPI) defines a library of functions that can be called from C/C++ and FORTRAN programs. MPI programs make use of multiple processors by assigning each processor a task; the processors work in parallel and exchange data, with one process sending a message and another receiving it. MPI programs are designed to operate most efficiently on multiple processors and are widely used on Scalable Parallel Computers (SPCs) and Networks of Workstations (NOWs). A ‘cluster’ is a collection of computers (2 or more) working in parallel to accomplish a given task. Here, two different clusters were constructed to run different sorting algorithms and sample MPI programs. For convenience, a Java applet was also developed to launch these programs. Cluster construction and performance results are discussed below.
Khoi Nguyen, Computer & Mathematical Sciences, University of Houston – Downtown Advisor: Dr. Hong Lin Fall 2004
CLUSTER CONSTRUCTION
XP Cluster 2 nodes: AMD Athlon 1.33 GHz and Pentium III 850 MHz w/ 512 MB system memory, linked via a router/switch (see Figure 2).
Router/Switch
Linux Beowulf Cluster 16 nodes: (15) Pentium II 350 MHz w/ 128 MB system memory and (1) server node: Pentium 550 MHz w/ 256 MB system memory. All nodes were linked via a 10/100 Mbps Ethernet LAN switch. A KVM switch was installed for
SORTING ALGORITHMS
Parallel implementations of Merge-sort (O(log_2 n)) and Bitonic-sort (O((log_2 n)^2 / 2)) were used to conduct the time-based
as control variables.
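For readers unfamiliar with bitonic sort, here is a minimal serial sketch of the sorting network. It is not the poster's parallel MPI implementation, which is not shown, and it assumes the input length is a power of two. In every pass of the inner loop the compare-exchange pairs are disjoint, which is what makes the algorithm attractive for parallel hardware. The number of such passes is log_2 n (log_2 n + 1) / 2, which is where the roughly (log_2 n)^2 / 2 step count comes from.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Sorts `a` in ascending order with the bitonic sorting network.
// Assumption: a.size() is a power of two.
void bitonicSort(std::vector<int>& a) {
    const std::size_t n = a.size();
    assert(n != 0 && (n & (n - 1)) == 0);

    // k: length of the bitonic sequences being merged in this phase.
    for (std::size_t k = 2; k <= n; k <<= 1) {
        // j: distance between the two elements of each compare-exchange pair.
        for (std::size_t j = k >> 1; j > 0; j >>= 1) {
            // All pairs in this pass are disjoint, so they could run in parallel.
            for (std::size_t i = 0; i < n; ++i) {
                const std::size_t partner = i ^ j;
                if (partner > i) {
                    const bool ascending = (i & k) == 0;
                    if ((ascending && a[i] > a[partner]) ||
                        (!ascending && a[i] < a[partner])) {
                        std::swap(a[i], a[partner]);
                    }
                }
            }
        }
    }
}
```

For 8 elements the two outer loops make 6 passes in total (1 + 2 + 3), each touching 4 disjoint pairs.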
XP CLUSTER CONFIGURATION
The configuration for this cluster required an older protocol, NetBEUI, rather than the more widely used TCP/IP. The MPICH installation was mirrored on each node; user information and passwords had to be identical, and the executable program file had to be in the same location on each node. Either node could function as the server at the user’s discretion: whichever node launched the program became the server. MPICH ran processes in a ‘round-robin’ fashion.
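To make the ‘round-robin’ placement concrete, the sketch below assigns process ranks to hosts in turn. It only illustrates the idea and is not MPICH code; the function name and the host names in the usage note are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Returns, for each rank 0..numProcesses-1, the host it would be placed on
// when hosts are handed out in round-robin order.
std::vector<std::string> assignRoundRobin(const std::vector<std::string>& hosts,
                                          int numProcesses) {
    assert(!hosts.empty() && numProcesses >= 0);
    std::vector<std::string> placement;
    for (int rank = 0; rank < numProcesses; ++rank) {
        // Wrap around to the first host after the last one has been used.
        placement.push_back(hosts[static_cast<std::size_t>(rank) % hosts.size()]);
    }
    return placement;
}
```

With two hosts, say "nodeA" and "nodeB", and five processes, ranks 0, 2, and 4 land on nodeA and ranks 1 and 3 on nodeB.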
BEOWULF CLUSTER CONFIGURATION
This architecture requires the installation of a Linux distribution on each node. One node functions as a server with which the user interacts directly. The rest of the nodes serve as computational slaves (see Figure 3). Fedora (latest Red Hat) was installed on each node. Remote Shell or ‘rsh’ was used for communication between server and
latest MPICH distribution for UNIX was installed in the same directory on each node. Sample programs included in the distribution were tested on 8 nodes.
Figure 3 Figure 2 Figure 1
Figure labels: Workstations, LAN Switch, KVM Switch
Chart: XP Cluster – Mergesort – 2 Processors. Serial vs. parallel time (s) by number of elements (10,000; 100,000; 1,000,000; 10,000,000).
Chart: CPI Test Program - Beowulf. Time (s) by number of processors:
1p 0.942, 2p 0.65, 3p 0.468, 4p 0.367, 5p 0.315, 6p 0.356, 7p 0.354, 8p 0.257, 9p 0.229, 10p 0.307
TO BE CONTINUED…
Inconclusive and inconsistent data from the XP cluster have led me to choose the Beowulf design. MPI programs compete for resources under XP, so priority scheduling is required to run them efficiently. This is a rather cumbersome and inconvenient process; moreover, the Beowulf cluster offers more flexibility in
continued research is currently being conducted on the Beowulf cluster.
Server
28
29
30
31
Coordinating a network of traffic lights on inner-city streets, such as downtown, can be very challenging
Traffic configurations change constantly
One might have to wait for the light to turn green at an intersection where there is no car on the cross street
How long should the light stay green on a busy street?
Timed traffic lights
Sensors
32
33
Consists of two types of agents
Communication Agents -- send traffic information to and receive it from their neighbors
Computation Agents -- decide whether to keep or change the current traffic light based on the information from the Communication Agent
34
Number of cars passing through the intersection
Number of cars waiting at the intersection
Time the light has stayed green
Waiting time for a red light to change to green
Average speed of current traffic
35
36
37
containing a total of 100 traffic lights.
(moderate) and 500 (light traffic).
street or direction.
the timed traffic lights with a checkerboard pattern (no adjacent
intersections have the same light),
Simple traffic controllers. Each controller based its decision on the number of cars on the two sides of the intersection and the maximum time that any car can be made to wait at a red light.
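One way such a simple controller could be written is sketched below. The slides give only the inputs (car counts on the two sides and a maximum waiting time), so the decision rule, the type names, and the field names here are assumptions for illustration, not the simulation's actual code.

```cpp
#include <cassert>

// Which street currently has (or should get) the green light.
enum class Green { NorthSouth, EastWest };

// Snapshot of one intersection (hypothetical layout).
struct IntersectionState {
    Green green;         // direction that currently has green
    int carsNorthSouth;  // cars on the north-south street
    int carsEastWest;    // cars on the east-west street
    int redWaitSeconds;  // how long the red direction has been waiting
};

// Returns the direction that should have green next.
Green decide(const IntersectionState& s, int maxWaitSeconds) {
    assert(maxWaitSeconds >= 0);
    const bool nsGreen = (s.green == Green::NorthSouth);
    const Green other = nsGreen ? Green::EastWest : Green::NorthSouth;
    const int carsOnGreen = nsGreen ? s.carsNorthSouth : s.carsEastWest;
    const int carsOnRed = nsGreen ? s.carsEastWest : s.carsNorthSouth;

    if (carsOnRed == 0) return s.green;                    // nobody is waiting
    if (s.redWaitSeconds >= maxWaitSeconds) return other;  // waiting-time cap reached
    return (carsOnRed > carsOnGreen) ? other : s.green;    // favor the busier side
}
```

Under this rule a light never turns green for an empty cross street, and no waiting car is held at red beyond the configured cap.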
38
Intelligent Agent Traffic Controllers Standard Traffic Controllers
39
Standard traffic lights Simple intelligent agent controlled traffic lights
IA traffic lights are 20% to 90% more efficient
40
41
42
Chart: Datamining Runtime (runtime vs. # of nodes: 1, 8, 16)
Chart: Datamining Speedup (speedup vs. # of nodes: 1, 8, 16)
43
Command line
Network socket – to service web requests
44
Maintenance Agent Control Agent Student Information Agent Instructor Notification and Recommendation Agent Control Agent Control Agent Master Control Agent Link DB Student Registrar Student DB
45
Main entry point into the system
Instructor
Student
Registrar
46
Process creation limitation Eliminates 1:Many relationship between control and
47
JOB INPUT JOB
struct JOB {
    int cmdSource;
    int cmdDestination;
    int cmdCode;
    int cmdResult;
    JOB_SOURCE ioSource;
    Socket* jobSocket;
    int messageLength;
    char messageInfo[MAX_STRING_SIZE];
};

struct INPUTJOB {
    int jobType;
    int cmdCode;
    Socket* jobSocket;
    char commandText[MAX_STRING_SIZE];
};
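The sketch below shows one plausible way an incoming INPUTJOB could be turned into a JOB for routing to an agent. The slide does not show `Socket`, `JOB_SOURCE`, or `MAX_STRING_SIZE`, so the placeholder definitions here are assumptions (the two enumerators mirror the command-line and network-socket inputs listed earlier in the deck), and `makeJob` is a hypothetical helper, not part of the project's code.

```cpp
#include <cassert>
#include <cstring>

// Placeholders so the sketch compiles on its own (assumed, not from the slides).
const int MAX_STRING_SIZE = 256;                         // assumed buffer size
struct Socket;                                           // opaque; only pointers are used
enum JOB_SOURCE { SOURCE_COMMAND_LINE, SOURCE_SOCKET };  // assumed enumerators

struct JOB {
    int cmdSource;
    int cmdDestination;
    int cmdCode;
    int cmdResult;
    JOB_SOURCE ioSource;
    Socket* jobSocket;
    int messageLength;
    char messageInfo[MAX_STRING_SIZE];
};

struct INPUTJOB {
    int jobType;
    int cmdCode;
    Socket* jobSocket;
    char commandText[MAX_STRING_SIZE];
};

// Hypothetical helper: builds a JOB from an INPUTJOB plus routing information.
JOB makeJob(const INPUTJOB& in, int source, int destination, JOB_SOURCE ioSource) {
    JOB job{};  // zero-initialize every field, including the message buffer
    job.cmdSource = source;
    job.cmdDestination = destination;
    job.cmdCode = in.cmdCode;
    job.ioSource = ioSource;
    job.jobSocket = in.jobSocket;
    // Bounded copy so an unterminated commandText cannot overflow messageInfo.
    std::strncpy(job.messageInfo, in.commandText, MAX_STRING_SIZE - 1);
    job.messageInfo[MAX_STRING_SIZE - 1] = '\0';
    job.messageLength = static_cast<int>(std::strlen(job.messageInfo));
    return job;
}
```

Keeping the message in a fixed-size character array gives every JOB the same size, at the cost of capping the message length.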
48
Contains MPI communication and process creation
All agent classes inherit from this class
Demos for each agent class represented
CGI program used to communicate with the Agent
49
The ultimate goal of this project is to formulate a formal system for creating multi-agent systems (MAS) so that one no longer has to rely on a high-level specification language. This will be accomplished by creating a gamma-calculus parser and running the parser on a prototype to formulate a method for a formal system of creating multi-agent systems. As it stands, a prototype E-Learning MAS has been created, along with a preliminary Beliefs-Desires-Intentions (BDI) model that uses argumentation-based negotiation.
Methods of implementation are as follows:
1. First, it was imperative to create a model MAS on which to run the calculus parser. The chosen model was an E-Learning Environment MAS. This model was built using four main agents to distribute tasks: Master-Control Agent, Student Agent, Instructor Agent, and Registrar Agent. These agents handle registration and enrollment of students and the management of course content.
2. Second, create advanced logic to run with these agents. The logic chosen was argumentation-based negotiation. Using this with a BDI model, the agents argue among themselves to achieve their particular goals. What is important about this model is that each agent argues for its beliefs, and if other agents are coerced, they will create a compromise among themselves.
3. Third, create a gamma-calculus parser to run on the MAS. This will allow data to be collected and interpreted to formulate a method for the development of a formal system of creating multi-agent systems.
The multi-agent system prototype has been completed. Using MPI, the MAS divides tasks and sends each task to the appropriate agent. This system runs concurrently with a server/client socket structure. The Master-Control agent handles information received from the server socket, which waits for a client on the same machine to communicate. This client gathers its information through Apache and JavaServer Pages. Thus the client of this system is a simple web page in which a user enters data to be used by the MAS. The argumentation-based negotiation logic with the BDI model has been preliminarily
argues beliefs of classes that cannot be taught and classes that must be taught. When the system begins, the instructor argues for the class it wants and the registrar responds, arguing for any changes it may need to make. This is similar to what goes on between the student and the registrar, in which the student has a list of desired classes and must argue to get the registrar to accept its proposal. Current work is to make the model more complex and add a visual aid to the program. All of this was done using Jason with AgentSpeak.
55
Z X Y
refinement
Step 1 Step 2 Step 3 Step 4
56
57
58
Chart: P3DR Runtime (runtime vs. # of nodes: 1, 2, 4, 8, 16)
Chart: P3DR Speedup (speedup vs. # of nodes: 1, 2, 4, 8, 16)
59
60
61