X10 ● Cluster ● SSH access ● X10 on your PC ● Eclipse for X10: x10dt ● From Eclipse to the cluster
Cluster ● Access via ssh through labsrv0.math.unipd.it ● Cluster description: http://numlab.math.unipd.it/ ● labsrv0 is the submit host. The clustering software is Torque/Maui, an evolution of PBS: http://en.wikipedia.org/wiki/Portable_Batch_System/
Access via SSH ● Basic commands ssh : remote terminal connection scp : file copy between hosts ssh-keygen : cryptographic key management ● Access without a password, instructions at: http://www.linuxproblem.org/art_9.html in summary: a) local: ssh-keygen -t dsa (to generate a key pair) b) local: scp .ssh/id_dsa.pub user@remote:/tmp (copy the public key) c) remote: cat /tmp/id_dsa.pub >> $HOME/.ssh/authorized_keys
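Step c) can be sketched as a small helper that is safe to run more than once. The name `install_pubkey` is hypothetical (it is not an OpenSSH command), and the permission settings are an assumption based on the usual sshd behaviour of rejecting key files that other users can write to.

```shell
#!/bin/sh
# Sketch of step c): append a public key to authorized_keys, only once.
# install_pubkey is a hypothetical helper name, not an OpenSSH command.
install_pubkey() {
    pubkey_file=$1                      # e.g. /tmp/id_dsa.pub
    ssh_dir=${2:-$HOME/.ssh}            # target directory, default ~/.ssh
    auth_file=$ssh_dir/authorized_keys

    mkdir -p "$ssh_dir" && chmod 700 "$ssh_dir"
    touch "$auth_file" && chmod 600 "$auth_file"

    # -F fixed string, -x whole line, -q quiet: skip keys already present
    if ! grep -qxF -f "$pubkey_file" "$auth_file"; then
        cat "$pubkey_file" >> "$auth_file"
    fi
}
```

On the remote host this would be called as `install_pubkey /tmp/id_dsa.pub`; steps a) and b) stay as listed in the summary.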
X10 on your PC ● Download the archive (pre-built binaries): http://www.x10-lang.org/ ● On Windows you have to install Cygwin: http://www.cygwin.com/ ● Install Java: http://java.sun.com/ ● Extract the archive in a convenient directory (example: /opt/X10). ● Update the PATH and JAVA_HOME environment variables
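The last bullet could look like the following fragment for a shell start-up file. This is a sketch under assumptions: `X10_HOME` is only a convenience variable introduced here, the archive is assumed to contain a `bin` directory, and the default JDK location is a placeholder to adjust.

```shell
#!/bin/sh
# Sketch for ~/.profile or ~/.bashrc. Both paths are assumptions:
# adjust them to where X10 and Java are actually installed.
X10_HOME=/opt/X10                                    # where the archive was extracted
JAVA_HOME=${JAVA_HOME:-/usr/lib/jvm/default-java}    # hypothetical JDK location
export X10_HOME JAVA_HOME

# Prepend a directory to PATH unless it is already listed.
prepend_path() {
    case ":$PATH:" in
        *":$1:"*) ;;                   # already present: nothing to do
        *) PATH=$1:$PATH ;;
    esac
}

prepend_path "$X10_HOME/bin"
prepend_path "$JAVA_HOME/bin"
export PATH
```

The `case` guard keeps PATH from growing each time the file is sourced again.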
Installation example
'RUN' of an example
X10 IDE (x10dt) ● Downloadable from http://www.x10-lang.org ● Extract the archive in /opt; this creates the directory /opt/x10dt ● To start the IDE: koriel@righi2:~$ /opt/x10dt/x10dt <RET>
x10dt start: choosing a workspace
x10dt: welcome screen
x10dt: base window
x10dt: a new project Choose 'File' menu and then 'New' and 'X10 Project (C++ back-end)'
x10dt: Hello World
x10dt: editing your code
X10dt: run your program
x10dt: remote execution ● First you need to define the remote host connection ● Click near the upper right corner, next to 'X10' ● Open the 'Parallel Runtime' perspective; the x10dt window layout changes...
x10dt: remote execution ● In the new x10dt window layout, right-click the 'Resource manager' section ● Choose 'Add resource manager' ● Choose 'Remote Launch'
x10dt: remote execution ● In the new wizard choose 'Remote Tools' as remote service provider ● Hit 'New...' to define a new connection name. ● The image shows the necessary parameters; here the username is 'koriel', replace it with your own.
x10dt: start a resource manager
x10dt: running configuration ● When the resource manager is configured and started you can define a run configuration: ● Menu 'Run...' → 'Run Configurations...'; a new window appears. ● Right-click 'Parallel Application' and choose 'New'
x10dt: run: resource manager The right side of the window is now an environment to define our run. You can ● Change the 'Name' to 'Remote Hello' as in the image ● Choose the previously defined resource manager
X10dt: run: choose program ● In this window you define: ● The project that produced your application, here 'Hello' ● The remote application program to run, here 'remote-hello' stored in the directory '/work/koriel/X10' ● Where to find the program to run (path to the local file), here 'Hello' stored in '/home/koriel/x10-work/Hello/bin/Hello' ● In the 'Common' tab you can choose to display the new run configuration in the 'Run' menu of the main window.
X10dt: remote run You can find your run configuration in the general 'Run' menu. Choosing this item runs your 'Hello' program on the remote host and shows the output at the bottom of the main window: 'Hello World from place 0'
x10dt: something parallel In the 'Run Configuration' setup choose the 'Environment' tab. Here you can define environment variables for your execution. The variable 'X10_NPLACES' defines the number of separate processes that your execution needs. In the image the variable has just been defined (using the 'New....' button) and given the value '4'. You can re-run your program and see the words 'Hello World from place X' four times, with 'X' in 0,...,3.
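Outside the IDE, the same variable can be set from a shell for a single invocation. The sketch below uses a hypothetical helper, `run_with_places`, and assumes `./Hello` is the executable produced by the C++ back-end; neither name comes from x10dt itself.

```shell
#!/bin/sh
# Run a command with X10_NPLACES set only for that one invocation.
# run_with_places is a hypothetical helper name.
run_with_places() {
    places=$1
    shift
    env X10_NPLACES="$places" "$@"
}

# Example (assumes ./Hello is the executable built by the C++ back-end):
#   run_with_places 4 ./Hello
```

Using `env` keeps the variable out of the calling shell, so later commands are not affected by it.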
x10dt: torque/maui at math.unipd.it
Requirements for the cluster of the Department of Mathematics: You need to generate the public/private key pair on 'labsrv0' and then add the public part to the labsrv0 file ${HOME}/.ssh/authorized_keys
Copy the file /home/koriel/known_hosts.x10 to your file ${HOME}/.ssh/known_hosts
This allows every machine of the cluster to call every other machine.
More information on the torque/maui site: http://www.adaptivecomputing.com/
The following is an example of a 'job' file used to submit a job to the cluster:
    #!/bin/sh
    ###
    ### TORQUE DEFINITIONS
    ###
    #PBS -N X10-Fibonacci
    #PBS -r n
    #PBS -M righi@math.unipd.it
    #PBS -e localhost:${HOME}/X10/Fibonacci/Fibonacci.err
    #PBS -o localhost:${HOME}/X10/Fibonacci/Fibonacci.out
    #PBS -q cluster_short
    #PBS -l nodes=4:ppn=4:cluster
    #PBS -l mem=1g
    #PBS -l walltime=1:00:00
    ###
    ### COMMANDS START
    ###
    echo Working directory is $PBS_O_WORKDIR
    cd $PBS_O_WORKDIR
    echo Running on host `hostname`
    echo Time is `date`
    echo Directory is `pwd`
    ### Infiniband Node Conversion
    echo Converting PBS_NODEFILE
    NEW_PBSNODEFILE=`basename $PBS_NODEFILE`
    /cluster/CONF/convert-ethIP-2-ibIP-X10.sh $PBS_NODEFILE $NEW_PBSNODEFILE
    echo MPI-Infiniband-Nodes:
    PBS_NODEFILE=`pwd`/${NEW_PBSNODEFILE}
    echo This jobs runs on:
    ### COMMAND
    export X10_NPLACES=`wc -l < $PBS_NODEFILE`
    export X10_HOSTLIST=`tr [:space:] , < $PBS_NODEFILE`
    echo "PLACES: X10_NPLACES=${X10_NPLACES}"
    echo "LIST: X10_HOSTLIST=${X10_HOSTLIST}"
    echo -n "START: "; date --rfc-3339=ns
    cd bin
    x10 Fibonacci 36
    echo -n "STOP: "; date --rfc-3339=ns
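To see what the two `export` lines of the job file compute, the sketch below applies the same commands to a made-up node file. The host names and the temporary file are hypothetical; on the cluster Torque supplies the real file through `$PBS_NODEFILE`. Note that `tr` also turns the final newline into a comma, so the list ends with a trailing comma. The `paste` variant is shown only as a possible alternative and is not part of the original job file.

```shell
#!/bin/sh
# Build a sample node file: one line per allocated core (hypothetical hosts).
nodefile=$(mktemp)
printf '%s\n' node01 node01 node02 node02 > "$nodefile"

# Same computation as in the job file: one X10 place per line.
# (tr -d ' ' strips the padding some wc implementations add.)
X10_NPLACES=$(wc -l < "$nodefile" | tr -d ' ')

# As in the job file: every whitespace character becomes a comma,
# so the final newline yields a trailing comma.
hostlist_tr=$(tr '[:space:]' , < "$nodefile")

# Alternative (not in the original job file): join lines without a trailing comma.
hostlist_paste=$(paste -s -d , "$nodefile")

echo "PLACES: X10_NPLACES=${X10_NPLACES}"
echo "LIST (tr):    ${hostlist_tr}"
echo "LIST (paste): ${hostlist_paste}"
rm -f "$nodefile"
```

Whether the X10 runtime tolerates the trailing comma is not stated in the slides, so the alternative is worth checking on the cluster before relying on it.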
x10dt: run on cluster – basics 1 ● Every program on the cluster is submitted through a file 'NAME.job'. The command on 'labsrv0' to submit a job is: qsub NAME.job ● This file has to be present on the labsrv0 disk when we start the job. We can keep it as a file in our project, in a folder named 'PBS'. ● To run our program as a job we need to transfer the results of compiling our code, together with the job file, to a convenient directory on labsrv0 and issue the 'qsub' command as described.
x10dt: run on cluster – basics 2 ● We create a project named 'Fibonacci' (Java back-end) ● On compilation, this back-end stores the results in the 'bin' folder of the project ● We transfer the whole 'bin' folder to labsrv0, into the directory $HOME/X10/Fibonacci/bin, and the job file into $HOME/X10/Fibonacci. The paths used in the job file reflect this layout ● These transfers are defined in the 'Synchronize' tab of the 'Run Configuration' and are executed before the actual run ● After the transfers, the 'Run Configuration' calls the 'qsub' program with the job file just transferred, to submit our work to the cluster.
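Done by hand from a shell, the transfers and the submission described above might look like the following dry-run sketch. `REMOTE_USER`, the `run` helper and `submit_fibonacci` are illustrative names, the job file is assumed to be `PBS/fibonacci.job` inside the project, and the commands are only printed unless `DRY_RUN=0` is set.

```shell
#!/bin/sh
# Dry-run sketch of what the 'Run Configuration' automates.
# REMOTE_USER and the helper names are illustrative assumptions.
REMOTE_USER=${REMOTE_USER:-user}
REMOTE=$REMOTE_USER@labsrv0.math.unipd.it
REMOTE_DIR=X10/Fibonacci               # relative to the remote $HOME
DRY_RUN=${DRY_RUN:-1}                  # set DRY_RUN=0 to really execute

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

submit_fibonacci() {
    run ssh "$REMOTE" "mkdir -p $REMOTE_DIR"
    run scp -r bin "$REMOTE:$REMOTE_DIR/"             # compiled classes
    run scp PBS/fibonacci.job "$REMOTE:$REMOTE_DIR/"  # job file
    run ssh "$REMOTE" "cd $REMOTE_DIR && /export/alt/torque/bin/qsub fibonacci.job"
}

submit_fibonacci
```

The `cd` before `qsub` matters because the job file changes to `$PBS_O_WORKDIR`, the directory from which the job was submitted, and then into `bin`.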
x10dt: run configuration – step 1 Create a new project with 'File', 'New', 'X10 Project (Java back-end)'. Name it 'Fibonacci'. Create a new class with 'File', 'New', 'X10 Class' named 'Fibonacci'. Copy the source from the samples distributed with X10.
x10dt: run configuration – step 2 Create a new 'Run Configuration'. Choose the resource manager in the 'Resources' tab, and in the 'Application' tab define '/export/alt/torque/bin/qsub' as the 'Application program'.
x10dt: run configuration – step 3 Define the file 'fibonacci.job' as the argument to the 'qsub' program
x10dt: run configuration – step 4 In the 'Synchronize' tab of the 'Run Configuration' you define the uploads (and, if needed, downloads) to be performed before (after) issuing the 'qsub' command. You can transfer single files or entire folders; in this case we transfer the entire 'bin' folder of our project and the 'fibonacci.job' file
x10dt: run configuration – step 5 Now you can save ('Apply') the 'Run Configuration' and run it. At the bottom of the screen, in the 'Console' tab, you should see a line like: 478411.grid0.math.unipd.it This is the output of the 'qsub' program. Now you have two choices: 1) Connect via ssh to labsrv0 and use the 'qstat' program to monitor the job 2) Define a new 'Run Configuration' that runs the program /export/alt/torque/bin/qstat and, optionally, downloads the files Fibonacci.out and Fibonacci.err (defined in the 'job' file as the output and the error of our program) into the local 'PBS' folder for later examination.
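The identifier printed by 'qsub' has the form number.server. A minimal sketch for picking out the numeric part, for example to label the downloaded output files, is shown below; `job_number` is a hypothetical helper, and the identifier is the sample from the slide.

```shell
#!/bin/sh
# qsub prints the job identifier as <number>.<server>.
# job_number is a hypothetical helper that keeps only the numeric part.
job_number() {
    printf '%s\n' "${1%%.*}"
}

job_id=478411.grid0.math.unipd.it      # sample output from the slide
echo "Job number: $(job_number "$job_id")"

# On labsrv0 the job could then be checked with (not executed here):
#   /export/alt/torque/bin/qstat "$job_id"
```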