OAR Course
In this OAR tutorial section, you will experiment with launching:

- an interactive job using 1 core (default)
- an interactive job using n cores
- an interactive job on 2 compute nodes (useful for MPI development & debugging, not for production!)
- a batch job using one single core
- an interactive parallel run on 1 compute node, using MPI (useful for development & debugging, not for production!)
- an interactive parallel run on 2 compute nodes, using MPI (idem)
- a batch parallel run on 2 compute nodes, using MPI (appropriate for production jobs)
- a simple job array of multi-threaded runs, using OpenMP
- a parametric job array of sequential runs
Please also take notice of some useful OAR features that are not yet covered in this tutorial (see the reading suggestions below):

- best effort jobs (a minimal sketch is given right after this list)
- job containers (useful e.g. for organizing training sessions)
- etc.
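As a quick taste of the first item, a best effort job is submitted with the besteffort type. This is only a hedged sketch, not covered further in this tutorial; my-script.sh is a hypothetical script name:

# Best effort jobs run at low priority and may be killed whenever
# regular jobs need the resources. my-script.sh is a placeholder.
oarsub -t besteffort -l core=1 -S ./my-script.sh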
Interactive job using 1 core
To launch an interactive job using 1 core (default), just type:
> oarsub -I
[ADMISSION RULE] Modify resource description with type constraints
[ADMISSION RULE] Set default walltime to 10 minutes.
[ADMISSION RULE](15) stdout : /temp_dd/igrida-fs1/scampion/OAR_%jobid%.stdout
[ADMISSION RULE](15) stderr : /temp_dd/igrida-fs1/scampion/OAR_%jobid%.stderr
Generate a job key...
OAR_JOB_ID=17036
Interactive mode : waiting...
Starting...
Connect to OAR job 17036 via the node igrida05-01.irisa.fr
You are automatically brought to an interactive node, where you only have access to one core of that node. In the information echoed by oarsub, the default walltime is indicated (in minutes, as configured by default on the cluster).
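If the default walltime is too short for your interactive work, you can request a longer one explicitly, using the same -l syntax as in the batch scripts shown later in this tutorial. A sketch only: the maximum allowed walltime depends on the cluster's admission rules.

oarsub -I -l core=1,walltime=01:00:00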
To then visualize and monitor your jobs, please have a look at the following links:
- The Monika web page. This page provides an overview of the cluster load (compute nodes). Running jobs are highlighted, whereas unused resources are indicated as "Free". You can recognize your own job ID in one of the highlighted cluster cells. More information is printed by clicking on this cell.
- The DrawGantt web page. This page gives an overview of all submitted jobs, along with their estimated scheduling time (for jobs not yet running). The current time is indicated by a vertical red line. You should find your own one-core interactive job somewhere on this Gantt diagram, and some details are shown by placing the mouse cursor over any depicted job. Note that each line of this Gantt diagram represents one individual core. These lines are grouped by CPU, and CPUs are grouped by hostname.

(A command-line alternative using oarstat is sketched just after this list.)
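If you prefer the command line, the oarstat command (used again later in this section) gives a similar overview; the -u option should restrict the listing to a given user, which is worth double-checking with oarstat --help on the cluster:

oarstat            # all jobs currently submitted, waiting or running
oarstat -u $USER   # only your own jobs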
To terminate your interactive job, just type exit in your terminal window:
> exit
Connection to igrida05-01.irisa.fr closed.
Disconnected from OAR job 17036
Interactive job using n cores
You may proceed similarly to reserve n cores for an interactive job. If n is big enough, the batch system will reserve cores on several nodes for you. For instance, to reserve 6 cores:
oarsub -I -l /core=6
...
igrida02-09%
When the system attributes resources which are distributed on several nodes, you are interactively brought on one of these nodes. The detailed list of your 6 cores may be obtained by typing:
igrida02-09% cat $OAR_NODEFILE
igrida02-09.irisa.fr
igrida02-09.irisa.fr
igrida02-10.irisa.fr
igrida02-10.irisa.fr
igrida02-10.irisa.fr
igrida02-10.irisa.fr
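To count how many of your reserved cores sit on each host, you can post-process this file with standard shell tools, for instance:

sort $OAR_NODEFILE | uniq -c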
You can connect to any of the other nodes listed in this file with the oarsh command. Note that you do not need to give any password when doing so, since ssh keys are automatically installed by OAR for the lifetime of your job:
oarsh igrida02-10
...
exit   # to leave this ssh connection
Have a look at the Monika and DrawGantt pages (do not forget to refresh them) to visualize your new interactive job.
Warning
Please always remember to leave any OAR interactive job by typing:
exit
in your terminal window, to terminate your interactive session.
Interactive job on 2 compute nodes
Useful for MPI development & debugging, not for production!
Staying logged into the front-end node, you can now try to reserve 2 full compute nodes for an interactive job (typically for MPI debugging):
oarsub -I -l /nodes=2
...
cat $OAR_NODEFILE
Do not forget to leave this interactive job with the simple exit command:
exit
Batch job using one single core
Let us now launch a first batch job, on a single core.
As a good practice, we recommend that you create a SCRATCHDIR directory, located in /temp_dd/igrida-fs1/..., where the run will take place. In particular, your run outputs should never be written to your home directory, which would eventually lead to an NFS crash. For instance, type:
mkdir -p /temp_dd/igrida-fs1/$USER/SCRATCH/
and create the SCRATCHDIR variable in your environment files. For example, put the line:
setenv SCRATCHDIR /temp_dd/igrida-fs1/$USER/SCRATCH
in your $HOME/.cshrc_perso (in the IRISA network). For this new environment variable to be taken into account, just type:
source $HOME/.cshrc
An equivalent brute-force way to proceed is to leave your front-end session (with the exit command) and connect to igrida again. In any case, check that this new variable has been properly taken into account:
echo $SCRATCHDIR
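The lines above assume a csh-type shell, matching the .cshrc_perso setup mentioned above. If your login shell happens to be bash instead (an assumption to check on your side), the equivalent would be:

# bash equivalent, only if your login shell is bash:
echo 'export SCRATCHDIR=/temp_dd/igrida-fs1/$USER/SCRATCH' >> ~/.bashrc
source ~/.bashrc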
Then copy the following first-job-with-oar.sh example script somewhere in your home directory, and replace my_login with your own LDAP identifier:
#!/bin/sh
#OAR -l core=1,walltime=00:05:00
#OAR -O /temp_dd/igrida-fs1/my_login/SCRATCH/fake_job.%jobid%.output
#OAR -E /temp_dd/igrida-fs1/my_login/SCRATCH/fake_job.%jobid%.error

set -xv

echo
echo OAR_WORKDIR : $OAR_WORKDIR
echo
echo "cat \$OAR_NODE_FILE :"
cat $OAR_NODE_FILE
echo

echo "
# Where will your run take place ?
#  * It is NOT recommended to run in $HOME/... (especially to write),
#    but rather in /temp_dd/igrida-fs1/...
#    Writing directly somewhere in $HOME/... will necessarily cause NFS problems at some time.
#    Please respect this policy.
#  * The program to run may be somewhere in your $HOME/... however.
"

# Create a per-job scratch directory and run from there
TMPDIR=$SCRATCHDIR/$OAR_JOB_ID
mkdir -p $TMPDIR
cd $TMPDIR

#EXECUTABLE=$HOME/some/where/my_program.exe

echo "pwd :"
pwd

echo
echo "=============== RUN ==============="

# -- FAKE RUN EXECUTION
echo "Running ..."
sleep 60   # fake job, 1 minute

# -- FAKE RUN OUTPUTS
cat > my_program_summary.out <<EOF
For example, some short solver statistics are summarized here.
1.e-10
1.e-13
1.e-14
1.e-16
Converged
EOF

echo "Done"
echo "==================================="

# -- ECHO SOME SUMMARY OUTPUTS OF THE RUN IN THE ***.output FILE
echo
echo "cat my_program_summary.out"
echo "---------------------"
cat my_program_summary.out
echo "---------------------"
echo
echo OK
Make this script executable:
chmod u+x first-job-with-oar.sh
Then, you are ready to submit the job to OAR in batch mode:
oarsub -S first-job-with-oar.sh
[ADMISSION RULE] Modify resource description with type constraints
Generate a job key...
OAR_JOB_ID=1679
To check your job status,
oarstat
or with more verbose outputs:
oarstat -f
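To restrict this verbose output to one specific job (nnnn being the job ID returned by oarsub), the -j option should do it; a sketch, worth checking against oarstat --help on the cluster:

oarstat -f -j nnnn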
To watch your run progress, you may look at the job output file:
tail -n 100 -f $SCRATCHDIR/fake_job.nnnn.output
where you need to substitute nnnn with the OAR_JOB_ID echoed by your previous oarsub command.
Note that the "-f" option causes tail not to stop when the end of the file is reached, but to wait for additional data to be appended to the file. Therefore, you need to type Control-C in the terminal window to terminate this command.
In a similar manner, you can have a look at the job error file:
cat $SCRATCHDIR/fake_job.nnnn.error
Refresh your Monika and DrawGantt web pages to visualize your batch job scheduling.
In case you want to delete your job, just type:
oardel nnnn
with nnnn as above.
Interactive parallel run on 1 compute node, using MPI
Useful for development & debugging, not for production!
Let’s now launch a simple interactive parallel run on 1 node using MPI.
oarsub -I -p "infiniband='YES'" -l nodes=1
Copy the following hello-world-mpi.c example (taken from Wikipedia).
/*
 * "Hello World" MPI Test Program
 */
#include <mpi.h>
#include <stdio.h>
#include <string.h>

#define BUFSIZE 128
#define TAG 0

int main(int argc, char *argv[])
{
  char idstr[32];
  char buff[BUFSIZE];
  int numprocs;
  int myid;
  int i;
  MPI_Status stat;

  MPI_Init(&argc, &argv);                    /* all MPI programs start with MPI_Init; all 'N' processes exist thereafter */
  MPI_Comm_size(MPI_COMM_WORLD, &numprocs);  /* find out how big the SPMD world is */
  MPI_Comm_rank(MPI_COMM_WORLD, &myid);      /* and what this process' rank is */

  /* At this point, all programs are running equivalently; the rank distinguishes
   * the roles of the programs in the SPMD model, with rank 0 often used specially... */
  if (myid == 0)
  {
    printf("%d: We have %d processes\n", myid, numprocs);
    for (i = 1; i < numprocs; i++)
    {
      sprintf(buff, "Hello %d! ", i);
      MPI_Send(buff, BUFSIZE, MPI_CHAR, i, TAG, MPI_COMM_WORLD);
    }
    for (i = 1; i < numprocs; i++)
    {
      MPI_Recv(buff, BUFSIZE, MPI_CHAR, i, TAG, MPI_COMM_WORLD, &stat);
      printf("%d: %s\n", myid, buff);
    }
  }
  else
  {
    /* receive from rank 0: */
    MPI_Recv(buff, BUFSIZE, MPI_CHAR, 0, TAG, MPI_COMM_WORLD, &stat);
    sprintf(idstr, "Process %d ", myid);
    strncat(buff, idstr, BUFSIZE - 1);
    strncat(buff, "reporting for duty\n", BUFSIZE - 1);
    /* send to rank 0: */
    MPI_Send(buff, BUFSIZE, MPI_CHAR, 0, TAG, MPI_COMM_WORLD);
  }

  MPI_Finalize(); /* MPI programs end with MPI_Finalize; this is a weak synchronization point */
  return 0;
}
Load our MPI environment module:
module load openmpi
To compile and link your code with the MPI library, do NOT use the cc compiler directly, but the mpicc wrapper instead:
mpicc ./hello-world-mpi.c -o hello-world-mpi
To now run your code using MPI, proceed through the following steps.
Have a look at the nodes which OAR reserved for you:
cat $OAR_NODEFILE
You may attempt to run your code as usual by typing:
./hello-world-mpi
This works, but such a call does NOT actually use MPI: it just runs the executable as a single sequential process. To run with MPI, you must use the mpirun wrapper.
For example, in your current interactive session you have reserved one node (4 cores). You may run with only 2 processes; in that case only 2 cores are used, and the other 2 remain idle:
mpirun -np 2 ./hello-world-mpi
To fully use your 4 cores, it is therefore recommended, in this case, to type:
mpirun -np 4 ./hello-world-mpi
You may even ask for more processes than the number of available cores:
mpirun -np 10 ./hello-world-mpi
Note that in that case your 10 processes will run concurrently on the 4 cores of the node, which would give poor performance in a real situation.
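To avoid hard-coding the process count, you can derive it from the node file; a small sketch, assuming a Bourne-type shell (the syntax differs under csh/tcsh):

# number of reserved cores = number of lines in the node file
NP=`wc -l < $OAR_NODEFILE`
mpirun -np $NP ./hello-world-mpi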
Now leave your interactive job:
exit
Interactive parallel run on 2 compute nodes, using MPI (idem)
Let’s now launch the same MPI example on 2 nodes, interactively (useful for debugging, not for production). Note that the allowed wallclock for interactive sessions is quite short. To reserve 2 nodes for an interactive session, type:
oarsub -I -l nodes=2 -p "infiniband='YES'"
and then:
module load openmpi
mpirun -machinefile $OAR_NODEFILE ./hello-world-mpi
Do not worry about the warning messages. Your code has indeed run on 8 cores, spread over 2 nodes.
Batch parallel run on 2 compute nodes, using MPI
Appropriate for production jobs
Let’s now launch the previous job in batch mode instead of an interactive session (more appropriate for production jobs). Copy the following second-job-with-oar.sh script example somewhere in your home directory:
#!/bin/sh
#OAR -l nodes=2,walltime=00:03:00
#OAR -O /temp_dd/igrida-fs1/my_login/SCRATCH/test_mpi_2nodes.%jobid%.output
#OAR -E /temp_dd/igrida-fs1/my_login/SCRATCH/test_mpi_2nodes.%jobid%.error

set -xv

TMPDIR=$SCRATCHDIR/$OAR_JOB_ID
mkdir -p $TMPDIR
cd $TMPDIR

echo $OAR_NODEFILE :
cat $OAR_NODEFILE
echo

EXECUTABLE=$OAR_WORKDIR/hello-world-mpi   # or put absolute path

cd $TMPDIR

echo "============= MPI RUN ============="
mpirun --mca plm_rsh_agent "oarsh" -machinefile $OAR_NODEFILE $EXECUTABLE
echo "==================================="

echo OK
Replace my_login with your own LDAP identifier, and make the script executable by typing:
chmod u+x second-job-with-oar.sh
and then submit your job in batch mode (i.e. with the “-S” option of the oarsub command):
oarsub -S second-job-with-oar.sh
Check the output file as previously:
cat $SCRATCHDIR/test_mpi_2nodes.nnnn.output
where nnnn denotes the job ID echoed by the oarsub command.
Simple job array of multi-threaded runs, using OpenMP
To conclude this "First steps" section, let's play with job arrays. Two types of job arrays are available within OAR: simple and parametric job arrays.
Assume you want to run a simple job array of multi-threaded runs. At first, copy the following hello-world-omp.c source code:
#include <omp.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>   /* for sleep() */

int main(int argc, char *argv[])
{
  int th_id, nthreads;

  #pragma omp parallel private(th_id)
  {
    th_id = omp_get_thread_num();
    printf("Hello World from thread %d\n", th_id);
    sleep(60);
    #pragma omp barrier
    if (th_id == 0)
    {
      nthreads = omp_get_num_threads();
      printf("There are %d threads\n", nthreads);
    }
  }
  return EXIT_SUCCESS;
}
and compile it:
cc hello-world-omp.c -fopenmp -o hello-world-omp
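Before submitting the whole array, you may want to check the binary once by hand inside an interactive job (not on the front-end). A sketch, using the csh/tcsh syntax matching the .cshrc_perso setup above (bash users would use export OMP_NUM_THREADS=2 instead):

oarsub -I -l core=2
setenv OMP_NUM_THREADS 2
./hello-world-omp
exit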
Copy the associated array-job-with-oar.sh job script, substitute my_login and make the shell script executable (just as done before):
#!/bin/sh
#OAR -l core=2,walltime=00:05:00
#OAR --array 10
#OAR -O /temp_dd/igrida-fs1/my_login/SCRATCH/array_job.%jobid%.output
#OAR -E /temp_dd/igrida-fs1/my_login/SCRATCH/array_job.%jobid%.error

set -xv

echo
echo OAR_WORKDIR : $OAR_WORKDIR
echo
echo OAR_JOB_ID : $OAR_JOB_ID
echo
echo "cat \$OAR_NODE_FILE :"
cat $OAR_NODE_FILE
echo

TMPDIR=$SCRATCHDIR/$OAR_JOB_ID
mkdir -p $TMPDIR
cd $TMPDIR

EXECUTABLE=$OAR_WORKDIR/hello-world-omp   # you may put the absolute path here

echo "pwd :"
pwd

echo
echo "=============== RUN ==============="
echo "Running ..."
export OMP_NUM_THREADS=2
$EXECUTABLE
echo "Done"
echo "==================================="
echo

cat > my_program_summary.out <<EOF
Job array, current job with OAR_JOB_ID $OAR_JOB_ID
Each job of this array ran on 2 cores (using OpenMP)

For example, some short solver statistics are summarized here.
1.e-10
1.e-13
1.e-14
1.e-16
Converged
EOF

# -- ECHO SOME SUMMARY OUTPUTS OF THE RUN IN THE ***.output FILE
echo
echo "cat my_program_summary.out"
echo "---------------------"
cat my_program_summary.out
echo "---------------------"
echo
echo OK
Note the following lines in this script:

- the OAR number of cores and the OAR array directive,
- the definition of the EXECUTABLE variable, pointing to the multi-threaded program,
- the OMP_NUM_THREADS variable (which must be consistent with the number of cores you asked for),
- the OAR_JOB_ID echoed in the output file (corresponding to the current task of the job array):

#OAR -l core=2,walltime=00:05:00
#OAR --array 10
...
EXECUTABLE=$OAR_WORKDIR/hello-world-omp
...
export OMP_NUM_THREADS=2
...
cat > my_program_summary.out <<EOF
Job array, current job with OAR_JOB_ID $OAR_JOB_ID
Each job of this array ran on 2 cores (using OpenMP)
...
We are now ready to launch an array of 10 jobs, each of them being multi-threaded, and using 2 cores. To proceed, submit your job as before:
igrida02-02% oarsub -S ./array-job-with-oar.sh
[ADMISSION RULE] Modify resource description with type constraints
Generate a job key...
Generate a job key...
Generate a job key...
Generate a job key...
Generate a job key...
Generate a job key...
Generate a job key...
Generate a job key...
Generate a job key...
Generate a job key...
OAR_JOB_ID=1813
OAR_JOB_ID=1814
OAR_JOB_ID=1815
OAR_JOB_ID=1816
OAR_JOB_ID=1817
OAR_JOB_ID=1818
OAR_JOB_ID=1819
OAR_JOB_ID=1820
OAR_JOB_ID=1821
OAR_JOB_ID=1822
OAR_ARRAY_ID=1813
Notice that each task of the array has its own OAR_JOB_ID, exactly like the standard jobs we have seen before, but the array itself is identified by an additional OAR_ARRAY_ID variable (corresponding to the OAR_JOB_ID of the first task). To get specific information about your array, you can use the --array option of oarstat:
oarstat --array
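If you need to cancel the whole array, note that oardel accepts several job IDs at once (the IDs below are those of the example run above); depending on the OAR version, oardel may also accept the array ID directly, which is worth checking with oardel --help:

oardel 1813 1814 1815   # and so on for the remaining tasks of the array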
Refresh your Monika and DrawGantt web pages to monitor your job life cycle.
Parametric job array of sequential runs
As a last use case, we consider that you want to run a parametric job array of sequential runs. With parametric jobs, you create as many jobs as there are parameter lines in your file, and each parameter line is passed as arguments to your job executable (see this discussion).
To create a sample parameter file, you can type:
cat > $SCRATCHDIR/param-file.txt <<EOF
# this is a parameter file to be used within a parametric job array
# ==> a subjob with one single parameter
100
# ==> a subjob without parameters
""
# ==> a subjob with multiple parameters
12.34 44.67 3.14 -10.56
EOF
Now copy the following program-with-arguments.c source code:
#include <stdio.h>
#include <unistd.h>   /* for sleep() */

int main(int argc, char **argv)
{
  int i;
  printf("Program runs with the following arguments\n");
  printf("Nb. of arguments = %d\n", argc - 1);
  for (i = 1; i < argc; i++)
    printf("argv[%d] = \"%s\"\n", i, argv[i]);
  sleep(60);
  return 0;
}
and compile it:
cc ./program-with-arguments.c -o program-with-arguments
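You can give it a quick check directly from the command line; it simply echoes its arguments, then sleeps for a minute (the argument values here are arbitrary):

./program-with-arguments 12.34 44.67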
Now copy the following array-parametric-job-with-oar.sh script, replace my_login, and make the script executable:
#!/bin/sh
#OAR -l core=1,walltime=00:05:00
#OAR --array-param-file /temp_dd/igrida-fs1/my_login/SCRATCH/param-file.txt
#OAR -O /temp_dd/igrida-fs1/my_login/SCRATCH/param_array_job.%jobid%.output
#OAR -E /temp_dd/igrida-fs1/my_login/SCRATCH/param_array_job.%jobid%.error

set -xv

TMPDIR=$SCRATCHDIR/$OAR_JOB_ID
mkdir -p $TMPDIR
cd $TMPDIR

EXECUTABLE=$OAR_WORKDIR/program-with-arguments   # you may put the absolute path here

echo
echo "=============== RUN ==============="
echo "Running ..."
$EXECUTABLE $*
echo "Done"
echo "==================================="
You are ready to submit:
igrida02-02% oarsub -S ./array-parametric-job-with-oar.sh
[ADMISSION RULE] Modify resource description with type constraints
Generate a job key...
Generate a job key...
Generate a job key...
OAR_JOB_ID=1852
OAR_JOB_ID=1853
OAR_JOB_ID=1854
OAR_ARRAY_ID=1852
Refresh your web monitoring tools, and have a look at the job output and error files to see what happens.
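For instance, with the output and error file names defined in the script above, you should find one file per parameter line:

ls $SCRATCHDIR/param_array_job.*.output
cat $SCRATCHDIR/param_array_job.*.output
cat $SCRATCHDIR/param_array_job.*.error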
After these first steps, you should be ready to proceed further and run your own executable programs, either sequentially or in parallel, using several cluster nodes if necessary. Some reading suggestions are given below; they will help you gain deeper insight into the OAR functionalities and internal mechanisms. As a starting point, we recommend watching the "OAR presentation" video (40 minutes, in French) for an overview of the OAR tool.
Further reading
To become more familiar with the OAR batch system, please visit the official OAR website. We particularly recommend that you have a look at the following documents:
- The OAR user documentation on the OAR website
- The "First user steps" guide on the OAR website
- Documentation on the Grid'5000 website (you need a Grid'5000 account to access the documentation pages)
- An "OAR presentation" video (in French): OAR: Un gestionnaire de ressources pour grandes grappes de calcul
- The oar-users mailing list, to which you may subscribe, with general discussions and questions about OAR.
© Copyright 2013, SED Rennes, IRISA - INRIA.