Condo Cluster: Managing jobs using the PBS job scheduler


How to create, submit, monitor, and delete jobs using the PBS job scheduler

   
1) Log in to Condo Cluster using the ssh command

2) Make sure that you have the correct PATH by issuing  "printenv | grep PATH"
   on Condo.  Your PATH should include at least one occurrence of  /usr/local/bin.
   If it does not, send email to hpc-help@iastate.edu.
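
     The PATH check above can also be scripted. The following is a minimal
     sketch, not part of the Condo documentation; the function name
     path_has_dir is hypothetical and was chosen only for illustration.

```shell
# Return success if directory $1 is an entry in the colon-separated list $2.
# (path_has_dir is a hypothetical helper name, not a cluster command.)
path_has_dir() {
    case ":$2:" in
        *":$1:"*) return 0 ;;
        *)        return 1 ;;
    esac
}

if path_has_dir /usr/local/bin "$PATH"; then
    echo "PATH contains /usr/local/bin"
else
    echo "PATH is missing /usr/local/bin"
fi
```

     Wrapping both strings in colons makes the match exact, so an entry such
     as /usr/local/bin/extra is not mistaken for /usr/local/bin.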

3) Create a PBS script using the PBS scriptwriter and save it in a file named "myscript",
   or whatever name you like. Since the architecture of Condo Cluster is different from that of
   other machines, scripts written for those machines should not be used on Condo Cluster. When
   selecting the time needed and the number of nodes needed, refer to the current queue structure by issuing
     qstat -q
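
     For orientation, a PBS script has roughly the shape sketched below. This
     is only an illustration and assumes Torque-style resource syntax; the
     script produced by the PBS scriptwriter is the authoritative version and
     may differ. The node count, ppn value, walltime, and the program name
     ./myprogram are all hypothetical placeholders. The script is written
     through a here-document so the block can be pasted into a shell as is.

```shell
# Write an example PBS script to the file "myscript".
# All resource values and ./myprogram are placeholders, not Condo settings.
cat > myscript <<'EOF'
#!/bin/bash
#PBS -l nodes=1:ppn=16
#PBS -l walltime=01:00:00
#PBS -o BATCH_OUTPUT
#PBS -e BATCH_ERRORS

# Change to the directory from which the job was submitted.
cd "$PBS_O_WORKDIR"

# Run the program (hypothetical executable name).
./myprogram
EOF
```

     The values requested with -l should fit within the limits reported by
     "qstat -q".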

4) qsub myscript 
     This command submits the PBS script in the file myscript. You may submit several
     jobs in succession if they use different output files. Jobs will be scheduled
     for queues based on the resources requested.  Job queues limit the number of
     simultaneous jobs by a single user and a single group.
     

     PBS scripts must be "visible" to the qsub command, i.e. you must either issue "qsub myscript"
     from the directory where "myscript" is located or give qsub the full path to "myscript".

     Usually people keep PBS scripts in the same directory as the executable and the initial
     data. In this case the easiest approach is to submit the job from that directory and, within
     the script, to cd there by issuing "cd $PBS_O_WORKDIR".

     To use special nodes (such as fat and huge), select the appropriate node type in the script
     generator and submit the generated script to the appropriate queue, e.g. "qsub -q fat myscript".
     See the "Using Special Nodes" section for more information.
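
     When submitting from a shell script, it can be handy to keep the job
     number for later use with qstat or qdel. The sketch below assumes that
     qsub prints an identifier of the form number.servername, which is the
     usual behavior but should be checked on Condo; the helper name
     job_number is hypothetical.

```shell
# Strip the server suffix from a job identifier such as "15244.servername".
# (job_number is a hypothetical helper name.)
job_number() {
    printf '%s\n' "${1%%.*}"
}

# Submit only where qsub exists and the script is present in this directory.
if command -v qsub >/dev/null 2>&1 && [ -f myscript ]; then
    jobid=$(qsub myscript)
    echo "Submitted job $(job_number "$jobid")"
fi
```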

5) qstat -q
     Gives the status of all the queues and the current queue structure.

6) qstat -a
     This shows all jobs in the system that are running or waiting in a queue to run.
     If you don't see your job, it has completed execution.  The script generated by
     the PBS scriptwriter writes the job's output to the file BATCH_OUTPUT and its
     error messages to the file BATCH_ERRORS.  For more information, read the
     documentation produced by the PBS scriptwriter.
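
     Instead of rerunning qstat by hand, a small loop can wait until the job
     is no longer listed. This is a sketch under the assumption that
     "qstat job#" exits with a nonzero status once the job is unknown to the
     server; some sites keep completed jobs listed for a short time, so the
     loop may end a little after the job itself. The function name
     wait_until_gone is hypothetical.

```shell
# Run the command given after the interval every $1 seconds until it fails.
# (wait_until_gone is a hypothetical helper name.)
wait_until_gone() {
    interval=$1
    shift
    while "$@" >/dev/null 2>&1; do
        sleep "$interval"
    done
}

# Usage on the cluster, with 15244 as an example job number:
#   wait_until_gone 60 qstat 15244
#   cat BATCH_OUTPUT BATCH_ERRORS
```

     A polling interval of a minute or more keeps the load on the PBS server
     low.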

7) qstat -r
     This shows all jobs that are currently running.

8) qdel job#
     If your PBS job has job# 15244, then issuing "qdel 15244" will delete the job from the PBS job queue;
     if the job is running, its execution will be terminated.