Installing divvy
Divvy is automatically installed when you install looper. See if your install worked by calling divvy -h on the command line. If the divvy executable is not in your $PATH, append this to your .bashrc or .profile (or .bash_profile on macOS):
export PATH=~/.local/bin:$PATH
Initial configuration
On a fresh install, divvy comes pre-loaded with some built-in compute packages, which you can explore by typing divvy list. If you need to tweak these or create your own packages, you will need to configure divvy manually. Start by initializing a divvy config file:
export DIVCFG="divvy_config.yaml"
divvy init $DIVCFG
This init command will create a default config file, along with a folder of templates.
The divvy write and list commands need to know where this divvy config file is. You can pass it on the command line every time (using the -c parameter), but this gets old. An alternative is to set the $DIVCFG environment variable. Divvy will automatically use the config file named in this environment variable if it exists. Add this line to your .bashrc or .profile if you want it to persist for future command-line sessions. You can always specify -c to override the value in the $DIVCFG variable on an ad-hoc basis:
export DIVCFG=/path/to/divvy_config.yaml
The divvy configuration file
At the heart of divvy is the divvy configuration file, or DIVCFG for short. This is a yaml file that specifies a user's available compute packages. Each compute package represents a computing resource; for example, by default we have a package called local that populates templates to simply run jobs in the local console, and another package called slurm with a generic template to submit jobs to a SLURM cluster resource manager. Users can customize compute packages as much as needed.
Configuration file priority lookup
When divvy starts, it checks a few places for the DIVCFG file. First, the user may specify a DIVCFG file when invoking divvy, either from the command line or from within python. If no file is provided, divvy will next look for a file named in the $DIVCFG environment variable. If it cannot find one there, it will load a default configuration file with a few basic compute packages. We recommend setting the DIVCFG environment variable as the most convenient option.
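The priority order described above can be sketched in a few lines of Python. This is only an illustration of the documented lookup order, not divvy's own code: the function name resolve_divcfg is hypothetical, and returning None to mean "use the built-in default configuration" is an assumption made for the example.

```python
import os


def resolve_divcfg(explicit_path=None, environ=None):
    """Return the config path to use, or None to signal the built-in default.

    Hypothetical helper that mirrors the documented priority order.
    """
    environ = os.environ if environ is None else environ
    # 1. A path passed on the command line or from python wins.
    if explicit_path:
        return explicit_path
    # 2. Otherwise fall back to the DIVCFG environment variable.
    env_path = environ.get("DIVCFG")
    if env_path:
        return env_path
    # 3. Nothing found: the caller would load the default configuration.
    return None
```

Passing the environment as an argument keeps the sketch easy to try without changing your real shell settings.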
Customizing your configuration file
The easiest way to customize your computing configuration is to edit the default configuration file. To get a fresh copy of the default configuration, use divvy init custom_divvy_config.yaml. This creates a config file along with a folder containing all the default templates.
Here is an example divvy configuration file:
compute_packages:
  default:
    submission_template: templates/local_template.sub
    submission_command: sh
  local:
    submission_template: templates/local_template.sub
    submission_command: sh
  develop_package:
    submission_template: templates/slurm_template.sub
    submission_command: sbatch
    partition: develop
  big:
    submission_template: templates/slurm_template.sub
    submission_command: sbatch
    partition: bigmem
The sub-sections below compute_packages each define a compute package that can be activated. Divvy uses these compute packages to determine how to submit your jobs. If you don't specify a package to activate, divvy uses the package named default. You can make your default whatever you like. You can activate any other compute package on the fly by calling the activate_package function from python, or using the --package command-line option.
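As a rough sketch of the selection rule, the snippet below picks a package from a plain dictionary shaped like the example configuration, falling back to the one named default. It does not call divvy; the function select_package and the choice to raise KeyError for an unknown name are assumptions made for illustration.

```python
def select_package(compute_packages, name=None):
    """Return the settings of the requested package, or of "default"."""
    chosen = name if name is not None else "default"
    if chosen not in compute_packages:
        raise KeyError(f"No compute package named {chosen!r}")
    return compute_packages[chosen]


# A dictionary mirroring part of the example configuration file.
packages = {
    "default": {
        "submission_template": "templates/local_template.sub",
        "submission_command": "sh",
    },
    "big": {
        "submission_template": "templates/slurm_template.sub",
        "submission_command": "sbatch",
        "partition": "bigmem",
    },
}

print(select_package(packages)["submission_command"])
print(select_package(packages, "big")["partition"])
```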
You can make as many compute packages as you wish, and name them whatever you wish. You can also add whatever attributes you like to the compute package. There are only two required attributes: each compute package must specify the submission_command and submission_template attributes.
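If you maintain a config file with many packages, a small check like the following can catch a package that lacks one of the two required attributes. It is a sketch that works on an already-loaded dictionary; the helper name missing_attributes is hypothetical and is not part of divvy.

```python
REQUIRED_ATTRIBUTES = ("submission_command", "submission_template")


def missing_attributes(compute_packages):
    """Map each incomplete package name to the required attributes it lacks."""
    problems = {}
    for name, settings in compute_packages.items():
        missing = [key for key in REQUIRED_ATTRIBUTES if key not in settings]
        if missing:
            problems[name] = missing
    return problems
```

An empty result means every package defines both attributes; extra attributes such as partition are ignored by the check.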
The submission_command attribute
The submission_command attribute is the command your cluster resource manager uses to submit a job. For example, in our compute package named develop_package, we've set submission_command to sbatch. This tells divvy that the job should be submitted with: sbatch submission_script.txt.
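In other words, the submission command is placed in front of the generated script path. A minimal sketch of that composition, assuming a hypothetical helper build_submission_argv (this is not how divvy necessarily builds or runs the command):

```python
import shlex


def build_submission_argv(submission_command, script_path):
    """Return the argument list for submitting one script."""
    # shlex.split separates a command that includes its own options
    # into individual arguments before the script path is appended.
    return shlex.split(submission_command) + [script_path]


print(build_submission_argv("sbatch", "submission_script.txt"))
```

The resulting list could be handed to subprocess.run if you wanted to submit the script yourself.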
The submission_template attribute
Each compute package specifies a path to a template file (submission_template). The template file provides a skeleton that divvy will populate with job-specific attributes. These paths can be relative or absolute; relative paths are considered relative to the DIVCFG file. Let's explore what template files look like next.
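The rule for relative paths can be illustrated as follows. This sketch assumes POSIX-style paths and a hypothetical helper named resolve_template_path; it only demonstrates the stated rule that relative template paths are interpreted against the folder containing the DIVCFG file.

```python
import os


def resolve_template_path(config_path, template_path):
    """Resolve a submission_template value against the config file's folder."""
    # Absolute template paths are used unchanged.
    if os.path.isabs(template_path):
        return template_path
    # Relative paths are joined to the directory holding the config file.
    config_dir = os.path.dirname(os.path.abspath(config_path))
    return os.path.normpath(os.path.join(config_dir, template_path))


print(resolve_template_path("/home/user/divvy_config.yaml",
                            "templates/slurm_template.sub"))
```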
Template files
Each compute package must point to a template file with the submission_template attribute. These template files are typically stored in a folder alongside the divvy configuration file. Divvy takes a template file, populates it with job-specific information, and then runs the result as a script. Here's an example of a generic SLURM template file:
#!/bin/bash
#SBATCH --job-name='{JOBNAME}'
#SBATCH --output='{LOGFILE}'
#SBATCH --mem='{MEM}'
#SBATCH --cpus-per-task='{CORES}'
#SBATCH --time='{TIME}'
#SBATCH --partition='{PARTITION}'
#SBATCH -m block
#SBATCH --ntasks=1
echo 'Compute node:' `hostname`
echo 'Start time:' `date +'%Y-%m-%d %T'`
srun {CODE}
Template files use variables (e.g. {VARIABLE}), which are populated independently for each job. Many users will not need to tweak the template files. If you do, start from the default templates (in the submit_templates folder) and follow them as examples; custom templates let divvy work with any compute infrastructure in any environment. Then point to your custom template in the submission_template attribute of a compute package in your DIVCFG config file.
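To make the idea of populating a template concrete, here is a small stand-alone sketch of placeholder substitution. It is not divvy's implementation: the function populate_template is hypothetical, the job values are invented for the example, and leaving unknown placeholders untouched is an assumption rather than documented divvy behavior.

```python
import re

# Matches placeholders such as {JOBNAME} or {MEM}.
PLACEHOLDER = re.compile(r"\{([A-Z_]+)\}")


def populate_template(template_text, variables):
    """Replace each {NAME} placeholder with variables["NAME"]."""
    def substitute(match):
        key = match.group(1)
        # Unknown placeholders are left as-is so gaps stay visible.
        return str(variables.get(key, match.group(0)))
    return PLACEHOLDER.sub(substitute, template_text)


template = "#SBATCH --job-name='{JOBNAME}'\n#SBATCH --mem='{MEM}'\nsrun {CODE}\n"
script = populate_template(
    template, {"JOBNAME": "align", "MEM": "8G", "CODE": "echo hello"}
)
print(script)
```

Each job supplies its own dictionary of values, so one template can produce many different submission scripts.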
Resources
You may notice that the compute config file does not specify resources to request (like memory, CPUs, or time). Yet these are required in order to submit a job to a cluster. Resources are not handled by the DIVCFG file because they are not specific to a particular computing environment; instead, they vary by pipeline and sample. As such, these items should be provided elsewhere.