SlurmConnector
The Slurm connector allows offloading execution to High-Performance Computing (HPC) facilities orchestrated by the Slurm queue manager. It extends the QueueManagerConnector, which inherits from the ConnectorWrapper interface, allowing users to offload jobs to local or remote Slurm controllers using the stacked locations mechanism. The HPC facility is assumed to be always active, so the deployment phase reduces to deploying the inner connector (e.g., creating an SSHConnection pointing to an HPC login node).
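For example, with the stacked locations mechanism a Slurm deployment can wrap a standalone SSH deployment pointing to the HPC login node. A minimal sketch (the hostname, username, and key path below are illustrative):

deployments:
  ssh-login:
    type: ssh
    config:
      nodes:
        - login.hpc.example.com   # illustrative login-node hostname
      username: user
      sshKey: /path/to/ssh/key
  slurm-example:
    type: slurm
    config:
      services:
        example:
          partition: queue_name
    wraps: ssh-login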
Warning
Note that in StreamFlow v0.1, the QueueManagerConnector directly inherited from the SSHConnector at the implementation level. Consequently, all the properties needed to open an SSH connection to the HPC login node (e.g., hostname, username, and sshKey) were defined directly in the QueueManagerConnector. This path is still supported by StreamFlow v0.2, but it is deprecated and will be removed in StreamFlow v0.3.
Interaction with the Slurm scheduler happens through a Bash script with #SBATCH directives. Users can pass the path of a custom script to the connector using the file attribute of the SlurmService configuration. This file is interpreted as a Jinja2 template and populated at runtime by the connector. Alternatively, users can pass Slurm options directly from YAML using the other options of a SlurmService object.
As an example, suppose you have a Slurm template script called sbatch.sh, with the following content:
#!/bin/bash
#SBATCH --nodes=1
#SBATCH --partition=queue_name
#SBATCH --mem=1gb
{{streamflow_command}}
A Slurm deployment configuration which uses the sbatch.sh file to spawn jobs can be written as follows:
deployments:
  slurm-example:
    type: slurm
    config:
      services:
        example:
          file: sbatch.sh
Alternatively, the same behaviour can be recreated by directly passing options through the YAML configuration, as follows:
deployments:
  slurm-example:
    type: slurm
    config:
      services:
        example:
          nodes: 1
          partition: queue_name
          mem: 1gb
Being passed directly to the sbatch command line, the YAML options have higher priority than the file-based ones.
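For instance, if a service references the sbatch.sh template above and also sets partition in YAML, the YAML value is the one used, since it is appended to the sbatch command line (the debug partition name below is illustrative):

deployments:
  slurm-example:
    type: slurm
    config:
      services:
        example:
          file: sbatch.sh   # template contains '#SBATCH --partition=queue_name'
          partition: debug  # passed on the sbatch command line, so it takes precedence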
Warning
Note that the file property at the upper configuration level, i.e., outside a service definition, is still supported in StreamFlow v0.2, but it is deprecated and will be removed in StreamFlow v0.3.
The unit of binding is the entire HPC facility. In contrast, the scheduling unit is a single job placement in the Slurm queue. Users can limit the maximum number of concurrently placed jobs by setting the maxConcurrentJobs parameter.
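As a sketch, the following configuration caps the connector at ten simultaneous Slurm submissions and binds a workflow step to the example service (the workflow, step, and file names are illustrative):

deployments:
  slurm-example:
    type: slurm
    config:
      maxConcurrentJobs: 10   # at most ten jobs placed in the queue at once
      services:
        example:
          file: sbatch.sh
workflows:
  example-workflow:
    type: cwl
    config:
      file: main.cwl
      settings: config.yml
    bindings:
      - step: /compile        # illustrative step name
        target:
          deployment: slurm-example
          service: example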
| Property | Type | Default | Description |
|----------|------|---------|-------------|
| checkHostKey | boolean | True | (Deprecated. Use the wraps directive to wrap a standalone SSH connector.) Perform a strict validation of the host SSH keys (and raise an exception if the key is not recognized as valid) |
| dataTransferConnection | string | | (Deprecated. Use the wraps directive to wrap a standalone SSH connector.) Sometimes HPC clusters provide dedicated hostnames for large data transfers, which guarantee a higher efficiency for data movements |
| file | string | | (Deprecated. Use services.) Path to a file containing a Jinja2 template, describing how the StreamFlow command should be executed in the remote environment |
| hostname | string | | (Deprecated. Use the wraps directive to wrap a standalone SSH connector.) Hostname of the HPC facility |
| maxConcurrentJobs | integer | 1 | Maximum number of jobs concurrently scheduled for execution on the Queue Manager |
| maxConcurrentSessions | integer | 10 | (Deprecated. Use the wraps directive to wrap a standalone SSH connector.) Maximum number of concurrent sessions to open for a single SSH client connection |
| maxConnections | integer | 1 | (Deprecated. Use the wraps directive to wrap a standalone SSH connector.) Maximum number of concurrent connections to open for a single SSH node |
| passwordFile | string | | (Deprecated. Use the wraps directive to wrap a standalone SSH connector.) Path to a file containing the password to use for authentication |
| pollingInterval | integer | 5 | Time interval (in seconds) between consecutive termination checks |
| services | object (property names matching `^[a-z][a-zA-Z0-9._-]*$` map to SlurmService objects) | | Map containing named configurations of Slurm submissions. Parameters can be specified either as #SBATCH directives in a file or directly in YAML format |
| sshKey | string | | (Deprecated. Use the wraps directive to wrap a standalone SSH connector.) Path to the SSH key needed to connect with the Slurm environment |
| sshKeyPassphraseFile | string | | (Deprecated. Use the wraps directive to wrap a standalone SSH connector.) Path to a file containing the passphrase protecting the SSH key |
| transferBufferSize | integer | 64kiB | (Deprecated. Use the wraps directive to wrap a standalone SSH connector.) Buffer size allocated for local and remote data transfers |
| tunnel | object | | (Deprecated. Use the wraps directive to wrap a standalone SSH connector.) External SSH connection parameters for tunneling |
| username | string | | (Deprecated. Use the wraps directive to wrap a standalone SSH connector.) Username needed to connect with the SSH environment |
SlurmService
This complex type represents a submission to the Slurm queue manager.
| Property | Type | Description |
|----------|------|-------------|
| account | string | Charge resources used by this job to the specified account |
| acctgFreq | string | Define the job accounting and profiling sampling intervals in seconds |
| array | string | Submit a job array, multiple jobs to be executed with identical parameters |
| batch | string | Nodes can have features assigned to them by the Slurm administrator. Users can specify which of these features are required by their batch script using this option. The batch argument must be a subset of the job's constraint argument |
| bb | string | Burst buffer specification. The form of the specification is system dependent |
| bbf | string | Path of a file containing a burst buffer specification. The form of the specification is system dependent |
| begin | string | Submit the batch script to the Slurm controller immediately, like normal, but tell the controller to defer the allocation of the job until the specified time |
| clusterConstraint | string | Specifies features that a federated cluster must have to have a sibling job submitted to it. Slurm will attempt to submit a sibling job to a cluster if it has at least one of the specified features. If the ! option is included, Slurm will attempt to submit a sibling job to a cluster that has none of the specified features |
| clusters | string | Clusters to issue commands to. Multiple cluster names may be comma separated. The job will be submitted to the one cluster providing the earliest expected job initiation time. The default value is the current cluster |
| constraint | string | Nodes can have features assigned to them by the Slurm administrator. Users can specify which of these features are required by their job using the constraint option |
| container | string | Absolute path to OCI container bundle |
| containerId | string | Unique name for OCI container |
| contiguous | boolean | If set, then the allocated nodes must form a contiguous set |
| coreSpec | integer | Count of specialized cores per node reserved by the job for system operations and not used by the application. The application will not use these cores, but will be charged for their allocation |
| coresPerSocket | integer | Restrict node selection to nodes with at least the specified number of cores per socket |
| cpuFreq | string | Request that job steps initiated by srun commands inside this sbatch script be run at some requested frequency if possible, on the CPUs selected for the step on the compute node(s) |
| cpusPerGpu | integer | Advise Slurm that ensuing job steps will require ncpus processors per allocated GPU. Not compatible with the cpusPerTask option |
| cpusPerTask | integer | Advise the Slurm controller that ensuing job steps will require ncpus processors per task. Without this option, the controller will just try to allocate one processor per task |
| deadline | string | Remove the job if no ending is possible before this deadline. Default is no deadline |
| delayBoot | integer | Do not reboot nodes in order to satisfy this job's feature specification if the job has been eligible to run for less than this time period. If the job has waited for less than the specified period, it will use only nodes which already have the specified features. The argument is in units of minutes |
| distribution | string | Specify alternate distribution methods for remote processes. For job allocation, this sets environment variables that will be used by subsequent srun requests and also affects which cores will be selected for job allocation |
| exclude | string | Explicitly exclude certain nodes from the resources granted to the job |
| exclusive | boolean or string | The job allocation can not share nodes with other running jobs (or just other users with the user option or with the mcs option). If user/mcs are not specified (i.e. the job allocation can not share nodes with other running jobs), the job is allocated all CPUs and GRES on all nodes in the allocation, but is only allocated as much memory as it requested |
| export | string | Identify which environment variables from the submission environment are propagated to the launched application |
| exportFile | integer or string | If a number between 3 and OPEN_MAX is specified as the argument to this option, a readable file descriptor will be assumed (STDIN and STDOUT are not supported as valid arguments). Otherwise a filename is assumed. Export environment variables defined in filename or read from fd to the job's execution environment |
| extraNodeInfo | string | Restrict node selection to nodes with at least the specified number of sockets, cores per socket and/or threads per core |
| file | string | Path to a file containing a Jinja2 template, describing how the StreamFlow command should be executed in the remote environment |
| getUserEnv | boolean or string | This option will tell sbatch to retrieve the login environment variables for the user specified in the uid option. Be aware that any environment variables already set in sbatch's environment will take precedence over any environment variables in the user's login environment. The optional timeout value is in seconds (default: 8) |
| gid | integer or string | Submit the job with the specified group's access permissions. The gid option may be the group name or the numerical group ID |
| gpuBind | string | Bind tasks to specific GPUs. By default every spawned task can access every GPU allocated to the step |
| gpuFreq | string | Request that GPUs allocated to the job are configured with specific frequency values. This option can be used to independently configure the GPU and its memory frequencies |
| gpus | string | Specify the total number of GPUs required for the job. An optional GPU type specification can be supplied (e.g., volta:3) |
| gpusPerNode | string | Specify the number of GPUs required for the job on each node included in the job's resource allocation. An optional GPU type specification can be supplied (e.g., volta:3) |
| gpusPerSocket | string | Specify the number of GPUs required for the job on each socket included in the job's resource allocation. An optional GPU type specification can be supplied (e.g., volta:3) |
| gpusPerTask | string | Specify the number of GPUs required for the job on each task to be spawned in the job's resource allocation. An optional GPU type specification can be supplied (e.g., volta:3) |
| gres | string | Specifies a comma-delimited list of generic consumable resources |
| gresFlags | string | Specify generic resource task binding options |
| hint | string | Bind tasks according to application hints. This option cannot be used in conjunction with ntasksPerCore, threadsPerCore, or extraNodeInfo |
| ignorePBS | boolean | Ignore all #PBS and #BSUB options specified in the batch script |
| jobName | string | Specify a name for the job allocation. The specified name will appear along with the job id number when querying running jobs on the system |
| licenses | string | Specification of licenses (or other resources available on all nodes of the cluster) which must be allocated to this job |
| mailType | string | Notify user by email when certain event types occur |
| mailUser | string | User to receive email notification of state changes as defined by mailType. The default value is the submitting user |
| mcsLabel | string | Used only when the mcs/group plugin is enabled. This parameter is a group among the groups of the user |
| mem | string | Specify the real memory required per node. Default units are megabytes |
| memBind | string | Bind tasks to memory. Used only when the task/affinity plugin is enabled and the NUMA memory functions are available |
| memPerCpu | string | Minimum memory required per usable allocated CPU. Default units are megabytes |
| memPerGpu | string | Minimum memory required per allocated GPU. Default units are megabytes |
| mincpus | integer | Specify a minimum number of logical cpus/processors per node |
| network | string | Specify information pertaining to the switch or network |
| nice | integer | Run the job with an adjusted scheduling priority within Slurm. With no adjustment value the scheduling priority is decreased by 100. A negative nice value increases the priority, otherwise it is decreased |
| noKill | boolean | Do not automatically terminate a job if one of the nodes it has been allocated fails. The user will assume the responsibilities for fault tolerance should a node fail. The job allocation will not be revoked, so the user may launch new job steps on the remaining nodes in their allocation |
| noRequeue | boolean | Specifies that the batch job should never be requeued under any circumstances |
| nodefile | string | Much like nodelist, but the list is contained in a file with the given name |
| nodelist | string | Request a specific list of hosts. The job will contain all of these hosts and possibly additional hosts as needed to satisfy resource requirements |
| nodes | string | Request that a minimum of minnodes nodes be allocated to this job. A maximum node count may also be specified with maxnodes. If only one number is specified, this is used as both the minimum and maximum node count. Node count can also be specified as size_string, which identifies what node values should be used |
| ntasks | integer | This option advises the Slurm controller that job steps run within the allocation will launch a maximum of number tasks and to provide for sufficient resources |
| ntasksPerCore | integer | Request the maximum ntasks be invoked on each core |
| ntasksPerGpu | integer | Request that there are ntasks tasks invoked for every GPU |
| ntasksPerNode | integer | Request that ntasks be invoked on each node |
| ntasksPerSocket | integer | Request the maximum ntasks be invoked on each socket |
| openMode | string | Open the output and error files using append or truncate mode as specified |
| overcommit | boolean | Overcommit resources. When applied to a job allocation (not including jobs requesting exclusive access to the nodes) the resources are allocated as if only one task per node is requested |
| oversubscribe | boolean | The job allocation can over-subscribe resources with other running jobs. The resources to be over-subscribed can be nodes, sockets, cores, and/or hyperthreads depending upon configuration |
| partition | string | Request a specific partition for the resource allocation. If not specified, the default behavior is to allow the Slurm controller to select the default partition as designated by the system administrator |
| power | string | Comma-separated list of power management plugin options |
| prefer | string | Nodes can have features assigned to them by the Slurm administrator. Users can specify which of these features are desired but not required by their job using the prefer option. This option operates independently from constraint and will override whatever is set there if possible |
| priority | string | Request a specific job priority. May be subject to configuration specific constraints |
| profile | string | Enables detailed data collection by the acct_gather_profile plugin |
| propagate | string | Allows users to specify which of the modifiable (soft) resource limits to propagate to the compute nodes and apply to their jobs |
| qos | string | Request a quality of service for the job |
| reboot | boolean | Force the allocated nodes to reboot before starting the job. This is only supported with some system configurations and will otherwise be silently ignored |
| requeue | boolean | Specifies that the batch job should be eligible for requeuing. The job may be requeued explicitly by a system administrator, after node failure, or upon preemption by a higher priority job |
| reservation | string | Allocate resources for the job from the named reservation. If the job can use more than one reservation, specify their names in a comma-separated list and the one offering the earliest initiation will be used |
| signal | string | When a job is within sig_time seconds of its end time, send it the signal sig_num. Due to the resolution of event handling by Slurm, the signal may be sent up to 60 seconds earlier than specified |
| socketsPerNode | integer | Restrict node selection to nodes with at least the specified number of sockets |
| spreadJob | boolean | Spread the job allocation over as many nodes as possible and attempt to evenly distribute tasks across the allocated nodes |
| switches | string | When a tree topology is used, this defines the maximum count of leaf switches desired for the job allocation and, optionally, the maximum time to wait for that number of switches. If Slurm finds an allocation containing more switches than the count specified, the job remains pending until it either finds an allocation with the desired switch count or the time limit expires |
| threadSpec | integer | Count of specialized threads per node reserved by the job for system operations and not used by the application. The application will not use these threads, but will be charged for their allocation |
| threadsPerCore | integer | Restrict node selection to nodes with at least the specified number of threads per core. In task layout, use the specified maximum number of threads per core |
| time | string | Set a limit on the total run time of the job allocation. If a timeout value is defined directly in the workflow specification, it will override this value |
| timeMin | string | Set a minimum time limit on the job allocation. If specified, the job may have its time limit lowered to a value no lower than timeMin if doing so permits the job to begin execution earlier than otherwise possible |
| tmp | integer | Specify a minimum amount of temporary disk space per node. Default units are megabytes |
| tresPerTask | string | Specifies a comma-delimited list of trackable resources required for the job on each task to be spawned in the job's resource allocation |
| uid | integer or string | Attempt to submit and/or run a job as the specified user instead of the invoking user id. The user may be the user name or numerical user ID |
| useMinNodes | boolean | If a range of node counts is given, prefer the smaller count |
| waitAllNodes | boolean | Controls when the execution of the command begins. By default the job will begin execution as soon as the allocation is made |
| wckey | string | Specify wckey to be used with job |
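Putting several of these options together, a hypothetical GPU service could request nodes, tasks, memory, and a wall-time limit as follows (all names and values are illustrative):

deployments:
  slurm-example:
    type: slurm
    config:
      services:
        gpu-job:
          partition: gpu_queue     # illustrative partition name
          nodes: "1"
          ntasksPerNode: 4
          cpusPerTask: 8
          gpusPerNode: "volta:2"   # optional GPU type specification
          mem: 32gb
          time: "02:00:00"         # quoted so YAML does not parse it as a sexagesimal number
          mailType: END,FAIL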