Controlling slurm jobs

Settings for slurm job

Command                          Effect
#SBATCH --account=staff          user is part of this group
#SBATCH --job-name=lvl_job       name of my job
#SBATCH --gpus-per-node=1        1 GPU on worker node (default)
#SBATCH --output=user.log        output log file, if needed
#SBATCH --mem-per-cpu=3G         3 GB memory per CPU core
#SBATCH --ntasks=4               you want 4 processes
#SBATCH --cpus-per-task=2        2 cores per process
#SBATCH --nodes=1-3              use at least one node, up to 3
#SBATCH -- w
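
The settings above can be combined into one batch file. The following is a minimal sketch, not a site-approved template: the arithmetic at the end is only there to show what the request adds up to (4 tasks × 2 cores = 8 cores, 8 cores × 3 GB = 24 GB). `SLURM_NTASKS` and `SLURM_CPUS_PER_TASK` are set by Slurm inside a running job; the fallback values are an assumption so the script can also be tried outside the cluster.

```shell
#!/bin/bash
#SBATCH --account=staff
#SBATCH --job-name=lvl_job
#SBATCH --gpus-per-node=1
#SBATCH --output=user.log
#SBATCH --mem-per-cpu=3G
#SBATCH --ntasks=4
#SBATCH --cpus-per-task=2
#SBATCH --nodes=1-3

# Inside a job Slurm exports these variables; the defaults after ":-"
# are only used when the script is run by hand for a dry run.
ntasks="${SLURM_NTASKS:-4}"
cpus_per_task="${SLURM_CPUS_PER_TASK:-2}"
mem_per_cpu_gb=3   # must match --mem-per-cpu above

# Total resources implied by the request.
total_cpus=$(( ntasks * cpus_per_task ))
total_mem_gb=$(( total_cpus * mem_per_cpu_gb ))

echo "tasks=${ntasks} cpus=${total_cpus} mem=${total_mem_gb}G"
```

Because `--mem-per-cpu` is multiplied by every allocated core, raising `--cpus-per-task` also raises the total memory the job asks for.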

The default memory for a batch job is 4 GB if the user does not specify a memory setting in the batch file [DefMemPerNode=4096].

Partitions/queue config

The Slurm cluster has 3 partitions (queues): doTrain, allWork and beQuick.

doTrain is the default partition; it is only for staff and allows jobs to run without a time limit
All partitions have DefaultTime=04:00:00
on allWork the MaxTime is 7 days and 1 hour
on beQuick the MaxTime is 1 day and 12 hours
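
A job that needs more than the 4 hour default has to ask for it with `--time`, and the value must stay within the partition's MaxTime. The sketch below writes the two limits above in Slurm's `days-hours:minutes:seconds` notation and prints the matching `sbatch` command lines. The helper `slurm_time` and the script name `job.sh` are hypothetical and used only for illustration; the commands are echoed, not submitted.

```shell
#!/bin/bash
# Format a limit given as whole days and hours in D-HH:MM:SS notation.
slurm_time() {
    printf '%d-%02d:00:00\n' "$1" "$2"
}

allwork_max="$(slurm_time 7 1)"    # 7 days and 1 hour
bequick_max="$(slurm_time 1 12)"   # 1 day and 12 hours

# job.sh is a placeholder batch file; echo shows the command without running it.
echo sbatch --partition=allWork --time="${allwork_max}" job.sh
echo sbatch --partition=beQuick --time="${bequick_max}" job.sh
```

The same options can also be set inside the batch file as `#SBATCH --partition=...` and `#SBATCH --time=...` lines.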

Node config

NodeName=A … TmpDisk=1536

TmpDisk: total size of temporary disk storage in TmpFS, in megabytes
MinTmpDiskNode
The default value for TmpDisk is 0, so the amount of local scratch (TmpDisk) space must be defined in the node config.
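
On the job side, temporary disk space is requested with the `--tmp` option of `sbatch` (default unit: megabytes). The sketch below assumes a request of 1024 MB, a value chosen only for illustration, and compares it with the TmpDisk=1536 configured for the node above. As far as I can tell, the requested amount is what `scontrol show job` lists as MinTmpDiskNode, but treat that as an assumption.

```shell
#!/bin/bash
# Ask for at least 1024 MB of temporary disk space per node.
#SBATCH --tmp=1024

node_tmpdisk_mb=1536    # TmpDisk from the node config
requested_tmp_mb=1024   # illustrative value; keep in sync with --tmp above

# A node can only be selected if its TmpDisk covers the request.
if [ "${requested_tmp_mb}" -le "${node_tmpdisk_mb}" ]; then
    fits=yes
else
    fits=no
fi

echo "request of ${requested_tmp_mb} MB fits on the node: ${fits}"
```

With TmpDisk left at its default of 0, any job that sets `--tmp` would find no eligible node, which is why the value has to be defined per node.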

language_lab/slurm.txt · Last modified: 2024/10/14 14:24 by 127.0.0.1