Controlling Slurm jobs
Settings for a Slurm job
Command | Effect |
---|---|
#SBATCH --account=staff | user is part of this group |
#SBATCH --job-name=lvl_job | name of the job |
#SBATCH --gpus-per-node=1 | 1 GPU on worker node (default) |
#SBATCH --output=user.log | output log file, if needed |
#SBATCH --mem-per-cpu=3G | 3 GB memory per allocated CPU core |
#SBATCH --ntasks=4 | run 4 processes (tasks) |
#SBATCH --cpus-per-task=2 | 2 cores per process |
#SBATCH --nodes=1-3 | use at least 1 node, up to 3 |
#SBATCH -- | w |
By default a batch job gets 4 GB of memory per node if the user doesn't specify a memory setting in the batch file (DefMemPerNode=4096, in megabytes).
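The settings from the table can be combined into one batch file. The following is only a minimal sketch: the directive values are copied from the table, while the variable names and the printed summary are illustrative assumptions, not part of the cluster setup. The fallback values after `:-` let the script run outside of Slurm as well.

```shell
#!/bin/bash
#SBATCH --account=staff
#SBATCH --job-name=lvl_job
#SBATCH --output=user.log
#SBATCH --nodes=1-3
#SBATCH --ntasks=4
#SBATCH --cpus-per-task=2
#SBATCH --mem-per-cpu=3G
#SBATCH --gpus-per-node=1

# Inside a job Slurm exports these variables; the fallbacks mirror the
# directives above so the arithmetic also works when run by hand.
ntasks="${SLURM_NTASKS:-4}"
cpus_per_task="${SLURM_CPUS_PER_TASK:-2}"
mem_per_cpu_gb=3   # must match --mem-per-cpu above

# Memory is requested per core, so the total scales with tasks * cores.
total_cpus=$(( ntasks * cpus_per_task ))
total_mem_gb=$(( total_cpus * mem_per_cpu_gb ))
echo "job ${SLURM_JOB_NAME:-lvl_job}: ${total_cpus} cores, ${total_mem_gb} GB requested in total"
```

With these values the job asks for 4 x 2 = 8 cores and 8 x 3 = 24 GB, which is well above the 4 GB default that applies when no memory option is given.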
Partitions/queue config
The Slurm cluster has 3 partitions (queues): doTrain, allWork and beQuick.
doTrain is the default partition; it is only for staff and allows jobs to run with no upper time limit.
All partitions have DefaultTime=04:00:00, so a job that does not request a time limit gets 4 hours.
On allWork the MaxTime is 7 days and 1 hour.
On beQuick the MaxTime is 1 day and 12 hours.
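To run longer than the 4 hour default, a job has to request a time limit explicitly, up to the MaxTime of its partition. The sketch below writes the two MaxTime values in Slurm's days-hours:minutes:seconds notation. The helper `slurm_time` and the script name `job.sh` are hypothetical and only serve as illustration.

```shell
#!/bin/bash
# Format a limit given as days and hours in Slurm's D-HH:MM:SS notation.
slurm_time() {
    printf '%d-%02d:00:00\n' "$1" "$2"
}

allwork_max=$(slurm_time 7 1)    # MaxTime on allWork: 7 days and 1 hour
bequick_max=$(slurm_time 1 12)   # MaxTime on beQuick: 1 day and 12 hours
echo "allWork: --time=${allwork_max}"
echo "beQuick: --time=${bequick_max}"

# Hypothetical submission that asks for the full beQuick limit
# (job.sh is a placeholder script name):
#   sbatch --partition=beQuick --time="${bequick_max}" job.sh
```

The limits actually configured on the cluster can be checked with `scontrol show partition`, which lists DefaultTime and MaxTime for each partition.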
Node config
NodeName=A … TmpDisk=1536
TmpDisk : Total size of temporary disk storage in TmpFS in megabytes
MinTmpDiskNode
The default value for TmpDisk is 0, so the amount of local scratch space (TmpDisk) must be defined explicitly in the node config.
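On the job side, temporary disk space is requested with the `--tmp` option of sbatch, whose default unit is megabytes. The following sketch assumes a job that wants 1024 MB of scratch and compares that with the TmpDisk=1536 value from the node line above. The variable names are illustrative, and the comparison only mimics the check the scheduler is expected to make when it picks a node.

```shell
#!/bin/bash
#SBATCH --tmp=1024          # ask for at least 1024 MB of temporary disk per node

node_tmpdisk_mb=1536        # TmpDisk value from the node config
requested_mb=1024           # must match the --tmp directive above

# A node can only satisfy the request if its TmpDisk is large enough;
# with the default TmpDisk=0 any request above 0 MB would not fit.
if [ "$requested_mb" -le "$node_tmpdisk_mb" ]; then
    fits=yes
else
    fits=no
fi
echo "request of ${requested_mb} MB fits on the node: ${fits}"
```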
language_lab/slurm.txt · Last modified: 2024/10/14 14:24 by 127.0.0.1