I have a 12-core laptop (6 physical cores with hyperthreading, so 12 logical CPUs) running Slurm for local job scheduling. When I submit a job array sized to keep all 12 logical CPUs busy at once, Slurm consistently runs only 6 tasks concurrently, even though the node reports 12 logical CPUs and has sufficient memory.
I've tried multiple approaches:
Explicitly setting lower memory requirements:
#SBATCH --mem-per-cpu=100M
Using the --oversubscribe flag:
#SBATCH --oversubscribe
Explicitly setting CPU parameters:
#SBATCH --cpus-per-task=1
#SBATCH --threads-per-core=2
Checking that my partition allows oversubscription:
PartitionName=debug Nodes=thoma-Lenovo-Legion-5-15IMH05H Default=YES MaxTime=INFINITE State=UP OverSubscribe=YES
None of these approaches allowed more than 6 jobs to run concurrently.
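This is how I confirm that only 6 tasks are ever in the RUNNING state while the array is queued (a minimal check using standard squeue options; core_test is the job name set in the script further down). The count never goes above 6:
# Count how many of my array tasks are actually RUNNING right now
squeue -u "$USER" -t RUNNING -h -n core_test | wc -l
# Watch the queue once per second to see tasks start and finish
watch -n 1 'squeue -u "$USER" -t RUNNING'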
My system is a Lenovo Legion 5 laptop running Ubuntu, with the following Slurm configuration and node state:
SelectType = select/cons_tres
SelectTypeParameters = CR_CORE_MEMORY
MaxTasksPerNode = 512
DefMemPerNode = UNLIMITED
CPUAlloc=0 CPUEfctv=12 CPUTot=12
RealMemory=7846 AllocMem=0 FreeMem=123 Sockets=1 Boards=1
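For completeness, this is how I cross-check what slurmd detects on the hardware against what the controller has recorded (a minimal sketch; the node and partition names are the ones from the partition line above):
# Hardware layout as slurmd detects it (CPUs, sockets, cores per socket, threads per core)
slurmd -C
# What the controller currently believes about the node
scontrol show node thoma-Lenovo-Legion-5-15IMH05H
# Scheduler selection settings from the running configuration
scontrol show config | grep -i select
# Partition settings, including OverSubscribe
scontrol show partition debug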
Here's my Slurm batch script:
#!/bin/bash
#SBATCH --job-name=core_test
#SBATCH --array=1-12%12
#SBATCH --output=core_test_%A_%a.log
#SBATCH --cpus-per-task=1
#SBATCH --mem-per-cpu=500M   # I don't like this constraint
#SBATCH --time=00:05:00
#SBATCH --hint=nomultithread # This tells Slurm to avoid using hyperthreading
echo "Starting task $SLURM_ARRAY_TASK_ID at $(date)"
python test_parallel.py
echo "Task $SLURM_ARRAY_TASK_ID completed at $(date)"