
Partitions & SLURM

To view information about the available nodes and partitions, use the following command:

sinfo

For more detailed information about a specific partition:

scontrol show partition <partition-name>
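
For example, to inspect the any_cpu queue described below:

scontrol show partition any_cpu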

The login and control nodes of Elja host the compute clusters HPC-Elja, HTC-Mimir, and HPC-Stefnir. Partitions and groups are used to keep them separate.

HPC-Elja: Available Partitions / Compute Nodes

Count  Name           Cores/Node  Memory/Node (GB)  Features
28     48cpu_192mem   48 (2x24)   192 (188)         Intel Gold 6248R
55     64cpu_256mem   64 (2x32)   256 (252)         Intel Platinum 8358
4      128cpu_256mem  128 (2x64)  256 (252)         AMD EPYC 7713
3      gpu-1xA100     64 (2x32)   192 (188)         Nvidia A100 Tesla GPU
5      gpu-2xA100     64 (2x32)   192 (188)         Dual Nvidia A100 Tesla GPU
1      gpu-8xA100     128 (2x64)  1000 (996)        8 Nvidia A100 Tesla GPUs

The ratio of usable ("true") memory to cores works out to roughly 3.9 GB per core.
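
To run on one of the GPU nodes, select the matching partition in the job script. Below is a minimal sketch, assuming GPUs are requested with SLURM's generic resource (GRES) syntax; the exact resource name may differ on this system:

#!/bin/bash
#SBATCH --partition=gpu-1xA100   # one of the GPU partitions listed above
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=8
#SBATCH --gres=gpu:1             # assumed GRES syntax; check the site documentation

nvidia-smi                       # placeholder workload: report the allocated GPU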

HPC-Elja: Job Limits

Each partition has a maximum seven (7) day time limit. Additionally, the queues any_cpu and long are provided:

  • any_cpu: all CPU nodes, one (1) day time limit
  • long: ten 48cpu and ten 64cpu nodes, fourteen (14) day time limit
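
A minimal batch script targeting one of these queues might look like the following sketch (the program name is a placeholder):

#!/bin/bash
#SBATCH --partition=any_cpu      # all CPU nodes, one day limit
#SBATCH --time=0-12:00:00        # must fit within the queue's time limit
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=4

./my_program                     # placeholder for the actual workload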

HTC-Mimir: Available Partitions / Compute Nodes

Count  Name         Cores/Node  Memory/Node (GB)  Features
9      mimir        64 (2x32)   256 (252)
1      mimir-himem  64 (2x32)   2048 (2044)

The ratio of usable ("true") memory to cores works out to roughly 3.9 GB per core on the mimir partition, and roughly 31 GB per core on the mimir-himem partition.

HTC-Mimir: Job Limits

Both partitions have a maximum fourteen (14) day time limit.
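
Memory-heavy jobs can target the high-memory node explicitly. A sketch with illustrative values; the memory estimate follows from the per-core ratio above:

#!/bin/bash
#SBATCH --partition=mimir-himem
#SBATCH --time=7-00:00:00        # within the fourteen day limit
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=32       # roughly 32 x 31 GB of memory with per-core allocation

./my_memory_heavy_program        # placeholder for the actual workload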

HPC-Stefnir: Available Partitions / Compute Nodes

Count  Name      Cores/Node  Memory/Node (GB)  Features
44     scompute  64          256 (252)         Intel(R) Xeon(R) Platinum 8358

The ratio of usable ("true") memory to cores works out to roughly 3.9 GB per core.

HPC-Stefnir: Job Limits

Each partition has a maximum two (2) day time limit.

SLURM Configuration

SLURM is configured such that 3.94 GB of memory is allocated per core.
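
In practice, allocated memory scales with the number of cores requested: a job asking for 16 cores, for example, receives roughly 16 x 3.94 GB ≈ 63 GB. A job that needs more memory per core can request it explicitly; a sketch (whether such overrides are permitted depends on site policy):

#SBATCH --cpus-per-task=16       # implies roughly 16 x 3.94 GB ≈ 63 GB
#SBATCH --mem-per-cpu=8000       # explicit per-core request in MB; assumption: overrides allowed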

Available Memory

On each node, 2-4 GiB of RAM is reserved for the operating system image; hence the true (usable) value is shown in parentheses in the tables above.
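
To check the exact memory configured on a particular node:

scontrol show node <node-name>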