How to use VU DAS-5 system

DAS-5 or DAS (The Distributed ASCI Supercomputer 5 is VU’s local cluster installation. In this guide I will tell you how to use it for your project work.

Table of Contents

  1. Introduction
  2. User account
  3. Making reservations
  4. Software Modules
  5. Network Configuration
  6. Hardware and NUMA setup
  7. Acknowledgements

1. Introduction

Please read carefully https://www.cs.vu.nl/das5/accounts.shtml and https://www.cs.vu.nl/das5/usage.shtml.

Key things to remember and understand:

2. User account

If you do not have an account then please let me know and arrange for an account. Probably the older course-specific accounts that you have, could be expired.

2.1 Setup ssh

Add the system hostname and access detail into your ~/.ssh/config file. Replace VUNET_USERNAME and DAS5_USERNAME with the actual account name.

Host vu-ssh
	HostName ssh.data.vu.nl
	User VUNET_USERNAME

Host das5
	HostName fs0.das5.cs.vu.nl
	User DAS5_USERNAME
	ProxyJump vu-ssh

Then try to login into das ssh das5 and provide the right password credentials.

3. Making reservations

DAS-5 is a cluster of 68 machines with dual 8-core CPUs connected with InfiniBand FDR links.

3.1 Check the current status

In order to use the cluster you have to do a node(s) reservation. Here is a list of steps

3.2 Making reservations

To make your own reservation, for example, for 3 nodes preserve -np 3 -t 900 (you can see my reservation ready at the bottom of the queue, those are the three nodes I can use for my experiments)

3067606	xxxxxx         	04/01	17:58	04/02	07:58	R	1	node005
3067615	xxxxxx         	04/01	18:24	04/10	02:25	R	1	node070
3067624	xxxxxx         	04/01	19:00	04/01	19:30	R	1	node075
3067647	xxxxxx         	04/01	19:15	04/01	19:45	R	1	node072
3067652	atrivedi	04/01	19:22	04/01	19:38	R	3	node053 node054 node055

Now I can just ssh into those machines as ssh node053 and use it for the duration of my reservation.

During ‘office hours’ (8h to 20h) the default reservation time is 15 minutes (or 900 seconds). Outside those hours it can be as long as you want. After a timeout, you either will be kicked out, or should leave the machine.

3.3 Commands summary

For more details of these commands, you can read the man page (e.g., man preserve).

4. Software modules

There are different versions (or also known as modules) of software packages available on the DAS. You can see them by doing module avail. It will show you a long list of available software and their versions.

4.1 Checking the current software versions

You can see which modules are currently loaded for your session by

[atrivedi@fs0 atrivedi]$ module list -l
- Package -----------------------------+- Versions -+- Last mod. ------
Currently Loaded Modulefiles:
gcc/6.3.0                                            2017/10/02 21:23:21
slurm/17.02.2                                        2017/10/31 14:28:16
prun/default                                         2012/05/29 12:28:59
cmake/3.15.4                                         2019/10/28  7:41:41
[atrivedi@fs0 atrivedi]$ 

4.2 Changing software versions

Example of changing cmake and gcc versions

Cmake
[atrivedi@fs0 atrivedi]$ cmake --version 
cmake version 2.8.12.2
[atrivedi@fs0 atrivedi]$ module load cmake/  #press tab 
cmake/2.8.11    cmake/2.8.12.2  cmake/3.15.4
[atrivedi@fs0 atrivedi]$ module load cmake/3.15.4 
[atrivedi@fs0 atrivedi]$ cmake --version 
cmake version 3.15.4

CMake suite maintained and supported by Kitware (kitware.com/cmake).
[atrivedi@fs0 atrivedi]$ 
gcc
[atrivedi@fs0 atrivedi]$ gcc --version 
gcc (GCC) 6.3.0
Copyright (C) 2016 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

[atrivedi@fs0 atrivedi]$ module load gcc/
gcc/4.8.3  gcc/4.9.2  gcc/4.9.3  gcc/6.4.0  gcc/8.4.0  gcc/9.3.0
[atrivedi@fs0 atrivedi]$ module load gcc/9.3.0 
[atrivedi@fs0 atrivedi]$ gcc --version 
gcc (GCC) 9.3.0
Copyright (C) 2019 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

[atrivedi@fs0 atrivedi]$ 

4.3 Reverting back changes

How to revert back to the default configuration and remove a loaded module.

[atrivedi@fs0 atrivedi]$ module list
Currently Loaded Modulefiles:
  1) gcc/6.3.0       2) slurm/17.02.2   3) prun/default    4) cmake/3.15.4    5) gcc/9.3.0
[atrivedi@fs0 atrivedi]$ module rm cmake/3.15.4
[atrivedi@fs0 atrivedi]$ module list
Currently Loaded Modulefiles:
  1) gcc/6.3.0       2) slurm/17.02.2   3) prun/default    4) gcc/9.3.0
[atrivedi@fs0 atrivedi]$ cmake --version 
cmake version 2.8.12.2
[atrivedi@fs0 atrivedi]$ 

4.4 Making changes permanent

You can add all the modules you want to be default for you in your bashrc file at ~/.bashrc. For example, I have this as a default config in my bashrc file:

# User specific aliases and functions
module load gcc
module load slurm 
module load cmake/3.15.4
module load gcc/9.3.0
module load prun

All these modules are loaded in all machines when you log in there.

4.5 Commands summary

5. Network configuration

Please read https://www.cs.vu.nl/das5/network.shtml.

So then you get a node reservation of say node10, node56, and node24 to work with. You can assume that they have 1 Gbps IP of 10.141.0.10, 10.141.0.56, and 10.141.0.24. Their high-speed IP ranges are 10.149.0.10, 10.149.0.56, and 10.149.0.24.

Important: Generally, all your experiments should be running on the high-speed IP ranges. The 1Gbps network is used only for ssh access. (Unless you want the network to be 1 Gbps link). Please make sure you use the right IP.

To check the network traffic you can use something like watch -n 1 ifstat (or you can locally install any other tool you like).

6. Hardware and NUMA setup

The key command you should use is numactl (man numactl). It gives you control over which CPU cores and NUMA nodes the memory should be allocated for your program.

The current hardware topology on DAS looks like (two NUMA domains 0 and 1):

[atrivedi@node065 netperf]$ numactl -H 
available: 2 nodes (0-1)
node 0 cpus: 0 1 2 3 4 5 6 7 16 17 18 19 20 21 22 23
node 0 size: 32673 MB
node 0 free: 22964 MB
node 1 cpus: 8 9 10 11 12 13 14 15 24 25 26 27 28 29 30 31
node 1 size: 32768 MB
node 1 free: 29015 MB
node distances:
node   0   1 
  0:  10  21 
  1:  21  10 
[atrivedi@node065 netperf]$ 

On which NUMA node is your high-speed network card?

[atrivedi@fs0 atrivedi]$ cat /sys/class/net/ib0/device/numa_node
1
[atrivedi@fs0 atrivedi]$ 

Numa node 1.

Example program run Whatever program you want to run, you can prepend the numactl configuration before it. For example, to run a program on NUMA domain 1 with memory from domain 0, we can do something like numactl -N 1 -m 0 [your_program_cmd_line]. For example:

To run a program on a specific hardware core, numctl -C (or you can also pass a list of comma separated core numbers as shown in the numactl -H command`).

As an extreme example, I have netperf program on my machines for network benchmarking. I started the netperf server, pinned on NUMA domain 0 as numactl -N 0 -m 0 ./src/netserver -D.

Now I start a client code running on the same machine (loopback traffic), but on two different domains and then the default. You see see the difference below (almost double performance):

[atrivedi@node065 netperf]$ ./src/netperf -H localhost 
MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to localhost.localdomain () port 0 AF_INET
Recv   Send    Send
Socket Socket  Message  Elapsed
Size   Size    Size     Time     Throughput
bytes  bytes   bytes    secs.    10^6bits/sec

 87380  16384  16384    10.00    23217.04

[atrivedi@node065 netperf]$ numactl -N 1 -m 1 ./src/netperf -H localhost 
MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to localhost.localdomain () port 0 AF_INET
Recv   Send    Send
Socket Socket  Message  Elapsed
Size   Size    Size     Time     Throughput
bytes  bytes   bytes    secs.    10^6bits/sec

 87380  16384  16384    10.00    23502.12

[atrivedi@node065 netperf]$ numactl -N 0 -m 0 ./src/netperf -H localhost 
MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to localhost.localdomain () port 0 AF_INET
Recv   Send    Send
Socket Socket  Message  Elapsed
Size   Size    Size     Time     Throughput
bytes  bytes   bytes    secs.    10^6bits/sec

 87380  16384  16384    10.00    41351.91

How big of a problem is this? Depends what you are trying to do. When in doubt, always measure.

7. Acknowledgements

Jesse Donkervliet’s write up at https://github.com/atlarge-research/opencraft-tutorial and Matthijs Jansen.


Jekyll theme inspired by researcher