- This repository contains information on the Gromacs tests that, as part of the aurum3 tender, must be passed for the acceptance of the cluster.
## Disclaimer
@@ -13,18 +13,18 @@
- A job is a single parallel execution (i.e., using mpi/gpu) that runs on its own unique set of nodes.
## Provided files
- Intructions for the tests (Intructions_gmx2024_tests.docx)
- Instructions for the tests (Instructions_gmx2024_tests.docx)
- Gromacs input files to perform the tests (*.tpr files)
## General considerations:
- All test must successfully pass without exception.
- All tests must successfully pass without exception.
- All the tasks (i.e., group of jobs) within a test must be submitted to the job scheduler consecutively.
- When executing a test, no time gaps between the submission of tasks are acceptable beyond those produced by the job scheduler when functioning normally.
- All the jobs in each task must run simultaneously and use a unique set of nodes.
- Small differences in the starting execution times of the jobs within a task are acceptable when produced by the job scheduler when functioning normally.
- Each of the jobs within a task must exceed the minimal performance prerequisite established for the task.
- It is the responsibility of the provider to adjust the minimal instructions provided in the **sample script** for each test to run in the cluster in a way that fulfills the target performance.
- Job lanching scrips (run-slurm-gmx2024.sh and run-gpu-slurm-gmx2024.sh) called in the sample scripts have been validated for [Slurm](https://slurm.schedmd.com/) and use Gromacs installed using [Spack](https://spack.readthedocs.io/). It is the responsibility of the provider to adjust these scripts to the target cluster.
- Job launching scripts (run-slurm-gmx2024.sh and run-gpu-slurm-gmx2024.sh) called in the sample scripts have been validated for [Slurm](https://slurm.schedmd.com/) and use Gromacs installed using [Spack](https://spack.readthedocs.io/). It is the responsibility of the provider to adjust these scripts to the target cluster.
- If the tests are not performed by the client, their results must be provided for inspection to ensure their validity.
- The tests have to be performed using Gromacs 2024.3 or newer. The program is free software distributed under the LGPL v2.1 license and can be downloaded for free from the following [page](http://www.Gromacs.org/).
@@ -41,10 +41,10 @@
- $N = N_{cpu}+N_{mem}+N_{gpu}+N_{biggpu}$= Total number of nodes in the cluster
- $N^{test}_{task,job}$ = Number of nodes to be used in a job for a given task belonging to a test
- Number of jobs required to run simultaneously within a task
- $J_{cpu}^{max} = \text{floor interger part of } (N_{cpu}/N^{test}_{task,job})$ (Maximum number of simulatneous jobs in a task using computing nodes)
- $J_{mem}^{max} = \text{floor interger part of } (N_{mem}/N^{test}_{task,job})$ (Maximum number of simulatneous jobs in a task using big memory nodes
- $J_{gpu}^{max} = \text{floor interger part of } (N_{gpu}/N^{test}_{task,job})$ (Maximum number of simulatneous jobs in a task using gpu nodes)
- $J_{biggpu}^{max} = \text{floor interger part of } (N_{gpu}/N^{test}_{task,job})$ (Maximum number of simulatneous jobs in a task using gpu nodes)
- $J_{cpu}^{max} = \text{floor integer part of } (N_{cpu}/N^{test}_{task,job})$ (Maximum number of simultaneous jobs in a task using computing nodes)
- $J_{mem}^{max} = \text{floor integer part of } (N_{mem}/N^{test}_{task,job})$ (Maximum number of simultaneous jobs in a task using big memory nodes)
- $J_{gpu}^{max} = \text{floor integer part of } (N_{gpu}/N^{test}_{task,job})$ (Maximum number of simultaneous jobs in a task using gpu nodes)
- $J_{biggpu}^{max} = \text{floor integer part of } (N_{biggpu}/N^{test}_{task,job})$ (Maximum number of simultaneous jobs in a task using biggpu nodes)
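
The definitions above amount to an integer (floor) division per node type. A minimal Python sketch of that arithmetic follows. The node counts are hypothetical placeholders, not values from the tender, and the biggpu entry is assumed to be computed from $N_{biggpu}$.

```python
def max_simultaneous_jobs(nodes_of_type: int, nodes_per_job: int) -> int:
    """Floor of nodes_of_type / nodes_per_job, i.e. J^max for one node type."""
    if nodes_per_job <= 0:
        raise ValueError("nodes_per_job must be a positive integer")
    return nodes_of_type // nodes_per_job


# Hypothetical node counts (N_cpu, N_mem, N_gpu, N_biggpu); replace with the real cluster values.
nodes_by_type = {"cpu": 50, "mem": 4, "gpu": 10, "biggpu": 3}
nodes_per_job = 4  # N^{test}_{task,job} for the task being planned

j_max = {
    kind: max_simultaneous_jobs(count, nodes_per_job)
    for kind, count in nodes_by_type.items()
}
# Nodes left idle because the node count is not a multiple of nodes_per_job.
unallocated = {kind: count % nodes_per_job for kind, count in nodes_by_type.items()}

for kind in nodes_by_type:
    print(f"{kind}: J_max={j_max[kind]}, unallocated nodes={unallocated[kind]}")
```

With these placeholder values, a node type with fewer nodes than `nodes_per_job` gets $J^{max} = 0$, so that task cannot run on that node type.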
## Template submission in slurm of a gromacs job for 1 hour
-**Short description of the system**: Molecular dynamics simulation of membrane interacting with a periferal protein system in Gromacs.
-**Short description of the system**: Molecular dynamics simulation of membrane interacting with a peripheral protein system in Gromacs.
-**tpr file:** sys1_150k_gmx2024.tpr
-**Number atoms:** 149261
-**Short description of the test**: Full (36 cores max) ocupation of the computational nodes using 1 or more nodes per job. There might be few unallocaded nodes if $N_{cpu}$ is not a multiple of $N^{\text{scaling\_test\_cpu}}_{task,job}$.
-**Short description of the test**: Full (36 cores max) occupation of the computational nodes using 1 or more nodes per job. There might be a few unallocated nodes if $N_{cpu}$ is not a multiple of $N^{\text{scaling\_test\_cpu}}_{task,job}$.
-**Number of tasks:** 11 x 5
-**Storage for tasks:** This test must try all 5 different NFS shares in storage1, storage2, scratch1, scratch2, cryo2.
-**Execution time per task:** 1 hour
-**Number of computational nodes per job in each task:** $N^{\text{scaling\_test\_cpu}}_{\text{task,job}}= [1, 2, 4, 6, 8, 10, 12, 14, 16, 18 and 20] Nodes.
-**Number of computational nodes per job in each task:** $N^{\text{scaling\_test\_cpu}}_{\text{task,job}}$= [1, 2, 4, 6, 8, 10, 12, 14, 16, 18 and 20] Nodes.
-**Number of simultaneous jobs in each task:** $J_{cpu}^{max}$
-**Special test conditions:**
@@ -98,15 +98,15 @@ The results for the same number of nodes with the number of cores per node restr
## 2. Tests scaling in the mem nodes
-**Test name:** scaling_test_mem
-**Nodes involved**: big memory cpu computational nodes
-**Short description of the system**: Molecular dynamics simulation of membrane interacting with a periferal protein system in Gromacs.
-**Short description of the system**: Molecular dynamics simulation of membrane interacting with a peripheral protein system in Gromacs.
-**tpr file:** sys1_150k_gmx2024.tpr
-**Number atoms:** 149261
-**Short description of the test**: Full (36 cores max) ocupation of the computational nodes using 1 or more nodes per job. There might be few unallocaded nodes if $N_{mem}$ is not a multiple of $N^{\text{scaling\_test\_mem}}_{task,job}$.
-**Short description of the test**: Full (36 cores max) occupation of the computational nodes using 1 or more nodes per job. There might be a few unallocated nodes if $N_{mem}$ is not a multiple of $N^{\text{scaling\_test\_mem}}_{task,job}$.
-**Number of tasks:** 2 x 5
-**Storage for tasks:** This test must try all 5 different NFS shares in storage1, storage2, scratch1, scratch2, cryo2.
-**Execution time per task:** 1 hour
-**Number of computational nodes per job in each task:** $N^{\text{scaling\_test\_mem}}_{\text{task,job}}$= [1, 2] Nodes.
-**Number of simultaneous jobs in each task:** $J_{mem}^{max}$
-**Special test conditions:**
- All tasks in the test need to be run sequentially without interruption to ensure that some jobs eventually use nodes placed in different sections of the cluster.
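
As an illustration of how the "2 x 5" task count could be enumerated, the sketch below builds one task per combination of node count and storage share. It reads "2 x 5" as 2 node counts times 5 shares, which is an interpretation. The share-by-share ordering and the `nodes_in_partition` value are also assumptions, not requirements from this document.

```python
from itertools import product

# Storage shares and node counts as listed for scaling_test_mem.
STORAGE_SHARES = ["storage1", "storage2", "scratch1", "scratch2", "cryo2"]
MEM_NODES_PER_JOB = [1, 2]


def build_tasks(node_counts, shares, nodes_in_partition):
    """Return one task description per (storage share, nodes per job) pair."""
    tasks = []
    for share, nodes_per_job in product(shares, node_counts):
        tasks.append(
            {
                "storage": share,
                "nodes_per_job": nodes_per_job,
                # J_mem^max: floor of N_mem / nodes per job
                "simultaneous_jobs": nodes_in_partition // nodes_per_job,
            }
        )
    return tasks


# nodes_in_partition=4 is a hypothetical N_mem used only for illustration.
tasks = build_tasks(MEM_NODES_PER_JOB, STORAGE_SHARES, nodes_in_partition=4)
for index, task in enumerate(tasks, start=1):
    print(index, task)
```

The resulting list is meant to be submitted in order, one task after another, in line with the requirement that tasks run sequentially without interruption.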
@@ -126,14 +126,14 @@ The results for the same number of nodes with the number of cores per node restr
## 3. Tests scaling in the gpu nodes
-**Test name:** scaling_test_gpu
-**Nodes involved**: gpu computational nodes
-**Short description of the system**: Molecular dynamics simulation of membrane interacting with a periferal protein system in Gromacs.
-**Short description of the system**: Molecular dynamics simulation of membrane interacting with a peripheral protein system in Gromacs.
-**tpr file:** sys1_150k_gmx2024.tpr
-**Number atoms:** 149261
-**Short description of the test**: Full (36 cores max) ocupation of the computational nodes using 1 or more nodes per job. There might be few unallocaded nodes if $N_{gpu}$ is not a multiple of $N^{\text{scaling\_test\_gpu}}_{task,job}$.
-**Short description of the test**: Full (36 cores max) occupation of the computational nodes using 1 or more nodes per job. There might be a few unallocated nodes if $N_{gpu}$ is not a multiple of $N^{\text{scaling\_test\_gpu}}_{task,job}$.
-**Number of tasks:** 2 x 5
-**Storage for tasks:** This test must try all 5 different NFS shares in storage1, storage2, scratch1, scratch2, cryo2.
-**Execution time per task:** 1 hour
-**Number of computational nodes per job in each task:** $N^{\text{scaling\_test\_gpu}}_{\text{task,job}}= [1, 2, 4, 6, 8, 10, 12, 14, 16, 18 and 20] Nodes.
-**Number of computational nodes per job in each task:** $N^{\text{scaling\_test\_gpu}}_{\text{task,job}}$= [1, 2, 4, 6, 8, 10, 12, 14, 16, 18 and 20] Nodes.
-**Number of cores per node:** Only 36 cores can be used for each node in this test
-**Number of simultaneous jobs in each task:** $J_{gpu}^{max}$
@@ -157,10 +157,10 @@ The results for the same number of nodes with the number of cores per node restr
## 4. Tests scaling in the biggpu nodes (CPU only)
-**Test name:** scaling_test_biggpu
-**Nodes involved**: gpu computational nodes
-**Short description of the system**: Molecular dynamics simulation of membrane interacting with a periferal protein system in Gromacs.
-**Short description of the system**: Molecular dynamics simulation of membrane interacting with a peripheral protein system in Gromacs.
-**tpr file:** sys1_150k_gmx2024.tpr
-**Number atoms:** 149261
-**Short description of the test**: Full (36 cores max) ocupation of the computational nodes using 1 or more nodes per job. There might be few unallocaded nodes if $N_{biggpu}$ is not a multiple of $N^{\text{scaling\_test\_gpu}}_{task,job}$.
-**Short description of the test**: Full (36 cores max) occupation of the computational nodes using 1 or more nodes per job. There might be a few unallocated nodes if $N_{biggpu}$ is not a multiple of $N^{\text{scaling\_test\_biggpu}}_{task,job}$.
-**Number of tasks:** 5 x 5
-**Storage for tasks:** This test must try all 5 different NFS shares in storage1, storage2, scratch1, scratch2, cryo2.
-**Execution time per task:** 1 hour
@@ -185,9 +185,9 @@ The results for the same number of nodes with the number of cores per node restr
## 5. Test endurance cluster - not internode calculations
-**Test name:** endurance_test_all
-**Short description of the system**: Molecular dynamics simulation of membrane interacting with a periferal protein system in Gromacs.
-**Short description of the system**: Molecular dynamics simulation of membrane interacting with a peripheral protein system in Gromacs.
-**Number atoms:** 172742
-**Short description of the test:** Full occupation of all nodes (cpu, mem, gpu, and biggpu) using 1 job per node in cpu and mem nodes, and n jobs per nodes in the gpu and biggpunodes. It is a stress test where all the cluster should work under full load for at least 23h.
-**Short description of the test:** Full occupation of all nodes (cpu, mem, gpu, and biggpu) using 1 job per node in the cpu and mem nodes, and n jobs per node in the gpu and biggpu nodes. It is a stress test where the entire cluster should work under full load for at least 23h.
-**Number of tasks:** 1
-**Number of computational nodes per job in each task:** 1
-**Number of simultaneous jobs in each task:** $J_{cpu}^{max} + J_{mem}^{max} + J_{gpu}^{max} + J_{biggpu}^{max}$
@@ -195,9 +195,9 @@ The results for the same number of nodes with the number of cores per node restr
-**Execution time per task:** 23 hours
-**Total duration test:** 23 hours
-**Special test conditions:**
- All jobs must run simulatneously for the fixed period of 23h without interruptions.
- All jobs must run simultaneously for the fixed period of 23h without interruptions.
- All cores must be used; hyperthreading or similar technology must be activated when available.
- In gpu containing nodes there should be 1 job running for each gpu (n_jobs). The cores in the node should be devided equally devided between the n_jobs.
- In gpu-containing nodes there should be 1 job running for each gpu (n_jobs). The cores in the node should be divided equally between the n_jobs.
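
The equal split of cores between the per-gpu jobs can be sketched as below. The core and gpu counts are hypothetical examples, and whether "cores" means physical or logical (hyperthreaded) cores on the target nodes is left as an assumption for the provider to settle.

```python
def cores_per_gpu_job(cores_in_node: int, gpus_in_node: int) -> tuple[int, int]:
    """Split a node's cores equally among one job per gpu.

    Returns (cores assigned to each job, cores left over when the split is uneven).
    """
    if gpus_in_node <= 0:
        raise ValueError("gpus_in_node must be a positive integer")
    return divmod(cores_in_node, gpus_in_node)


# Hypothetical node layouts: (cores in node, gpus in node).
for cores, gpus in [(36, 4), (36, 8)]:
    per_job, leftover = cores_per_gpu_job(cores, gpus)
    print(f"{cores} cores, {gpus} gpus -> {per_job} cores per job, {leftover} left over")
```

A non-zero leftover means the node's cores cannot be divided equally; since the test requires all cores to be used, such a layout would need a decision on how to place the remaining cores.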