Simplified and improved (9aae8078) · Commits · tenders / aurum3

README.md

+54 −62

		@@ -5,7 +5,7 @@
		- This repository contains information for the gromacs tests as a part of the aurum3 tender that are required to be passed for the acceptance of the cluster

		## Disclaimer
		- The following test might require some tuning by the provider to make them work on the purchased cluster. Typos or incorrect syntax in the templates and scripts provided do not exempt the provider from passing the tests.
		- The following tests might require some tuning by the provider to make them work on the purchased cluster. Typos or incorrect syntax in the provided commands do not exempt the provider from passing the tests.

		## General definitions
		- A test can contain several tasks, and each task may contain several jobs.
		@@ -17,14 +17,13 @@
		- Gromacs input files to perform the tests (*.tpr files)

		## General considerations:
		- All tests must successfully pass without exception.
		- All the tasks (i.e., group of jobs) within a test must be submitted to the job scheduler consecutively.
		- When executing a test, no time gaps between the submission of tasks are acceptable beyond those produced by the job scheduler when functioning normally.
		- All the jobs in each task must run simultaneously and use a unique set of nodes.
		- Small differences in the starting execution times of the jobs within a task are acceptable when produced by the job scheduler when functioning normally.
		- Each of the jobs within a task must exceed the minimal performance prerequisite established for the task.
		- It is the responsibility of the provider to adjust the minimal instructions provided in the sample script for each test to run in the cluster in a way that fulfills the target performance.
		- Job launching scripts (run-slurm-gmx2024.sh and run-gpu-slurm-gmx2024.sh) called in the sample scripts have been validated for [Slurm](https://slurm.schedmd.com/) and use Gromacs installed using [Spack](https://spack.readthedocs.io/). It is the responsibility of the provider to adjust these scripts to the target cluster.
		- All jobs in all tests must successfully exceed the minimal performance prerequisite (ns/day) established for the task.
		- A test may contain subtests.
		- Subtests may contain several tasks.
		- All the jobs in a task must run simultaneously and use the maximum possible number of nodes within the set of nodes specified.
		- When executing a task, no time gaps between the submission of jobs are acceptable beyond those produced by the job scheduler when functioning normally.
		- If the performance results provided by the client are challenged by the supplier, it is the responsibility of the provider to adjust the minimal instructions provided for each test to run in the cluster in a way that fulfills the target performance.
		- A template to run the tests is provided using [Slurm](https://slurm.schedmd.com/) and using Gromacs 2024.3 installed using [Spack](https://spack.readthedocs.io/).
		- The results of the tests, if not performed by the client, must be provided for inspection to ensure their validity.
		- The tests have to be performed using Gromacs 2024.3 or newer. The program is free software distributed under the LGPL license and can be downloaded for free from the following [page](http://www.Gromacs.org/).

		@@ -48,7 +47,6 @@ spack load gromacs@2024.3
		NAME="sys1_150k_gmx2024"
		# Start gromacs Run
		srun gmx_mpi mdrun -deffnm ${NAME} -nsteps -1 -v -maxh 1

		```

		## General parameters of all tests (1 and 2)
		@@ -57,50 +55,43 @@ srun gmx_mpi mdrun -deffnm ${NAME} -nsteps -1 -v -maxh 1
		- Number atoms: 149261


		## 1. Test of scaling over network

		Short description of the test: Full (36 cores max) occupation of the computational nodes using 1 or more nodes per job.

		Tests 1.1 - 1.4 test performance scaling over multiple nodes over the low-latency network. The tests 1.1 - 1.4 are identical, but target different sets of nodes (1.1 = cpu nodes, 1.2 = bigmem nodes, 1.3 = gpu nodes, 1.4 = gpu-mem nodes). The tests 1.1 - 1.4 will be performed independently.
		## 1. Scaling over network test

		Each test consists of several tasks, defined by the "scaling factor", which is the number of computational nodes involved in each gromacs job.
		- Short description of the test: Performance testing over the low-latency network when using 1 or more nodes per job. This test contains 4 subtests, which differ only in the target nodes (1.1 = cpu nodes, 1.2 = hugemem nodes, 1.3 = gpu nodes, 1.4 = gpu-mem nodes). The tests 1.1 to 1.4 can be performed independently.

		The number of computational nodes per job in each task will be set to [1, 2, 4, 6, 8, 10, 12, 14, 16, 18 and 20] (limited to the delivered number of nodes in tests 1.2 and 1.4.). Multiple jobs per task will be launched simultaneously such that the partition (cpu / bigmem / gpu / big-gpu) is filled as much as possible.
		- Tasks: Each subtest consists of 6 tasks defined by the number of computational nodes per job in the following list: [1, 2, 4, 10, 16, and 20]. Each task must occupy as many of the target nodes as possible. Additionally, each task will be repeated using each of the 5 new storage locations provided via NFS in storage1, storage2, scratch1, scratch2, cryo2. Only the lowest performance value needs to be reported.

		- Additional notes
		- Only 36 cores can be used for each node in this test.
		- No GPUs are allowed (if present in the node)
		- All jobs in a given task need to be run sequentially to ensure concurrency and that jobs use all/most nodes placed in different sections of the cluster.
		- Execution time for each test is 1 hour per job in a task

		Minimal required performance:
		Each job (in every task in all tests 1.1-1.4) must report at least the following performance
		| Cores (Nodes) | 36 (1) | 72 (2) | 144 (4) | 216 (6) | 288 (8) | 360 (10) | 432 (12) | 504 (14) | 576 (16) | 648 (18) | 720 (20) |
		| ------ | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
		| ns/day | 10.4 | 20 | 38.4 | 56.8 | 72 | 83.2 | 104 | 121.6 | 133.6 | 140 | 150.4 |

		Note that this corresponds to 80% of the performance recorded in our older aurum cluster. The following results were obtained using 2 x Intel(R) Xeon(R) Gold 6140 CPU @ 2.30GHz per node.
		- Minimal required performance:
		Each job must report at least the following performance
		| Cores (Nodes) | 36 (1) | 72 (2) | 144 (4) | 288 (8) | 360 (10) | 576 (16) | 720 (20) |
		| ------ | --- | --- | --- | --- | --- | --- | --- |
		| ns/day | 10 | 20 | 38 | 72 | 83 | 133 | 150 |

		| Cores (Nodes) | 36 (1) | 72 (2) | 144 (4) | 216 (6) | 288 (8) | 360 (10) | 432 (12) | 504 (14) | 576 (16) | 648 (18) | 720 (20) |
		| ------ | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
		| ns/day | 13 | 25 | 48 | 71 | 90 | 104 | 130 | 152 | 167 | 175 | 188 |
		Note that this corresponds to ~80% of the performance recorded in our older aurum cluster, which has 2 x Intel(R) Xeon(R) Gold 6140 CPU @ 2.30GHz per node.

		| Cores (Nodes) | 36 (1) | 72 (2) | 144 (4) | 288 (8) | 360 (10) | 576 (16) | 720 (20) |
		| ------ | --- | --- | --- | --- | --- | --- | --- |
		| ns/day | 13 | 25 | 48 | 90 | 104 | 152 | 188 |
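The relation between the reference values and the minimal required performance can be sketched as follows. This is only an illustration and not part of the tender scripts: the helper name `min_required` is hypothetical, the input values are the reference ns/day figures listed for 1 to 20 nodes, and `awk` is assumed to be available.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the tender scripts):
# prints 80% of a reference performance value given in ns/day.
min_required() {
    awk -v ref="$1" 'BEGIN { printf "%.1f\n", ref * 0.8 }'
}

# Reference ns/day values for 1, 2, 4, 6, 8, 10, 12, 14, 16, 18 and 20 nodes.
for ref in 13 25 48 71 90 104 130 152 167 175 188; do
    printf 'reference %s ns/day -> required %s ns/day\n' "$ref" "$(min_required "$ref")"
done
```

The unrounded results (10.4, 20.0, 38.4, ...) match the 80% thresholds; the shorter table lists them rounded down to whole numbers.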

		Additional notes
		- Only 36 cores can be used for each node in this test.
		- No GPU is allowed (if present in the node)
		- All tasks in a given test need to be run sequentially without interruption to ensure that some jobs eventually use nodes placed in different sections of the cluster.
		- For each task, the target part of the cluster (cpu / bigmem / gpu / big-gpu) must be occupied as much as possible by launching additional copies of the job. There might be a few unallocated nodes if the number of computational nodes is not a multiple of the number of computational nodes per job in each task.
		- The tests must try all 5 different NFS shares in storage1, storage2, scratch1, scratch2, cryo2.
		- Execution time for each test is 1 hour per task
		- Total duration for each test is 1 hour per task x number of tasks (2-11 depending on the number of delivered nodes in each group) x 5 storages
		- All jobs need to exceed the minimal required performance. The minimal reported values (slowest jobs) will be recorded below.
		- Reporting:
		All jobs need to exceed the minimal required performance. The minimal reported values (slowest jobs) will be recorded as in the empty tables below.
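The bookkeeping implied by these rules (how many jobs fill a partition, how many nodes stay unallocated, and how long a subtest takes) can be sketched as below. The function names and the partition size of 42 nodes are hypothetical and chosen only for illustration; the delivered node counts are not stated in this document.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (names are illustrative, not part of the tender scripts).

# jobs_per_task NODES_IN_PARTITION NODES_PER_JOB
# Number of simultaneous jobs that fit in the partition.
jobs_per_task() { echo $(( $1 / $2 )); }

# idle_nodes NODES_IN_PARTITION NODES_PER_JOB
# Nodes left unallocated when the partition size is not a multiple of the job size.
idle_nodes() { echo $(( $1 % $2 )); }

# total_hours NUMBER_OF_TASKS NUMBER_OF_STORAGES
# Total duration at 1 hour per task, repeated for every storage location.
total_hours() { echo $(( $1 * $2 )); }

partition_nodes=42   # assumption: example partition size, not from the tender
for nodes_per_job in 1 2 4 10 16 20; do
    printf '%2d nodes/job: %2d jobs, %2d idle nodes\n' \
        "$nodes_per_job" \
        "$(jobs_per_task "$partition_nodes" "$nodes_per_job")" \
        "$(idle_nodes "$partition_nodes" "$nodes_per_job")"
done
printf 'total duration: %d hours\n' "$(total_hours 6 5)"
```

With 6 tasks and 5 storage locations, the total comes to 30 hours per subtest, provided every task size fits in the partition.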

		### 1.1 Tests scaling in the cpu nodes
		- Test name: scaling_test_cpu
		- Test name: 1.1 scaling_test_cpu
		- Lowest recorded results: (to be filled in the protocol)

		| Cores (Nodes) | 36 (1) | 72 (2) | 144 (4) | 216 (6) | 288 (8) | 360 (10) | 432 (12) | 504 (14) | 576 (16) | 648 (18) | 720 (20) |
		| ------ | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
		| (ns/day) | | | | | | | | | | | |
		| Cores (Nodes) | 36 (1) | 72 (2) | 144 (4) | 360 (10) | 576 (16) | 720 (20) |
		| ------ | --- | --- | --- | --- | --- | --- |
		| (ns/day) | | | | | | |

		### 1.2 Tests scaling in the bigmem nodes
		- Test name: scaling_test_mem
		### 1.2 Tests scaling in the hugemem nodes
		- Test name: scaling_test_hugemem
		- Lowest recorded results: (to be filled in the protocol)

		| Cores (Nodes) | 36 (1) | 72 (2) |
		@@ -111,36 +102,31 @@ Note that this corresponds to the 80% of the performance recorded in our old
		- Test name: scaling_test_gpu
		- Lowest recorded results: (to be filled in the protocol)

		| Cores (Nodes) | 36 (1) | 72 (2) | 144 (4) | 216 (6) | 288 (8) | 360 (10) | 432 (12) | 504 (14) | 576 (16) | 648 (18) | 720 (20) |
		| ------ | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
		| (ns/day) | | | | | | | | | | | |



		| Cores (Nodes) | 36 (1) | 72 (2) | 144 (4) | 360 (10) | 576 (16) | 720 (20) |
		| ------ | --- | --- | --- | --- | --- | --- |
		| (ns/day) | | | | | | |

		### 1.4 Tests scaling in the gpu-mem nodes
		- Test name: scaling_test_biggpu
		- Lowest recorded results: (to be filled in the protocol)

		| Cores (Nodes) | 36 (1) | 72 (2) | 144 (4) | 216 (6) | 288 (8) |
		| ------ | --- | --- | --- | --- | --- |
		| (ns/day) | | | | | |

		| Cores (Nodes) | 36 (1) | 72 (2) | 144 (4) | 288 (8) |
		| ------ | --- | --- | --- | --- |
		| (ns/day) | | | | |


		## 2. Test cluster endurance - not internode calculations
		- Test name: endurance_test_all
		- Short description of the test: Full occupation of all nodes (cpu, mem, gpu, and biggpu) using 1 job per node in cpu and mem nodes, and n jobs per node in the gpu and biggpu nodes. It is a stress test where all the clusters should work under full load for at least 23h.
		- Short description of the test: It is a stress/endurance test where all the clusters should work under full load for at least 23 hours. Full occupation of all nodes (cpu, hugemem, gpu, and gpu-mem) is required at all times, using 1 job per node in cpu and hugemem nodes, and n jobs per node in the gpu and gpu-mem nodes (where n = number of gpus).
		- Number of tasks: 1
		- Number of computational nodes per job in each task: 1
		- Number of simultaneous jobs in each task: = number of delivered nodes
		- Storage for tasks: local scratches
		- Execution time per task: 23 hours
		- Total duration test: 23 hours
		- Special test conditions:
		- All jobs must run simultaneously for the fixed period of 23h without interruptions.
		- All cores must be used, hyperthreading or similar technology must be activated when available.
		- In gpu containing nodes there should be 1 job running for each gpu (n_jobs). The cores in the node should be divided equally between the n_jobs.
		- In gpu containing nodes there should be 1 job running for each gpu. The cores in the node should be equally divided between the gpus.
		- If any job fails the test must be restarted
		- All jobs must write to local scratches
		- Minimal performance for each job:
		- jobs in cpu and mem nodes: 50 ns/day
		- jobs in gpu nodes: 175 ns/day
		@@ -148,3 +134,9 @@ Note that this corresponds to the 80% of the performance recorded in our old
		The following reference performance values were obtained, making the requested values realistic:
		- 39 ns/day were obtained using 48/96 cores/threads (AMD EPYC 9454 48-Core Processor)
		- 218 ns/day were obtained using 12 cores (AMD EPYC 9454 48-Core Processor) and 1x L40s graphic cards
		- Reporting:
		All jobs need to exceed the minimal required performance. The minimal performance gathered (slowest jobs) in a) the cpu/hugemem nodes and b) the gpu and gpu-mem nodes will be recorded as in the empty table below.

		| Node group | cpu/hugemem | gpu/gpu-mem |
		| ------ | --- | --- |
		| minimum in job (ns/day) | | |
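One possible way to collect the slowest job and compare it with the required minimum is sketched below. It assumes that every job writes a standard Gromacs log ending with a `Performance:` line whose first number is the ns/day value; the function names and the way log files are passed to the script are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (names are illustrative, not part of the tender scripts).

# ns_per_day LOGFILE
# Prints the ns/day value from the "Performance:" line of a Gromacs log.
ns_per_day() { awk '/^Performance:/ { print $2 }' "$1"; }

# slowest_job LOGFILE...
# Prints the lowest ns/day value found in the given logs.
slowest_job() {
    for log in "$@"; do ns_per_day "$log"; done | sort -n | head -n 1
}

# meets_minimum VALUE MINIMUM
# Succeeds only if VALUE strictly exceeds MINIMUM.
meets_minimum() {
    awk -v v="$1" -v min="$2" 'BEGIN { exit !(v + 0 > min + 0) }'
}

# Usage (assumed): script.sh MINIMUM_NS_PER_DAY LOGFILE...
if [ "$#" -ge 2 ]; then
    minimum="$1"; shift
    slowest="$(slowest_job "$@")"
    if meets_minimum "$slowest" "$minimum"; then
        echo "PASS: slowest job ${slowest} ns/day exceeds ${minimum} ns/day"
    else
        echo "FAIL: slowest job ${slowest} ns/day does not exceed ${minimum} ns/day"
    fi
fi
```

For the cpu and mem nodes, for example, the script would be called with a minimum of 50 and the log files of all jobs in the task.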