Update file README.md (f986a47b) · Commits · tenders / aurum3

README.md

+9 −8

		@@ -88,13 +88,14 @@ The number of computational nodes per job in each task will be set to [1, 2, 4,
		 For each job, the results for the same number of nodes with the number of cores per node restricted to 36 should be at least 80% of the value obtained with our older aurum cluster, in the table above. For example, a job involving 10 nodes (360 cores) must report at least 0.8*104 = 83.2 ns/day.

		 Additional notes
		-- There might be a few unallocated nodes if the number of computational nodes is not a multiple of the number of computational nodes per job in each task.
		-- The tests must try all 5 different NFS shares in storage1, storage2, scratch1, scratch2, cryo2.
		-- All tasks in a given test need to be run sequentially without interruption to ensure that some jobs eventually use nodes placed in different sections of the cluster.
		 - Only 36 cores can be used for each node in this test.
		 - No GPU is allowed (if present in the node)
		+- All tasks in a given test need to be run sequentially without interruption to ensure that some jobs eventually use nodes placed in different sections of the cluster.
		+- There might be a few unallocated nodes if the number of computational nodes is not a multiple of the number of computational nodes per job in each task.
		+- The tests must try all 5 different NFS shares in storage1, storage2, scratch1, scratch2, cryo2.
		+
		 - Execution time for each test is 1 hour per task
		-- Total duration for each test is 5x
		+- Total duration for each test is 1 hour per task x number of tasks (2-11 depending on the number of delivered nodes in each group) x 5 storages

		 ### 1.1 Tests scaling in the cpu nodes
		 - Test name: scaling_test_cpu
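Note on the hunk above: the acceptance rule (at least 80% of the older aurum cluster's value) and the total-duration rule (1 hour per task x number of tasks x 5 storages) are plain arithmetic, so they can be checked with a few lines of Python. The following is a minimal sketch, not part of the README or the commit; the function and constant names are hypothetical, and the only figures taken from the text are the 0.8 factor, the 104 ns/day reference for 10 nodes, the five storage names, and the 2-11 task range.

```python
# Hypothetical helper names; only the numeric rules come from the README text.
REQUIRED_FRACTION = 0.8  # "at least 80% of the value obtained with our older aurum cluster"
STORAGES = ["storage1", "storage2", "scratch1", "scratch2", "cryo2"]  # the 5 NFS shares


def minimum_ns_per_day(reference_ns_per_day: float) -> float:
    """Lowest acceptable result for a job, given the older cluster's value."""
    return REQUIRED_FRACTION * reference_ns_per_day


def passes(measured_ns_per_day: float, reference_ns_per_day: float) -> bool:
    """True if the measured result meets the 80% acceptance rule."""
    return measured_ns_per_day >= minimum_ns_per_day(reference_ns_per_day)


def total_duration_hours(number_of_tasks: int, hours_per_task: float = 1.0) -> float:
    """Total test duration: hours per task x number of tasks x number of storages."""
    return hours_per_task * number_of_tasks * len(STORAGES)


if __name__ == "__main__":
    # README example: 10 nodes (360 cores), reference 104 ns/day.
    print(f"minimum: {minimum_ns_per_day(104):.1f} ns/day")
    # Duration range for 2 to 11 tasks, as stated in the notes.
    print(f"duration: {total_duration_hours(2):.0f} to {total_duration_hours(11):.0f} hours")
```

With the README's own numbers this gives a minimum of 83.2 ns/day and a total duration between 10 and 55 hours per test.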
		@@ -104,7 +105,7 @@ For each job, the results for the same number of nodes with the number of cores


		 ### 1.2 Tests scaling in the bigmem nodes
		-- Test name:: scaling_test_mem
		+- Test name: scaling_test_mem
		 - Number of tasks: 2 x 5
		 - Total duration test: $\sim 1 \text{ hour per task} \times N^{\text{scaling\_test\_mem}}_{\text{task,job}} \times 5 \text{ storages}$
		 - Number of simultaneous jobs in each task: $J_{mem}^{max}$
		@@ -112,7 +113,7 @@ For each job, the results for the same number of nodes with the number of cores


		 ### 1.3 Tests scaling in the gpu nodes
		-- Test name:: scaling_test_gpu
		+- Test name: scaling_test_gpu
		 - Number of tasks: 2 x 5
		 - Number of cores per node: Only 36 cores can be used for each node in this test
		 - Total duration test: $\sim 1 \text{ hour per task} \times N^{\text{scaling\_test\_gpu}}_{\text{task,job}} \times 5 \text{ storages}$
		@@ -122,7 +123,7 @@ For each job, the results for the same number of nodes with the number of cores


		 ### 1.4 Tests scaling in the gpu-mem nodes
		-- Test name:: scaling_test_biggpu
		+- Test name: scaling_test_biggpu
		 - Number of tasks: 5 x 5
		 - Total duration test: $\sim 1 \text{ hour per task} \times N^{\text{scaling\_test\_biggpu}}_{\text{task,job}} \times 5 \text{ storages}$
		 - Number of simultaneous jobs in each task: $J_{biggpu}^{max}$
		@@ -130,7 +131,7 @@ For each job, the results for the same number of nodes with the number of cores


		 ## 2. Test cluster endurance - not internode calculations
		-- Test name:: endurance_test_all
		+- Test name: endurance_test_all
		 - Short description of the test: Full occupation of all nodes (cpu, mem, gpu, and biggpu) using 1 job per node in cpu and mem nodes, and n jobs per node in the gpu and biggpu nodes. It is a stress test where all the clusters should work under full load for at least 23h.
		 - Number of tasks: 1
		 - Number of computational nodes per job in each task: 1