PME-JAC_NVE on V100s PCIe (ns/day)
Running AMBER version 16.8 (untuned on Volta). CPU-only node (blue): dual Intel Xeon E5-2690 v4 @ 2.6GHz [3.5GHz Turbo] (Broadwell). GPU nodes (green): same CPUs + Tesla V100 PCIe (16GB) GPUs.
1 Broadwell node: 36.53 ns/day
1 node + 2x V100 PCIe (16GB) per node: 490.77 ns/day (13.4X)
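The "X" figures on these slides appear to be the GPU-node throughput divided by the CPU-only baseline. A minimal sketch of that arithmetic, using the two ns/day values from the PME-JAC_NVE V100 PCIe slide; the function name `speedup` is illustrative and not part of any benchmark tooling:

```python
def speedup(gpu_rate: float, cpu_rate: float) -> float:
    """Return GPU-node throughput as a multiple of the CPU-only baseline."""
    if cpu_rate <= 0:
        raise ValueError("baseline throughput must be positive")
    return gpu_rate / cpu_rate


# ns/day values taken from the PME-JAC_NVE slide (V100 PCIe, AMBER 16.8).
jac_nve_speedup = speedup(490.77, 36.53)
print(f"{jac_nve_speedup:.1f}X")
```

The same ratio can be recomputed for any slide as a quick check on the printed labels.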
PME-JAC_NVE on V100s SXM2 (ns/day)
Running AMBER version 16.8 (untuned on Volta). CPU-only node (blue): dual Intel Xeon E5-2698 v4 @ 2.2GHz [3.6GHz Turbo] (Broadwell). GPU nodes (green): same CPUs + Tesla V100 SXM2 (16GB) GPUs.
1 Broadwell node: 36.53 ns/day
1 node + 2x V100 SXM2 (16GB) per node: 539.78 ns/day (14.8X)
1 node + 4x V100 SXM2 (16GB) per node: 583.33 ns/day (16.0X)
PME-JAC_NPT_4fs on V100s PCIe (ns/day)
Running AMBER version 16.8 (untuned on Volta). CPU-only node (blue): dual Intel Xeon E5-2690 v4 @ 2.6GHz [3.5GHz Turbo] (Broadwell). GPU nodes (green): same CPUs + Tesla V100 PCIe (16GB) GPUs.
1 Broadwell node: 65.74 ns/day
1 node + 2x V100 PCIe (16GB) per node: 863.80 ns/day (13.1X)
PME-JAC_NPT_4fs on V100s SXM2 (ns/day)
Running AMBER version 16.8 (untuned on Volta). CPU-only node (blue): dual Intel Xeon E5-2698 v4 @ 2.2GHz [3.6GHz Turbo] (Broadwell). GPU nodes (green): same CPUs + Tesla V100 SXM2 (16GB) GPUs.
1 Broadwell node: 65.74 ns/day
1 node + 2x V100 SXM2 (16GB) per node: 946.57 ns/day (14.4X)
1 node + 4x V100 SXM2 (16GB) per node: 1006.32 ns/day (15.3X)
PME-JAC_NVE_4fs on V100s PCIe (ns/day)
Running AMBER version 16.8 (untuned on Volta). CPU-only node (blue): dual Intel Xeon E5-2690 v4 @ 2.6GHz [3.5GHz Turbo] (Broadwell). GPU nodes (green): same CPUs + Tesla V100 PCIe (16GB) GPUs.
1 Broadwell node: 67.10 ns/day
1 node + 2x V100 PCIe (16GB) per node: 940.32 ns/day (labeled 26.0X on the slide; 940.32 / 67.10 is about 14.0X)
PME-JAC_NVE_4fs on V100s SXM2 (ns/day)
Running AMBER version 16.8 (untuned on Volta). CPU-only node (blue): dual Intel Xeon E5-2698 v4 @ 2.2GHz [3.6GHz Turbo] (Broadwell). GPU nodes (green): same CPUs + Tesla V100 SXM2 (16GB) GPUs.
1 Broadwell node: 67.10 ns/day
1 node + 2x V100 SXM2 (16GB) per node: 1027.44 ns/day (15.3X)
1 node + 4x V100 SXM2 (16GB) per node: 1123.40 ns/day (16.7X)
PME-STMV_NPT_4fs on V100s PCIe (ns/day)
Running AMBER version 16.8 (untuned on Volta). CPU-only node (blue): dual Intel Xeon E5-2690 v4 @ 2.6GHz [3.5GHz Turbo] (Broadwell). GPU nodes (green): same CPUs + Tesla V100 PCIe (16GB) GPUs.
1 Broadwell node: 1.06 ns/day
1 node + 2x V100 PCIe (16GB) per node: 33.21 ns/day (31.3X)
PME-STMV_NPT_4fs on V100s SXM2 (ns/day)
Running AMBER version 16.8 (untuned on Volta). CPU-only node (blue): dual Intel Xeon E5-2698 v4 @ 2.2GHz [3.6GHz Turbo] (Broadwell). GPU nodes (green): same CPUs + Tesla V100 SXM2 (16GB) GPUs.
1 Broadwell node: 1.06 ns/day
1 node + 2x V100 SXM2 (16GB) per node: 37.24 ns/day (35.1X)
GB-Myoglobin on V100s PCIe (ns/day)
Running AMBER version 16.8 (untuned on Volta). CPU-only node (blue): dual Intel Xeon E5-2690 v4 @ 2.6GHz [3.5GHz Turbo] (Broadwell). GPU nodes (green): same CPUs + Tesla V100 PCIe (16GB) GPUs.
1 Broadwell node: 22.30 ns/day
1 node + 2x V100 PCIe (16GB) per node: 699.21 ns/day (31.4X)
GB-Myoglobin on V100s SXM2 (ns/day)
Running AMBER version 16.8 (untuned on Volta). CPU-only node (blue): dual Intel Xeon E5-2698 v4 @ 2.2GHz [3.6GHz Turbo] (Broadwell). GPU nodes (green): same CPUs + Tesla V100 SXM2 (16GB) GPUs.
1 Broadwell node: 22.30 ns/day
1 node + 2x V100 SXM2 (16GB) per node: 750.76 ns/day (33.7X)
GB-Nucleosome on V100s PCIe (ns/day)
Running AMBER version 16.8 (untuned on Volta). CPU-only node (blue): dual Intel Xeon E5-2690 v4 @ 2.6GHz [3.5GHz Turbo] (Broadwell). GPU nodes (green): same CPUs + Tesla V100 PCIe (16GB) GPUs.
1 Broadwell node: 0.31 ns/day
1 node + 2x V100 PCIe (16GB) per node: 49.14 ns/day (158.5X)
1 node + 4x V100 PCIe (16GB) per node: 78.39 ns/day (252.9X)
GB-Nucleosome on V100s SXM2 (ns/day)
Running AMBER version 16.8 (untuned on Volta). CPU-only node (blue): dual Intel Xeon E5-2698 v4 @ 2.2GHz [3.6GHz Turbo] (Broadwell). GPU nodes (green): same CPUs + Tesla V100 SXM2 (16GB) GPUs.
1 Broadwell node: 0.31 ns/day
1 node + 2x V100 SXM2 (16GB) per node: 52.89 ns/day (170.6X)
1 node + 4x V100 SXM2 (16GB) per node: 92.46 ns/day (298.3X)
Rubisco on V100s PCIe (ns/day)
Running AMBER version 16.8 (untuned on Volta). CPU-only node (blue): dual Intel Xeon E5-2690 v4 @ 2.6GHz [3.5GHz Turbo] (Broadwell). GPU nodes (green): same CPUs + Tesla V100 PCIe (16GB) GPUs.
1 Broadwell node: 0.01 ns/day
1 node + 2x V100 PCIe (16GB) per node: 2.79 ns/day (279.0X)
1 node + 4x V100 PCIe (16GB) per node: 5.22 ns/day (522.0X)
1 node + 8x V100 PCIe (16GB) per node: 6.78 ns/day (678.0X)
Rubisco on V100s SXM2 (ns/day)
Running AMBER version 16.8 (untuned on Volta). CPU-only node (blue): dual Intel Xeon E5-2698 v4 @ 2.2GHz [3.6GHz Turbo] (Broadwell). GPU nodes (green): same CPUs + Tesla V100 SXM2 (16GB) GPUs.
1 Broadwell node: 0.01 ns/day
1 node + 2x V100 SXM2 (16GB) per node: 3.00 ns/day (300.0X)
1 node + 4x V100 SXM2 (16GB) per node: 5.96 ns/day (596.0X)
1 node + 8x V100 SXM2 (16GB) per node: 7.00 ns/day (700.0X)
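Where a slide reports several GPU counts, the throughput can also be read as scaling efficiency relative to the smallest GPU configuration. The sketch below does this for the Rubisco V100 SXM2 values (3.00, 5.96 and 7.00 ns/day). It assumes the three values correspond to 2, 4 and 8 GPUs in ascending order, which the extracted slide text does not state explicitly; the function name is illustrative.

```python
def scaling_efficiency(results: dict) -> dict:
    """Map GPU count -> efficiency relative to the smallest GPU count.

    Efficiency is (throughput ratio) / (GPU count ratio); 1.0 means
    perfect linear scaling from the smallest configuration.
    """
    base_gpus = min(results)
    base_rate = results[base_gpus]
    return {
        gpus: (rate / base_rate) / (gpus / base_gpus)
        for gpus, rate in sorted(results.items())
    }


# ns/day from the Rubisco V100 SXM2 slide; the GPU-count pairing is assumed.
rubisco_sxm2 = {2: 3.00, 4: 5.96, 8: 7.00}
efficiency = scaling_efficiency(rubisco_sxm2)
for gpus, value in efficiency.items():
    print(f"{gpus}x V100 SXM2: {value:.0%} of linear scaling")
```

Under that pairing, going from 2 to 4 GPUs is close to linear, while 8 GPUs deliver a little under 60% of linear scaling.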
AMBER 16 February 2017
PME-Cellulose_NPT on P100s PCIe (ns/day)
Running AMBER version 16.3. CPU-only node (blue): dual Intel Xeon E5-2699 v4 @ 2.2GHz [3.6GHz Turbo] (Broadwell). GPU nodes (green): same CPUs + Tesla P100 PCIe (16GB) GPUs. The 1x P100 PCIe configuration is paired with a single Intel Xeon E5-2699 v4.
1 Broadwell node: 2.35 ns/day
1 node + 1x P100 PCIe (16GB) per node: 21.85 ns/day (9.3X)
1 node + 2x P100 PCIe (16GB) per node: 30.00 ns/day (12.8X)
PME-Cellulose_NPT on P100s SXM2 (ns/day)
Running AMBER version 16.3. CPU-only node (blue): dual Intel Xeon E5-2699 v4 @ 2.2GHz [3.6GHz Turbo] (Broadwell). GPU nodes (green): dual Intel Xeon E5-2698 v4 @ 2.2GHz [3.6GHz Turbo] (Broadwell) + Tesla P100 SXM2 GPUs. The 1x P100 SXM2 configuration is paired with a single Intel Xeon E5-2698 v4.
1 Broadwell node: 2.35 ns/day
1 node + 1x P100 SXM2 per node: 23.37 ns/day (9.9X)
1 node + 2x P100 SXM2 per node: 32.22 ns/day (13.7X)
1 node + 4x P100 SXM2 per node: 36.65 ns/day (15.6X)
PME-Cellulose_NVE on P100s PCIe (ns/day)
Running AMBER version 16.3. CPU-only node (blue): dual Intel Xeon E5-2699 v4 @ 2.2GHz [3.6GHz Turbo] (Broadwell). GPU nodes (green): same CPUs + Tesla P100 PCIe (16GB) GPUs. The 1x P100 PCIe configuration is paired with a single Intel Xeon E5-2699 v4.
1 Broadwell node: 2.47 ns/day
1 node + 1x P100 PCIe (16GB) per node: 23.34 ns/day (9.4X)
1 node + 2x P100 PCIe (16GB) per node: 32.55 ns/day (13.2X)
PME-Cellulose_NVE on P100s SXM2 (ns/day)
Running AMBER version 16.3. CPU-only node (blue): dual Intel Xeon E5-2699 v4 @ 2.2GHz [3.6GHz Turbo] (Broadwell). GPU nodes (green): dual Intel Xeon E5-2698 v4 @ 2.2GHz [3.6GHz Turbo] (Broadwell) + Tesla P100 SXM2 GPUs. The 1x P100 SXM2 configuration is paired with a single Intel Xeon E5-2698 v4.
1 Broadwell node: 2.47 ns/day
1 node + 1x P100 SXM2 per node: 24.94 ns/day (10.1X)
1 node + 2x P100 SXM2 per node: 35.16 ns/day (14.2X)
1 node + 4x P100 SXM2 per node: 40.88 ns/day (16.6X)
PME-FactorIX_NPT on P100s PCIe (ns/day)
Running AMBER version 16.3. CPU-only node (blue): dual Intel Xeon E5-2699 v4 @ 2.2GHz [3.6GHz Turbo] (Broadwell). GPU nodes (green): same CPUs + Tesla P100 PCIe (16GB) GPUs. The 1x P100 PCIe configuration is paired with a single Intel Xeon E5-2699 v4.
1 Broadwell node: 11.43 ns/day
1 node + 1x P100 PCIe (16GB) per node: 98.77 ns/day (8.6X)
1 node + 2x P100 PCIe (16GB) per node: 132.86 ns/day (11.6X)
PME-FactorIX_NPT on P100s SXM2 (ns/day)
Running AMBER version 16.3. CPU-only node (blue): dual Intel Xeon E5-2699 v4 @ 2.2GHz [3.6GHz Turbo] (Broadwell). GPU nodes (green): dual Intel Xeon E5-2698 v4 @ 2.2GHz [3.6GHz Turbo] (Broadwell) + Tesla P100 SXM2 GPUs. The 1x P100 SXM2 configuration is paired with a single Intel Xeon E5-2698 v4.
1 Broadwell node: 11.43 ns/day
1 node + 1x P100 SXM2 per node: 106.25 ns/day (9.3X)
1 node + 2x P100 SXM2 per node: 144.11 ns/day (12.6X)
1 node + 4x P100 SXM2 per node: 159.80 ns/day (14.0X)
PME-FactorIX_NVE on P100s PCIe (ns/day)
Running AMBER version 16.3. CPU-only node (blue): dual Intel Xeon E5-2699 v4 @ 2.2GHz [3.6GHz Turbo] (Broadwell). GPU nodes (green): same CPUs + Tesla P100 PCIe (16GB) GPUs. The 1x P100 PCIe configuration is paired with a single Intel Xeon E5-2699 v4.
1 Broadwell node: 11.98 ns/day
1 node + 1x P100 PCIe (16GB) per node: 105.86 ns/day (8.8X)
1 node + 2x P100 PCIe (16GB) per node: 145.83 ns/day (12.2X)
PME-FactorIX_NVE on P100s SXM2 (ns/day)
Running AMBER version 16.3. CPU-only node (blue): dual Intel Xeon E5-2699 v4 @ 2.2GHz [3.6GHz Turbo] (Broadwell). GPU nodes (green): dual Intel Xeon E5-2698 v4 @ 2.2GHz [3.6GHz Turbo] (Broadwell) + Tesla P100 SXM2 GPUs. The 1x P100 SXM2 configuration is paired with a single Intel Xeon E5-2698 v4.
1 Broadwell node: 11.98 ns/day
1 node + 1x P100 SXM2 per node: 114.88 ns/day (9.6X)
1 node + 2x P100 SXM2 per node: 159.24 ns/day (13.3X)
1 node + 4x P100 SXM2 per node: 178.02 ns/day (14.9X)
PME-JAC_NPT on P100s PCIe (ns/day)
Running AMBER version 16.3. CPU-only node (blue): dual Intel Xeon E5-2699 v4 @ 2.2GHz [3.6GHz Turbo] (Broadwell). GPU nodes (green): same CPUs + Tesla P100 PCIe (16GB) GPUs. The 1x P100 PCIe configuration is paired with a single Intel Xeon E5-2699 v4.
1 Broadwell node: 45.89 ns/day
1 node + 1x P100 PCIe (16GB) per node: 283.60 ns/day (6.2X)
1 node + 2x P100 PCIe (16GB) per node: 327.69 ns/day (7.1X)
PME-JAC_NPT on P100s SXM2 (ns/day)
Running AMBER version 16.3. CPU-only node (blue): dual Intel Xeon E5-2699 v4 @ 2.2GHz [3.6GHz Turbo] (Broadwell). GPU nodes (green): dual Intel Xeon E5-2698 v4 @ 2.2GHz [3.6GHz Turbo] (Broadwell) + Tesla P100 SXM2 GPUs. The 1x P100 SXM2 configuration is paired with a single Intel Xeon E5-2698 v4. The slide's axis labels read "P100 PCIe", while its title and legend both say SXM2.
1 Broadwell node: 45.89 ns/day
1 node + 1x P100 SXM2 per node: 310.52 ns/day (6.8X)
1 node + 2x P100 SXM2 per node: 360.64 ns/day (7.9X)
1 node + 4x P100 SXM2 per node: 423.09 ns/day (9.2X)
PME-JAC_NVE on P100s PCIe (ns/day)
Running AMBER version 16.3. CPU-only node (blue): dual Intel Xeon E5-2699 v4 @ 2.2GHz [3.6GHz Turbo] (Broadwell). GPU nodes (green): same CPUs + Tesla P100 PCIe (16GB) GPUs. The 1x P100 PCIe configuration is paired with a single Intel Xeon E5-2699 v4.
1 Broadwell node: 47.90 ns/day
1 node + 1x P100 PCIe (16GB) per node: 308.46 ns/day (6.4X)
1 node + 2x P100 PCIe (16GB) per node: 363.79 ns/day (7.6X)
PME-JAC_NVE on P100s SXM2 (ns/day)
Running AMBER version 16.3. CPU-only node (blue): dual Intel Xeon E5-2699 v4 @ 2.2GHz [3.6GHz Turbo] (Broadwell). GPU nodes (green): dual Intel Xeon E5-2698 v4 @ 2.2GHz [3.6GHz Turbo] (Broadwell) + Tesla P100 SXM2 GPUs. The 1x P100 SXM2 configuration is paired with a single Intel Xeon E5-2698 v4. The slide's axis labels read "P100 PCIe", while its title and legend both say SXM2.
1 Broadwell node: 47.90 ns/day
1 node + 1x P100 SXM2 per node: 339.81 ns/day (7.1X)
1 node + 2x P100 SXM2 per node: 402.18 ns/day (8.4X)
1 node + 4x P100 SXM2 per node: 473.10 ns/day (9.9X)
GB-Myoglobin on P100s PCIe (ns/day)
Running AMBER version 16.3. CPU-only node (blue): dual Intel Xeon E5-2699 v4 @ 2.2GHz [3.6GHz Turbo] (Broadwell). GPU nodes (green): same CPUs + Tesla P100 PCIe (16GB) GPUs. The 1x P100 PCIe configuration is paired with a single Intel Xeon E5-2699 v4.
1 Broadwell node: 28.86 ns/day
1 node + 1x P100 PCIe (16GB) per node: 483.37 ns/day (16.7X)
1 node + 4x P100 PCIe (16GB) per node: 561.94 ns/day (19.5X)
GB-Myoglobin on P100s SXM2 (ns/day)
Running AMBER version 16.3. CPU-only node (blue): dual Intel Xeon E5-2699 v4 @ 2.2GHz [3.6GHz Turbo] (Broadwell). GPU nodes (green): dual Intel Xeon E5-2698 v4 @ 2.2GHz [3.6GHz Turbo] (Broadwell) + Tesla P100 SXM2 GPUs. The 1x P100 SXM2 configuration is paired with a single Intel Xeon E5-2698 v4. The slide's axis labels read "P100 PCIe", while its title and legend both say SXM2.
1 Broadwell node: 28.86 ns/day
1 node + 1x P100 SXM2 per node: 534.28 ns/day (18.5X)
1 node + 4x P100 SXM2 per node: 639.37 ns/day (22.2X)
GB-Nucleosome on P100s PCIe (ns/day)
Running AMBER version 16.3. CPU-only node (blue): dual Intel Xeon E5-2699 v4 @ 2.2GHz [3.6GHz Turbo] (Broadwell). GPU nodes (green): same CPUs + Tesla P100 PCIe (16GB) GPUs. The 1x P100 PCIe configuration is paired with a single Intel Xeon E5-2699 v4.
1 Broadwell node: 0.40 ns/day
1 node + 1x P100 PCIe (16GB) per node: 11.91 ns/day (29.8X)
1 node + 2x P100 PCIe (16GB) per node: 22.77 ns/day (56.9X)
1 node + 4x P100 PCIe (16GB) per node: 39.91 ns/day (99.8X)
1 node + 8x P100 PCIe (16GB) per node: 45.92 ns/day (114.8X)
GB-Nucleosome on P100s SXM2 (ns/day)
Running AMBER version 16.3. CPU-only node (blue): dual Intel Xeon E5-2699 v4 @ 2.2GHz [3.6GHz Turbo] (Broadwell). GPU nodes (green): dual Intel Xeon E5-2698 v4 @ 2.2GHz [3.6GHz Turbo] (Broadwell) + Tesla P100 SXM2 GPUs. The 1x P100 SXM2 configuration is paired with a single Intel Xeon E5-2698 v4.
1 Broadwell node: 0.40 ns/day
1 node + 1x P100 SXM2 per node: 13.36 ns/day (33.4X)
1 node + 2x P100 SXM2 per node: 25.53 ns/day (63.8X)
1 node + 4x P100 SXM2 per node: 46.29 ns/day (115.7X)
1 node + 8x P100 SXM2 per node: 48.29 ns/day (120.7X)
Rubisco-75K on P100s PCIe (ns/day)
Running AMBER version 16.3. CPU-only node (blue): dual Intel Xeon E5-2699 v4 @ 2.2GHz [3.6GHz Turbo] (Broadwell). GPU nodes (green): same CPUs + Tesla P100 PCIe (16GB) GPUs. The 1x P100 PCIe configuration is paired with a single Intel Xeon E5-2699 v4.
1 Broadwell node: 0.01 ns/day
1 node + 1x P100 PCIe (16GB) per node: 0.71 ns/day (71.0X)
1 node + 2x P100 PCIe (16GB) per node: 1.40 ns/day (140.0X)
1 node + 4x P100 PCIe (16GB) per node: 2.69 ns/day (269.0X)
1 node + 8x P100 PCIe (16GB) per node: 4.20 ns/day (420.0X)
Rubisco-75K on P100s SXM2 (ns/day)
Running AMBER version 16.3. CPU-only node (blue): dual Intel Xeon E5-2699 v4 @ 2.2GHz [3.6GHz Turbo] (Broadwell). GPU nodes (green): dual Intel Xeon E5-2698 v4 @ 2.2GHz [3.6GHz Turbo] (Broadwell) + Tesla P100 SXM2 GPUs. The 1x P100 SXM2 configuration is paired with a single Intel Xeon E5-2698 v4.
1 Broadwell node: 0.01 ns/day
1 node + 1x P100 SXM2 per node: 0.80 ns/day (80.0X)
1 node + 2x P100 SXM2 per node: 1.57 ns/day (157.0X)
1 node + 4x P100 SXM2 per node: 3.06 ns/day (306.0X)
1 node + 8x P100 SXM2 per node: 4.46 ns/day (446.0X)
Recommended GPU Node Configuration for AMBER Computational Chemistry
Workstation or Single Node Configuration
# of CPU sockets: 2
Cores per CPU socket: 6+ (1 CPU core drives 1 GPU)
CPU speed (GHz): 2.66+
System memory per node (GB): 16
GPUs: P100, V100
# of GPUs per CPU socket: 1-4
GPU memory preference (GB): 6
GPU to CPU connection: PCIe 3.0 16x or higher
Server storage: 2 TB
Network configuration: InfiniBand QDR or better
Scale to multiple nodes with same single node configuration
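As a rough illustration, the recommendations above can be encoded as a checklist and applied to a candidate node. This is only a sketch: the `NodeSpec` class, its field names and the `amber_shortfalls` function are hypothetical, and it assumes the slide's figures are meant as minimums (with 1-4 GPUs per socket as an allowed range).

```python
from dataclasses import dataclass


@dataclass
class NodeSpec:
    """Hypothetical description of one node; field names are illustrative."""
    cpu_sockets: int
    cores_per_socket: int
    cpu_ghz: float
    system_memory_gb: int
    gpus_per_socket: int
    gpu_memory_gb: int


def amber_shortfalls(node: NodeSpec) -> list:
    """Return the slide's AMBER recommendations that the node does not meet."""
    problems = []
    if node.cpu_sockets < 2:
        problems.append("fewer than 2 CPU sockets")
    if node.cores_per_socket < 6:
        problems.append("fewer than 6 cores per CPU socket")
    if node.cores_per_socket < node.gpus_per_socket:
        problems.append("fewer CPU cores than GPUs (1 core drives 1 GPU)")
    if node.cpu_ghz < 2.66:
        problems.append("CPU speed below 2.66 GHz")
    if node.system_memory_gb < 16:
        problems.append("less than 16 GB system memory per node")
    if not 1 <= node.gpus_per_socket <= 4:
        problems.append("GPUs per CPU socket outside 1-4")
    if node.gpu_memory_gb < 6:
        problems.append("less than 6 GB GPU memory")
    return problems


# A made-up node: 2 sockets, 14 cores each at 2.6 GHz, 64 GB RAM, 1 GPU per socket.
example_node = NodeSpec(2, 14, 2.6, 64, 1, 16)
print(amber_shortfalls(example_node))
```

Read this way, a 2.6 GHz part falls just short of the 2.66+ GHz guideline even though every other item is met.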
CHARMM DOMDEC-GUI July 2016
CHARMM DOMDEC-GUI 465 K System Benchmark: 465 K System (Her1_HER1_membrane) on K80 (ns/day, higher is better)
Running CHARMM version c40a1. CPU-only node (blue): dual Intel Xeon E5-2698 v3 @ 2.30 GHz (Haswell). GPU nodes (green): same CPUs + Tesla K80 (autoboost) GPUs.
Benchmarks were done based on the STANDARD CHARMM c40a1 version by the Yang group (FSU), who is responsible for possible benchmarking error.
1 Haswell node: 0.36 ns/day
1 node + 1x K80 per node: 2.15 ns/day (6.0X)
CHARMM DOMDEC-GUI 534 K System Benchmark: 534 K System (POPC_PSPC_CHL1mixture) on K80 (ns/day, higher is better)
Running CHARMM version c40a1. CPU-only node (blue): dual Intel Xeon E5-2698 v3 @ 2.30 GHz (Haswell). GPU nodes (green): same CPUs + Tesla K80 (autoboost) GPUs.
1 Haswell node: 0.18 ns/day
1 node + 1x K80 per node: 1.43 ns/day (8.0X)
CHARMM DOMDEC-GUI 20 K System Benchmark: 20 K System (Crambin) on M40 (ns/day, higher is better)
Running CHARMM version c40a1. CPU-only node (blue): dual Intel Xeon E5-2698 v3 @ 2.30 GHz (Haswell). GPU nodes (green): same CPUs + Tesla M40 GPUs.
1 Haswell node: 16.00 ns/day
1 node + 1x M40 per node: 59.68 ns/day (3.7X)
CHARMM DOMDEC-GUI 61 K System Benchmark: 61 K System (GlnBP) on M40 (ns/day, higher is better)
Running CHARMM version c40a1. CPU-only node (blue): dual Intel Xeon E5-2698 v3 @ 2.30 GHz (Haswell). GPU nodes (green): same CPUs + Tesla M40 GPUs.
1 Haswell node: 3.90 ns/day
1 node + 1x M40 per node: 25.08 ns/day (6.4X)
CHARMM DOMDEC-GUI 465 K System Benchmark: 465 K System (Her1_HER1_membrane) on M40 (ns/day, higher is better)
Running CHARMM version c40a1. CPU-only node (blue): dual Intel Xeon E5-2698 v3 @ 2.30 GHz (Haswell). GPU nodes (green): same CPUs + Tesla M40 GPUs.
1 Haswell node: 0.36 ns/day
1 node + 1x M40 per node: 2.27 ns/day (6.3X)
GROMACS 2016.4 October 2017
Water 1.5M on V100s PCIe (ns/day)
Running GROMACS version 2016.4 (untuned on Volta). CPU-only node (blue): dual Intel Xeon E5-2690 v4 @ 2.6GHz [3.5GHz Turbo] (Broadwell). GPU nodes (green): same CPUs + Tesla V100 PCIe (16GB) GPUs. The slide title reads "P100s PCIe", but its legend and axis labels name V100 PCIe; both GPU bars are labeled "1 node + 2x V100 PCIe per node (16GB)".
1 Broadwell node: 2.28 ns/day
GPU bar 1: 5.34 ns/day (2.3X)
GPU bar 2: 7.30 ns/day (3.2X)
Water 3M on V100s PCIe (ns/day)
Running GROMACS version 2016.4 (untuned on Volta). CPU-only node (blue): dual Intel Xeon E5-2690 v4 @ 2.6GHz [3.5GHz Turbo] (Broadwell). GPU nodes (green): same CPUs + Tesla V100 PCIe (16GB) GPUs. The slide title reads "P100s PCIe", but its legend and axis labels name V100 PCIe; both GPU bars are labeled "1 node + 2x V100 PCIe per node (16GB)".
1 Broadwell node: 1.12 ns/day
GPU bar 1: 2.53 ns/day (2.3X)
GPU bar 2: 3.85 ns/day (3.4X)
GROMACS 2016 October 2016
Water 1.5M on P100 PCIes (ns/day)
Running GROMACS version 2016. CPU-only node (blue): dual Intel Xeon E5-2698 v4 @ 2.2GHz [3.6GHz Turbo] (Broadwell). GPU nodes (green): same CPUs + Tesla P100 PCIe (16GB) GPUs.
1 Broadwell node: 2.79 ns/day
1 node + 2x P100 PCIe (16GB) per node: 6.34 ns/day (2.3X)
1 node + 4x P100 PCIe (16GB) per node: 7.11 ns/day (2.5X)
Water 3M on P100 PCIes (ns/day)
Running GROMACS version 2016. CPU-only node (blue): dual Intel Xeon E5-2698 v4 @ 2.2GHz [3.6GHz Turbo] (Broadwell). GPU nodes (green): same CPUs + Tesla P100 PCIe (16GB) GPUs.
1 Broadwell node: 1.32 ns/day
1 node + 2x P100 PCIe (16GB) per node: 3.16 ns/day (2.4X)
1 node + 4x P100 PCIe (16GB) per node: 3.43 ns/day (2.6X)
GROMACS 5.1.2 February 2017
Water 1.5M on P100s PCIe (ns/day)
Running GROMACS version 5.1.2. CPU-only node (blue): dual Intel Xeon E5-2699 v4 @ 2.2GHz [3.6GHz Turbo] (Broadwell). GPU nodes (green): same CPUs + Tesla P100 PCIe (16GB) GPUs. The 1x P100 PCIe configuration is paired with a single Intel Xeon E5-2699 v4.
1 Broadwell node: 3.04 ns/day
1 node + 1x P100 PCIe (16GB) per node: 4.39 ns/day (1.4X)
1 node + 2x P100 PCIe (16GB) per node: 6.96 ns/day (2.3X)
1 node + 4x P100 PCIe (16GB) per node: 7.21 ns/day (2.4X)
Water 1.5M on P100s SXM2 (ns/day)
Running GROMACS version 5.1.2. CPU-only node (blue): dual Intel Xeon E5-2699 v4 @ 2.2GHz [3.6GHz Turbo] (Broadwell). GPU nodes (green): dual Intel Xeon E5-2698 v4 @ 2.2GHz [3.6GHz Turbo] (Broadwell) + Tesla P100 SXM2 GPUs. The 1x P100 SXM2 configuration is paired with a single Intel Xeon E5-2698 v4.
1 Broadwell node: 3.04 ns/day
1 node + 1x P100 SXM2 per node: 4.11 ns/day (1.4X)
1 node + 2x P100 SXM2 per node: 6.70 ns/day (2.2X)
1 node + 4x P100 SXM2 per node: 7.18 ns/day (2.4X)
1 node + 8x P100 SXM2 per node: 7.88 ns/day (2.6X)
Water 3M on P100s PCIe (ns/day)
Running GROMACS version 5.1.2. CPU-only node (blue): dual Intel Xeon E5-2699 v4 @ 2.2GHz [3.6GHz Turbo] (Broadwell). GPU nodes (green): same CPUs + Tesla P100 PCIe (16GB) GPUs. The 1x P100 PCIe configuration is paired with a single Intel Xeon E5-2699 v4.
1 Broadwell node: 1.38 ns/day
1 node + 1x P100 PCIe (16GB) per node: 1.96 ns/day (1.4X)
1 node + 2x P100 PCIe (16GB) per node: 3.43 ns/day (2.5X)
1 node + 4x P100 PCIe (16GB) per node: 3.80 ns/day (2.8X)
Water 3M on P100s SXM2 (ns/day)
Running GROMACS version 5.1.2. CPU-only node (blue): dual Intel Xeon E5-2699 v4 @ 2.2GHz [3.6GHz Turbo] (Broadwell). GPU nodes (green): dual Intel Xeon E5-2698 v4 @ 2.2GHz [3.6GHz Turbo] (Broadwell) + Tesla P100 SXM2 GPUs. The 1x P100 SXM2 configuration is paired with a single Intel Xeon E5-2698 v4.
1 Broadwell node: 1.38 ns/day
1 node + 1x P100 SXM2 per node: 1.84 ns/day (1.3X)
1 node + 2x P100 SXM2 per node: 3.50 ns/day (2.5X)
1 node + 4x P100 SXM2 per node: 3.82 ns/day (2.8X)
Recommended GPU Node Configuration for GROMACS Computational Chemistry
Workstation or Single Node Configuration
# of CPU sockets: 2
Cores per CPU socket: 6+
CPU speed (GHz): 2.66+
System memory per socket (GB): 32
GPUs: Tesla P100, V100
# of GPUs per CPU socket: 1x (Kepler GPUs: need fast Sandy Bridge or Ivy Bridge, or high-end AMD Opterons)
GPU memory preference (GB): 6
GPU to CPU connection: PCIe 3.0 or higher
Server storage: 500 GB or higher
Network configuration: Gemini, InfiniBand
HOOMD-Blue 2.1.6 September 2017
lj-liquid on V100s PS PCIe (axis labeled ns/day)
Running HOOMD-Blue version 2.1.6 (untuned on Volta). CPU-only node (blue): dual Intel Xeon E5-2690 v4 @ 2.6GHz [3.5GHz Turbo] (Broadwell). GPU nodes (green): same CPUs + Tesla P100 PCIe (16GB) or V100 PS PCIe (16GB) GPUs.
1 Broadwell node: 238.47
1 node + 2x P100 PCIe (16GB) per node: 2730.94
1 node + 2x V100 PS PCIe (16GB) per node: 3890.73
microsphere on V100s PS PCIe (axis labeled ns/day)
Running HOOMD-Blue version 2.1.6 (untuned on Volta). CPU-only node (blue): dual Intel Xeon E5-2690 v4 @ 2.6GHz [3.5GHz Turbo] (Broadwell). GPU nodes (green): same CPUs + Tesla P100 PCIe (16GB) or V100 PS PCIe (16GB) GPUs.
1 Broadwell node: 9.79
1 node + 2x P100 PCIe (16GB) per node: 182.20
1 node + 4x P100 PCIe (16GB) per node: 262.79
1 node + 8x P100 PCIe (16GB) per node: 360.26
1 node + 2x V100 PS PCIe (16GB) per node: 298.15
1 node + 4x V100 PS PCIe (16GB) per node: 371.06
1 node + 8x V100 PS PCIe (16GB) per node: 466.88
quasicrystal on V100s PS PCIe (axis labeled ns/day)
Running HOOMD-Blue version 2.1.6 (untuned on Volta). CPU-only node (blue): dual Intel Xeon E5-2690 v4 @ 2.6GHz [3.5GHz Turbo] (Broadwell). GPU nodes (green): same CPUs + Tesla P100 PCIe (16GB) or V100 PS PCIe (16GB) GPUs.
1 Broadwell node: 52.82
1 node + 2x P100 PCIe (16GB) per node: 1184.14
1 node + 4x P100 PCIe (16GB) per node: 1819.76
1 node + 2x V100 PS PCIe (16GB) per node: 2371.16
1 node + 4x V100 PS PCIe (16GB) per node: 2530.74
triblock-copolymer on V100s PS PCIe (axis labeled ns/day)
Running HOOMD-Blue version 2.1.6 (untuned on Volta). CPU-only node (blue): dual Intel Xeon E5-2690 v4 @ 2.6GHz [3.5GHz Turbo] (Broadwell). GPU nodes (green): same CPUs + Tesla P100 PCIe (16GB) or V100 PS PCIe (16GB) GPUs.
1 Broadwell node: 234.71
1 node + 2x P100 PCIe (16GB) per node: 1972.93
1 node + 2x V100 PS PCIe (16GB) per node: 2761.75
dodecahedron on V100s PS PCIe (axis labeled ns/day)
Running HOOMD-Blue version 2.1.6 (untuned on Volta). CPU-only node (blue): dual Intel Xeon E5-2690 v4 @ 2.6GHz [3.5GHz Turbo] (Broadwell). GPU nodes (green): same CPUs + Tesla P100 PCIe (16GB) or V100 PS PCIe (16GB) GPUs.
1 Broadwell node: 25.84
1 node + 2x P100 PCIe (16GB) per node: 121.49
1 node + 4x P100 PCIe (16GB) per node: 196.18
1 node + 8x P100 PCIe (16GB) per node: 226.39
1 node + 2x V100 PS PCIe (16GB) per node: 172.28
1 node + 4x V100 PS PCIe (16GB) per node: 277.85
1 node + 8x V100 PS PCIe (16GB) per node: 293.25
hexagon on V100s PS PCIe (axis labeled ns/day)
Running HOOMD-Blue version 2.1.6 (untuned on Volta). CPU-only node (blue): dual Intel Xeon E5-2690 v4 @ 2.6GHz [3.5GHz Turbo] (Broadwell). GPU nodes (green): same CPUs + Tesla P100 PCIe (16GB) or V100 PS PCIe (16GB) GPUs.
1 Broadwell node: 6.33
1 node + 2x P100 PCIe (16GB) per node: 30.30
1 node + 4x P100 PCIe (16GB) per node: 55.15
1 node + 8x P100 PCIe (16GB) per node: 102.16
1 node + 2x V100 PS PCIe (16GB) per node: 37.30
1 node + 4x V100 PS PCIe (16GB) per node: 69.55
1 node + 8x V100 PS PCIe (16GB) per node: 126.70
HOOMD-Blue 2.1.6 October 2017
lj-liquid on V100s PCIe (Average TPS)
Running HOOMD-Blue version 2.1.6 (untuned on Volta). CPU-only node (blue): dual Intel Xeon E5-2690 v4 @ 2.6GHz [3.5GHz Turbo] (Broadwell). GPU nodes (green): same CPUs + Tesla V100 PCIe (16GB) GPUs.
1 Broadwell node: 238.47
1 node + 2x V100 PCIe (16GB) per node: 3890.73 (16.3X)
lj-liquid on V100s SXM2 (Average TPS)
Running HOOMD-Blue version 2.1.6 (untuned on Volta). CPU-only node (blue): dual Intel Xeon E5-2698 v4 @ 2.2GHz [3.6GHz Turbo] (Broadwell). GPU nodes (green): same CPUs + Tesla V100 SXM2 (16GB) GPUs.
1 Broadwell node: 238.47
1 node + 2x V100 SXM2 (16GB) per node: 4285.59 (18.0X)
1 node + 4x V100 SXM2 (16GB) per node: 4435.12 (18.6X)
microsphere on V100 PCIe (Average TPS)
Running HOOMD-Blue version 2.1.6 (untuned on Volta). CPU-only node (blue): dual Intel Xeon E5-2690 v4 @ 2.6GHz [3.5GHz Turbo] (Broadwell). GPU nodes (green): same CPUs + Tesla V100 PCIe (16GB) GPUs.
1 Broadwell node: 9.79
1 node + 2x V100 PCIe (16GB) per node: 298.15 (30.5X)
1 node + 4x V100 PCIe (16GB) per node: 371.06 (37.9X)
1 node + 8x V100 PCIe (16GB) per node: 466.88 (47.7X)
microsphere on V100s SXM2 (Average TPS)
Running HOOMD-Blue version 2.1.6 (untuned on Volta). CPU-only node (blue): dual Intel Xeon E5-2698 v4 @ 2.2GHz [3.6GHz Turbo] (Broadwell). GPU nodes (green): same CPUs + Tesla V100 SXM2 (16GB) GPUs.
1 Broadwell node: 9.79
1 node + 2x V100 SXM2 (16GB) per node: 329.43 (33.6X)
1 node + 4x V100 SXM2 (16GB) per node: 506.09 (51.7X)
1 node + 8x V100 SXM2 (16GB) per node: 688.99 (70.4X)
quasicrystal on V100s PCIe (Average TPS)
Running HOOMD-Blue version 2.1.6 (untuned on Volta). CPU-only node (blue): dual Intel Xeon E5-2690 v4 @ 2.6GHz [3.5GHz Turbo] (Broadwell). GPU nodes (green): same CPUs + Tesla V100 PCIe (16GB) GPUs.
1 Broadwell node: 52.82
1 node + 2x V100 PCIe (16GB) per node: 2371.16 (44.9X)
1 node + 4x V100 PCIe (16GB) per node: 2530.74 (47.9X)
quasicrystal on V100s SXM2 (Average TPS)
Running HOOMD-Blue version 2.1.6 (untuned on Volta). CPU-only node (blue): dual Intel Xeon E5-2698 v4 @ 2.2GHz [3.6GHz Turbo] (Broadwell). GPU nodes (green): same CPUs + Tesla V100 SXM2 (16GB) GPUs.
1 Broadwell node: 52.82
1 node + 2x V100 SXM2 (16GB) per node: 2546.38 (48.2X)
1 node + 4x V100 SXM2 (16GB) per node: 3015.42 (57.1X)
triblock-copolymer on V100s PCIe (Average TPS)
Running HOOMD-Blue version 2.1.6 (untuned on Volta). CPU-only node (blue): dual Intel Xeon E5-2690 v4 @ 2.6GHz [3.5GHz Turbo] (Broadwell). GPU nodes (green): same CPUs + Tesla V100 PCIe (16GB) GPUs.
1 Broadwell node: 234.71
1 node + 2x V100 PCIe (16GB) per node: 2761.75 (11.8X)
triblock-copolymer on V100s SXM2 (Average TPS)
Running HOOMD-Blue version 2.1.6 (untuned on Volta). CPU-only node (blue): dual Intel Xeon E5-2698 v4 @ 2.2GHz [3.6GHz Turbo] (Broadwell). GPU nodes (green): same CPUs + Tesla V100 SXM2 (16GB) GPUs.
1 Broadwell node: 234.71
1 node + 2x V100 SXM2 (16GB) per node: 2958.60 (12.6X)
1 node + 4x V100 SXM2 (16GB) per node: 3188.84 (13.6X)
dodecahedron on V100s PCIe (Average TPS)
Running HOOMD-Blue version 2.1.6 (untuned on Volta). CPU-only node (blue): dual Intel Xeon E5-2690 v4 @ 2.6GHz [3.5GHz Turbo] (Broadwell). GPU nodes (green): same CPUs + Tesla V100 PCIe (16GB) GPUs.
1 Broadwell node: 25.84
1 node + 2x V100 PCIe (16GB) per node: 172.28 (6.7X)
1 node + 4x V100 PCIe (16GB) per node: 277.85 (10.8X)
1 node + 8x V100 PCIe (16GB) per node: 293.25 (11.3X)
dodecahedron on V100s SXM2 (Average TPS)
Running HOOMD-Blue version 2.1.6 (untuned on Volta). CPU-only node (blue): dual Intel Xeon E5-2698 v4 @ 2.2GHz [3.6GHz Turbo] (Broadwell). GPU nodes (green): same CPUs + Tesla V100 SXM2 (16GB) GPUs.
1 Broadwell node: 25.84
1 node + 2x V100 SXM2 (16GB) per node: 179.94 (7.0X)
1 node + 4x V100 SXM2 (16GB) per node: 309.65 (12.0X)
1 node + 8x V100 SXM2 (16GB) per node: 317.00 (12.3X)
hexagon on V100s PCIe (Average TPS)
Running HOOMD-Blue version 2.1.6 (untuned on Volta). CPU-only node (blue): dual Intel Xeon E5-2690 v4 @ 2.6GHz [3.5GHz Turbo] (Broadwell). GPU nodes (green): same CPUs + Tesla V100 PCIe (16GB) GPUs.
1 Broadwell node: 6.33
1 node + 2x V100 PCIe (16GB) per node: 37.30 (5.9X)
1 node + 4x V100 PCIe (16GB) per node: 69.55 (11.0X)
1 node + 8x V100 PCIe (16GB) per node: 126.70 (20.0X)
hexagon on V100s SXM2 (Average TPS)
Running HOOMD-Blue version 2.1.6 (untuned on Volta). CPU-only node (blue): dual Intel Xeon E5-2698 v4 @ 2.2GHz [3.6GHz Turbo] (Broadwell). GPU nodes (green): same CPUs + Tesla V100 SXM2 (16GB) GPUs.
1 Broadwell node: 6.33
1 node + 2x V100 SXM2 (16GB) per node: 38.70 (6.1X)
1 node + 4x V100 SXM2 (16GB) per node: 69.08 (10.9X)
1 node + 8x V100 SXM2 (16GB) per node: 119.50 (18.9X)
LAMMPS 2017 October 2017
Atomic-Fluid Lennard-Jones 2.5 Cutoff on V100s PCIe (2,048,000 atoms; 1/seconds)
Running LAMMPS version 2017 (untuned on Volta). CPU-only node (blue): dual Intel Xeon E5-2690 v4 @ 2.6GHz [3.5GHz Turbo] (Broadwell). GPU nodes (green): same CPUs + Tesla V100 PCIe (16GB) GPUs.
1 Broadwell node: 0.25
1 node + 2x V100 PCIe (16GB) per node: 0.73 (3.0X)
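The LAMMPS slides plot "1/seconds" rather than a throughput such as ns/day. Assuming that metric is simply the reciprocal of the benchmark's wall-clock time (the slides do not define it further), the sketch below converts the two values from the Lennard-Jones 2.5 cutoff slide back into seconds; the function name is illustrative.

```python
def seconds_from_rate(rate_per_second: float) -> float:
    """Convert a 1/seconds score back into elapsed seconds."""
    if rate_per_second <= 0:
        raise ValueError("rate must be positive")
    return 1.0 / rate_per_second


# Values from the Lennard-Jones 2.5 cutoff slide (V100 PCIe).
cpu_seconds = seconds_from_rate(0.25)   # CPU-only Broadwell node
gpu_seconds = seconds_from_rate(0.73)   # 1 node + 2x V100 PCIe
print(f"CPU: {cpu_seconds:.2f} s, GPU: {gpu_seconds:.2f} s")
```

Under this reading, a higher bar still means a faster run, which matches the speedup labels on the slides.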
Atomic-Fluid Lennard-Jones 2.5 Cutoff on V100s SXM2 (2,048,000 atoms; 1/seconds)
Running LAMMPS version 2017 (untuned on Volta). CPU-only node (blue): dual Intel Xeon E5-2698 v4 @ 2.2GHz [3.6GHz Turbo] (Broadwell). GPU nodes (green): same CPUs + Tesla V100 SXM2 (16GB) GPUs.
1 Broadwell node: 0.25
1 node + 2x V100 SXM2 (16GB) per node: 0.82 (3.3X)
Atomic-Fluid Lennard-Jones 5.0 Cutoff on V100s PCIe (2,048,000 atoms; 1/seconds)
Running LAMMPS version 2017 (untuned on Volta). CPU-only node (blue): dual Intel Xeon E5-2690 v4 @ 2.6GHz [3.5GHz Turbo] (Broadwell). GPU nodes (green): same CPUs + Tesla V100 PCIe (16GB) GPUs.
1 Broadwell node: 0.06
1 node + 2x V100 PCIe (16GB) per node: 0.45 (7.5X)
1 node + 4x V100 PCIe (16GB) per node: 0.47 (7.8X)
1 node + 8x V100 PCIe (16GB) per node: 0.60 (10.0X)
Atomic-Fluid Lennard-Jones 5.0 Cutoff on V100s SXM2 (2,048,000 atoms; 1/seconds)
Running LAMMPS version 2017 (untuned on Volta). CPU-only node (blue): dual Intel Xeon E5-2698 v4 @ 2.2GHz [3.6GHz Turbo] (Broadwell). GPU nodes (green): same CPUs + Tesla V100 SXM2 (16GB) GPUs.
1 Broadwell node: 0.06
1 node + 2x V100 SXM2 (16GB) per node: 0.48 (8.0X)
1 node + 4x V100 SXM2 (16GB) per node: 0.55 (9.2X)
1 node + 8x V100 SXM2 (16GB) per node: 0.56 (9.3X)
Coarse-grain Water on V100s PCIe (2,048,000 atoms; 1/seconds)
Running LAMMPS version 2017 (untuned on Volta). CPU-only node (blue): dual Intel Xeon E5-2690 v4 @ 2.6GHz [3.5GHz Turbo] (Broadwell). GPU nodes (green): same CPUs + Tesla V100 PCIe (16GB) GPUs.
1 Broadwell node: 0.003
1 node + 2x V100 PCIe (16GB) per node: 0.007 (2.3X)
1 node + 4x V100 PCIe (16GB) per node: 0.011 (3.7X)
1 node + 8x V100 PCIe (16GB) per node: 0.016 (5.3X)