
[Advance/NeuralMD Pro] Benchmarks on HPC#

We benchmarked molecular dynamics calculations using a neural network potential via LAMMPS. The system is the same one used in the earlier benchmark on Mat3ra: a supercell model of the sulfide-type lithium-ion conductor Li10GeP2S12, containing 21600 atoms.

Calculation Environment#

  • CPU: AMD EPYC 7742 64-Core
  • GPU: NVIDIA A100 SXM4

HPC Systems cooperated in the preparation and use of the computing environment.

Additionally, we used GCC as the compiler and OpenBLAS as the linear algebra library, because Intel's compilers and libraries are not well suited to the AMD CPUs in the machine used for this benchmark.

Results of Benchmarking#

The conditions and results of the calculations are shown below. Calculations were run under five conditions in total: CPU only, and with 1, 2, 4, and 8 GPUs. In the GPU runs, the number of MPI processes was set so that 4 MPI processes are started per GPU.

Condition                 CPU     GPU x 1   GPU x 2   GPU x 4   GPU x 8
MPI processes             64      4         8         16        32
OpenMP threads            1       2         2         2         2
GPU devices               -       1         2         4         8
Calculation time (sec)    56.4    16.0      8.3       4.4       2.5


The figure below shows the calculation speed relative to the CPU-only case, which is defined as 1. The calculation was 3.5 times faster with 1 GPU and 22.9 times faster with 8 GPUs.
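
The relative speeds can be recomputed from the calculation times in the results table. The following is a minimal Python sketch; the names `times_sec` and `relative_speed` are illustrative and are not part of any tool used in the benchmark. Because the tabulated times are rounded to one decimal place, the 8-GPU ratio computed this way comes out slightly lower (about 22.6) than the quoted 22.9.

```python
# Calculation times in seconds, copied from the results table.
times_sec = {
    "CPU": 56.4,
    "GPU x 1": 16.0,
    "GPU x 2": 8.3,
    "GPU x 4": 4.4,
    "GPU x 8": 2.5,
}


def relative_speed(times, baseline="CPU"):
    """Return the speed of each condition relative to the baseline (baseline = 1)."""
    base = times[baseline]
    return {name: base / t for name, t in times.items()}


speedups = relative_speed(times_sec)
for name, s in speedups.items():
    print(f"{name:8s} {s:5.1f}x")
```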

Comparison with the Cloud Environment#

The figure at lower left shows the relative calculation speeds from the Mat3ra benchmark and from this benchmark, computed on the same basis as in the figure above. On Mat3ra, Intel Xeon Platinum series CPUs and NVIDIA V100 or NVIDIA P100 GPUs were used. For other conditions, please refer to the corresponding page.

Comparing the CPU-only cases, AMD EPYC is about 3 times faster than Intel Xeon Platinum. With 8 GPUs, the AMD EPYC system was more than 1.5 times faster than the comparable Intel Xeon Platinum configuration. Compared with the Intel Xeon Platinum CPU-only case, the 8-GPU AMD EPYC case is more than 60 times faster.

Additionally, the figure at lower right shows the calculation speed against the number of GPUs. In all environments, the calculation speed appeared to increase in proportion to the number of GPUs.
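
One way to quantify how closely the speed follows the number of GPUs is the parallel efficiency relative to the 1-GPU run: the speedup over 1 GPU divided by the number of GPUs, where 1.0 would mean perfectly proportional scaling. The sketch below applies this only to the HPC timings from the results table on this page; the Mat3ra timings are not reproduced here, so they are omitted. The names `gpu_times_sec` and `scaling_efficiency` are illustrative.

```python
# Calculation times in seconds for the GPU runs, keyed by GPU count
# (values copied from the results table on this page).
gpu_times_sec = {1: 16.0, 2: 8.3, 4: 4.4, 8: 2.5}


def scaling_efficiency(times):
    """Efficiency = (t_1 / t_n) / n, measured against the 1-GPU run."""
    t1 = times[1]
    return {n: (t1 / t) / n for n, t in times.items()}


efficiency = scaling_efficiency(gpu_times_sec)
for n, e in sorted(efficiency.items()):
    print(f"{n} GPU(s): efficiency {e:.2f}")
```

With the rounded table values, this gives roughly 0.96, 0.91, and 0.80 for 2, 4, and 8 GPUs respectively.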

Related Pages#