
Advance/NeuralMD Documentation

[Advance/NeuralMD Pro] Benchmarks on a Machine with 8 NVIDIA H200 GPUs

We benchmarked molecular dynamics calculations using LAMMPS with a neural network potential on a machine equipped with 8 GPUs (NVIDIA H200).

The target systems were the same as in previous benchmarks: 21,600-atom and 98,000-atom supercell models of the sulfide lithium-ion conductor Li10GeP2S12.

Computational Environment and MD Calculation Conditions

The specifications of the computer used in this case study are shown below.

  • CPU: Intel Xeon Platinum 8480+ (56 cores) ×2
  • GPU: NVIDIA H200 ×8
  • CUDA: 12.4

The computational environment was created using the GPU cloud service "GPUSOROBAN" with the cooperation of HIGHRESO Co., Ltd.

Using LAMMPS 2Aug2023 (AdvanceSoft-modified version, bundled with Advance/NanoLabo Tool), we ran molecular dynamics calculations for 21,600-atom and 98,000-atom systems of Li10GeP2S12 with a force field created by NeuralMD. The pre-trained force field files, including the one used in this study, are available in the Force Field Database.

We ran 100 MD steps in the NVT ensemble at 500 K with a time step of 0.5 fs. From the measured loop time, we also estimated how long a comparable 1 ns calculation would take, reported as simulation throughput in ns/day.
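The extrapolation follows from the run length and the time step: 100 steps at 0.5 fs cover 50 fs of simulated time, so the throughput is that span divided by the measured loop time and scaled to one day. A minimal Python sketch of this arithmetic is shown here; the function names `ns_per_day` and `days_per_ns` are illustrative and are not part of LAMMPS or NeuralMD.

```python
SECONDS_PER_DAY = 86_400
FS_PER_NS = 1_000_000


def ns_per_day(loop_time_s, n_steps=100, timestep_fs=0.5):
    """Simulated nanoseconds per wall-clock day, extrapolated from a short run."""
    simulated_ns = n_steps * timestep_fs / FS_PER_NS
    return simulated_ns * SECONDS_PER_DAY / loop_time_s


def days_per_ns(loop_time_s, n_steps=100, timestep_fs=0.5):
    """Wall-clock days needed to simulate 1 ns at the measured speed."""
    return 1.0 / ns_per_day(loop_time_s, n_steps, timestep_fs)


if __name__ == "__main__":
    # CPU-only loop time for the 21,600-atom system from the results table.
    loop_time_s = 6.78
    print(f"{ns_per_day(loop_time_s):.2f} ns/day")
    print(f"{days_per_ns(loop_time_s):.2f} days per ns")
```

Because only 100 steps are timed, the estimate assumes the per-step cost stays constant over a long run and ignores startup and output overhead.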

Benchmark Results

The calculation conditions and results are shown in the table below. Calculations were performed under five conditions: CPU only, and with 1, 2, 4, and 8 GPU devices. For the GPU runs, the number of MPI processes was set to 4 per GPU device.

| Condition | CPU | GPU×1 | GPU×2 | GPU×4 | GPU×8 |
|---|---|---|---|---|---|
| Number of MPI processes | 56 | 4 | 8 | 16 | 32 |
| Number of OpenMP threads | 1 | 2 | 2 | 2 | 2 |
| Number of GPU devices | 0 | 1 | 2 | 4 | 8 |
| Calculation time (loop time, s), 21,600 atoms | 6.78 | 3.34 | 1.86 | 1.03 | 0.72 |
| Calculation time (loop time, s), 98,000 atoms | 27.92 | 15.07 | 7.65 | 4.11 | 2.43 |
| ns/day, 21,600 atoms | 0.64 | 1.29 | 2.32 | 4.20 | 6.04 |
| ns/day, 98,000 atoms | 0.16 | 0.29 | 0.57 | 1.05 | 1.78 |

The figure below shows the relative calculation speed, with the CPU-only calculation speed set to 1. Based on the loop times in the table, the speedup was about 2 times with 1 GPU device and about 9 to 12 times with 8 devices (about 9.4 times for 21,600 atoms and about 11.5 times for 98,000 atoms).
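The relative speeds can be recomputed from the loop times in the results table by dividing the CPU-only loop time by the loop time of each GPU run. The following Python sketch does this; the dictionary layout and the function name `relative_speed` are illustrative choices, and the values are the rounded figures from the table, so the results may differ slightly from the plotted ones.

```python
# Loop times in seconds, copied from the results table.
LOOP_TIME_S = {
    "21,600 atoms": {"CPU": 6.78, 1: 3.34, 2: 1.86, 4: 1.03, 8: 0.72},
    "98,000 atoms": {"CPU": 27.92, 1: 15.07, 2: 7.65, 4: 4.11, 8: 2.43},
}


def relative_speed(times):
    """Speed of each GPU run relative to the CPU-only run (CPU = 1)."""
    cpu_time = times["CPU"]
    return {n_gpu: cpu_time / t for n_gpu, t in times.items() if n_gpu != "CPU"}


if __name__ == "__main__":
    for system, times in LOOP_TIME_S.items():
        speeds = relative_speed(times)
        for n_gpu in sorted(speeds):
            print(f"{system}, GPU x{n_gpu}: {speeds[n_gpu]:.1f}x the CPU-only speed")
```

A shorter loop time for the same 100 steps means a proportionally higher speed, which is why the ratio is taken as CPU time over GPU time.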

Comparison with A100

For comparison, we also ran the same calculations on a single NVIDIA A100 80GB device and computed its relative calculation speed on the same basis.

While the A100 reaches a calculation speed comparable to that of the 56-core CPU, the H200 is steadily faster, which shows that NeuralMD can exploit the performance of the newer GPU generation. A system equipped with multiple new-generation GPUs is therefore highly effective for molecular dynamics calculations with NeuralMD and makes larger and more complex simulations practical.

Related Pages