Bug #626880

HALMD does not start with a huge number of particles

Added by Felix Höfling over 1 year ago. Updated 10 months ago.

The construction of mdsim::gpu::particle fails if the particle number exceeds 8388480 = 65535 * 128, i.e., the maximum number of blocks times a block size of 128 threads. These compute dimensions are calculated upon member initialisation. While it may be sufficient to keep the limit of 65535 blocks, the block size should be allowed to grow up to its maximum value (1024). On the other hand, a certain number of blocks is required for good device occupancy. (How many? The number of SMs times a small factor?) So fixing the block size to its maximum would be counterproductive.
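
To make the limit concrete, here is a minimal sketch of the arithmetic, assuming a one-dimensional grid with one thread per particle. The function names max_particles and blocks_needed are hypothetical and not taken from the HALMD sources.

```cpp
#include <cassert>
#include <cstdint>

// Largest particle number that fits into a one-dimensional grid,
// assuming one thread per particle (assumption for illustration).
constexpr std::uint64_t max_particles(std::uint64_t max_blocks, std::uint64_t block_size)
{
    return max_blocks * block_size;
}

// Number of blocks needed to cover `particles` with `block_size` threads each
// (integer division, rounded up).
inline std::uint64_t blocks_needed(std::uint64_t particles, std::uint64_t block_size)
{
    assert(block_size > 0);
    return (particles + block_size - 1) / block_size;
}

// The limit quoted in this report: 65535 blocks of 128 threads.
static_assert(max_particles(65535, 128) == 8388480, "limit with 128 threads per block");
// With the maximum block size of 1024 threads the limit is 8 times larger.
static_assert(max_particles(65535, 1024) == 67107840, "limit with 1024 threads per block");
```

With 128 threads per block, a single particle beyond the limit needs 65536 blocks, whereas with 1024 threads per block the same system fits into 8192 blocks.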

1) device::validate() did not detect the mismatch.
2) For a given particle number, we may start with the maximum block size and lower it until the number of blocks is sufficiently large. The challenge is to determine the number of SMs of a given device.
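
The heuristic in item 2 could be sketched as follows. This is an illustration under stated assumptions, not the fix that was eventually committed: choose_launch_config and launch_config are hypothetical names, the factor of 4 blocks per SM is a guess at the "small factor", and the block size is halved down to an assumed minimum of 32 threads (one warp). In the CUDA runtime API the number of SMs is available as cudaDeviceProp::multiProcessorCount after a call to cudaGetDeviceProperties(); here it is passed in as a plain argument so that the sketch compiles without CUDA.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical result type: grid size and block size of a 1D kernel launch.
struct launch_config
{
    std::uint64_t blocks;   // number of blocks; 0 if no valid configuration exists
    unsigned int threads;   // threads per block; 0 if no valid configuration exists
};

// Hypothetical helper, not part of HALMD: start with the maximum block size
// and halve it until the grid has at least num_sms * blocks_per_sm blocks.
inline launch_config choose_launch_config(
    std::uint64_t particles
  , unsigned int num_sms                  // e.g. cudaDeviceProp::multiProcessorCount
  , unsigned int blocks_per_sm = 4        // assumed "small factor"
  , unsigned int max_threads = 1024       // maximum block size
  , unsigned int min_threads = 32         // assumed lower bound (one warp)
  , std::uint64_t max_blocks = 65535      // grid limit quoted in this report
)
{
    assert(num_sms > 0 && min_threads > 0 && min_threads <= max_threads);
    std::uint64_t const wanted_blocks = std::uint64_t(num_sms) * blocks_per_sm;

    unsigned int threads = max_threads;
    std::uint64_t blocks = (particles + threads - 1) / threads;

    // Halve the block size while the grid is too small for good occupancy.
    while (blocks < wanted_blocks && threads / 2 >= min_threads) {
        unsigned int const half = threads / 2;
        std::uint64_t const next = (particles + half - 1) / half;
        if (next > max_blocks) {
            break;  // halving further would exceed the grid limit
        }
        threads = half;
        blocks = next;
    }

    // Even the maximum block size does not fit: report failure.
    if (blocks > max_blocks) {
        return {0, 0};
    }
    return {blocks, threads};
}
```

For the 8 SMs of the GTX 960, choose_launch_config(8388481, 8) keeps the maximum block size of 1024 threads, since 8192 blocks are already more than enough, while choose_launch_config(10000, 8) lowers the block size to 256 threads to obtain 40 blocks. A result of {0, 0} signals that the particle number cannot be covered at all, which is the kind of mismatch a validation step could report.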

The deviceQuery tool from the SDK reports it (8 SMs for the GTX 960):

Device 0: "GeForce GTX 960" 
  CUDA Driver Version / Runtime Version          8.0 / 7.5
  CUDA Capability Major/Minor version number:    5.2
  Total amount of global memory:                 4038 MBytes (4233691136 bytes)
  ( 8) Multiprocessors, (128) CUDA Cores/MP:     1024 CUDA Cores

Perhaps a useful link:


#1 Updated by Felix Höfling over 1 year ago

  • Assignee set to Daniel Kirchner

#2 Updated by Felix Höfling 10 months ago

  • Status changed from New to Closed

The issue was addressed by commits 5fcece3, 4676901 and others.
