Aerodynamics of a cow
Dr. Moritz Lehmann

Published on Aug 19, 2022

This video may not change your life, but my FluidX3D software will, if you do research in CFD. On the same GPU(s) it's 100-2000x faster than expensive commercial FVM solvers. The entire source code is on GitHub: https://github.com/ProjectPhysX/FluidX3D

This 10 s video shows 10 s of real time at 1 m/s wind speed: 476×952×476 #LBM grid (215 million voxels), 28k time steps, 23 minutes for compute+rendering on my PC with a Titan Xp GPU.

How is it possible to squeeze 215 million grid points into only 12 GB?
I'm using two techniques here, which together form the holy grail of lattice Boltzmann, cutting memory demand down to only 55 bytes/node for D3Q19 LBM, or about 1/6 of conventional LBM codes (rough numbers after the list below):

1. In-place streaming with Esoteric-Pull. This almost cuts memory demand in half and slightly increases performance due to implicit bounce-back boundaries.
Paper: https://doi.org/10.3390/computation10...

2. Decoupled arithmetic precision (FP32) and memory precision (FP16): all arithmetic is done in FP32, but the LBM density distribution functions in memory are compressed to FP16. This almost cuts memory demand in half and almost doubles performance, without impacting overall accuracy for most setups (a simplified kernel sketch follows below the list).
Paper: https://www.researchgate.net/publicat...
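Rough back-of-the-envelope numbers (my own estimate from the figures above, not an official breakdown): with a single in-place DDF copy stored in FP16, one D3Q19 node needs about 19×2 B (DDFs) + 4 B (density) + 3×4 B (velocity) + 1 B (flags) = 55 B, and 215.7 million nodes × 55 B ≈ 11.9 GB, which just fits the Titan Xp's 12 GB. A conventional double-precision code with two DDF copies needs roughly 2×19×8 B + 8 B + 3×8 B ≈ 340 B per node, hence the ~1/6 figure.

The following OpenCL C fragment is a hypothetical, heavily simplified sketch of the FP32/FP16 idea (not the actual FluidX3D kernel, which uses its own compression routines): the DDFs live in global memory as 16-bit halves and are expanded to FP32 registers only for the collision arithmetic.

kernel void collide_sketch(global half* fi, const ulong N) { // fi holds 19*N half-precision DDFs (SoA layout)
	const ulong n = get_global_id(0); // one work-item per lattice node
	if(n>=N) return;
	float f[19]; // FP32 working copy in registers
	for(uint i=0u; i<19u; i++) {
		f[i] = vload_half(i*N+n, fi); // decompress FP16 -> FP32 on load
	}
	// ... collision operator in FP32 goes here (density, velocity, relaxation) ...
	for(uint i=0u; i<19u; i++) {
		vstore_half(f[i], i*N+n, fi); // compress FP32 -> FP16 on store
	}
}

Because of the in-place streaming (technique 1), only this single fi buffer exists; there is no second ping-pong copy, which is where the other factor of ~2 in the breakdown comes from.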

Graphics are done directly in FluidX3D with OpenCL, on the raw simulation data that already resides in ultra-fast video memory. No volumetric data (one frame of the velocity field is 2.5 GB!) ever has to be copied to the CPU or hard drive; only the rendered 1080p frames (8 MB each) are transferred. Once on the CPU side, a copy of the frame is made in memory and a thread is detached to handle the slow .png compression, while the simulation keeps running (a small host-side sketch follows below the paper link).
Paper: https://www.researchgate.net/publicat...
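A hypothetical host-side sketch of this asynchronous frame export (names like write_png() are placeholders, not the actual FluidX3D API; a 1080p RGBA frame is assumed, 1920×1080×4 B ≈ 8 MB):

#include <cstdint>
#include <string>
#include <thread>
#include <vector>

void write_png(const std::string& path, const std::vector<std::uint8_t>& pixels, int w, int h) {
	// stub: plug in any PNG encoder here; this is the slow part that must not block the simulation
}

void export_frame(const std::uint8_t* frame_from_gpu, int w, int h, const std::string& path) {
	// copy the ~8 MB frame so the render buffer can be reused for the next time step
	std::vector<std::uint8_t> frame(frame_from_gpu, frame_from_gpu+(std::size_t)w*h*4);
	// hand the slow compression to a detached thread; the simulation loop continues immediately
	std::thread([frame=std::move(frame), w, h, path]() {
		write_png(path, frame, w, h);
	}).detach();
}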

#CFD #GPU #FluidX3D #OpenCL
