Reduce large floating point accumulation error in high photon simulations #41
MCX has used atomic operations for fluence accumulation by default for several years. However, a drop in fluence intensity has been observed in large photon simulations. For example, running the script below with the current MCX GitHub code produces the plot below.
The drop in intensity is not caused by data races, as was the case when non-atomic operations were used, but by accumulated round-off error. In the region near the source, the energy deposited in a voxel quickly grows to a large value. When a new energy deposit (a very small value) is added on top of that large value, accuracy becomes a problem.
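To illustrate the effect (a standalone snippet, not MCX code), adding a typical small energy deposit to a single-precision accumulator that has already grown large can leave the accumulator unchanged:

```cuda
// Standalone illustration: a small deposit added to a large float accumulator
// is rounded away entirely, while the same deposit is retained by a small one.
#include <stdio.h>

int main(void) {
    float deposit   = 1e-4f;      /* a typical small per-step energy deposit */
    float small_acc = 10.0f;      /* voxel far from the source               */
    float large_acc = 1e5f;       /* voxel near the source                   */

    printf("%g\n", small_acc + deposit - small_acc);  /* ~1e-4: deposit kept */
    printf("%g\n", large_acc + deposit - large_acc);  /* 0: deposit lost     */
    return 0;
}
```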
This is a serious problem because, as GPU computing capacity increases, most users will choose to run large photon simulations. We must be able to run large photon counts without losing accuracy.
There are a few possible solutions to this problem.
The easiest solution is to change the energy storage to double precision. However, consumer GPUs have extremely poor double-precision performance, so switching to double-precision addition is likely to cause a drop in speed.
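For what it's worth, GPUs with compute capability 6.0 or newer provide a native double-precision `atomicAdd`; on older devices it is typically emulated with the compare-and-swap loop from the CUDA C Programming Guide, sketched below. Neither option avoids the double-precision throughput penalty on consumer cards:

```cuda
// Double-precision atomicAdd emulated with 64-bit atomicCAS, as described in
// the CUDA C Programming Guide, for devices older than compute capability 6.0.
__device__ double atomicAddDouble(double* address, double val) {
    unsigned long long int* addr_as_ull = (unsigned long long int*)address;
    unsigned long long int old = *addr_as_ull, assumed;
    do {
        assumed = old;
        old = atomicCAS(addr_as_ull, assumed,
                        __double_as_longlong(val + __longlong_as_double(assumed)));
    } while (assumed != old);     /* retry if another thread updated the value */
    return __longlong_as_double(old);
}
```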
The standard way to add small values to a large floating-point sum is Kahan summation, which is what we use in MMC. However, it requires multi-step operations and additional storage. When combined with atomic operations, an atomic Kahan summation is very difficult to implement on the GPU.
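For reference, a minimal non-atomic Kahan update is sketched below (assuming the accumulator and its compensation term are stored side by side); the difficulty is that both values must be read and updated together atomically, which a single 32-bit `atomicAdd` cannot do:

```cuda
// Kahan (compensated) summation: the compensation term c captures the
// low-order bits lost when a small deposit is added to a large running sum.
// Must be compiled without fast-math, or the compensation step is optimized away.
__host__ __device__ void kahan_add(float* sum, float* c, float deposit) {
    float y = deposit - *c;   /* re-apply the bits lost in previous additions */
    float t = *sum + y;       /* big + small: low-order bits of y are dropped */
    *c = (t - *sum) - y;      /* recover exactly what was dropped             */
    *sum = t;                 /* new running sum                              */
}
```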
Another idea is to use repetitions (-r) to split a large simulation into smaller chunks and sum the solutions together. For example, for 1e9 photons with 10 respins, we run ten 1e8-photon simulations. This can reduce the round-off error, but repeatedly launching the kernel introduces a large overhead, sometimes significantly higher than the kernel execution time itself. In addition, even at 1e8 photons, the drop in intensity in the plot above remains noticeable.
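The numerical benefit of chunking can be seen in the standalone sketch below (not MCX code): each chunk's partial sum stays small enough for single precision, and only the few chunk totals are added to the large running total:

```cuda
// Standalone illustration of why respin/chunking reduces round-off error:
// the naive float sum stalls once it grows large, while per-chunk partial
// sums remain small and are combined with only a handful of large additions.
#include <stdio.h>

int main(void) {
    const long  n = 100000000;    /* 1e8 small deposits             */
    const float w = 1e-4f;        /* each deposit is 1e-4           */
    const int   nchunks = 10;

    float naive = 0.0f;
    for (long i = 0; i < n; i++)
        naive += w;               /* stalls once the sum is large   */

    float chunked = 0.0f;
    for (int r = 0; r < nchunks; r++) {
        float partial = 0.0f;     /* restart accumulation per chunk */
        for (long i = 0; i < n / nchunks; i++)
            partial += w;
        chunked += partial;       /* only nchunks large additions   */
    }
    printf("naive: %g  chunked: %g  expected: %g\n",
           naive, chunked, (double)n * 1e-4);
    return 0;
}
```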
A robust method is needed to obtain a stable and convergent solution, especially at large photon numbers.