partially update documentation
fangq committed Oct 29, 2019
1 parent a9c4732 commit 4d7d94f
Showing 5 changed files with 216 additions and 49 deletions.
133 changes: 133 additions & 0 deletions ChangeLog.txt
@@ -1,5 +1,138 @@
Change Log

== MMC 1.9 (v2019.10, Moon Cake - alpha), Qianqian Fang <q.fang at neu.edu> ==

2019-10-29 [a9c4732] rename mmcl executable to mmc
2019-10-29*[0c3f19f] Merge branch 'mmcl' to 'master', now mmcl is official!
2019-10-27 [206bedf] fix skinvessel example
2019-10-18 [a3823b8] add the missing -d 1 flag
2019-10-14 [4940761] fix end-of-line markers
2019-10-14 [2139721] fix example file permissions and end-of-line markers
2019-10-12 [373459a] remove commented lines
2019-10-12 [e321340] add initial element search for wide-field sources; update mmcl examples
2019-10-11 [8e6f561] bug fixing; update mmclab examples
2019-10-11*[1a8e81f] bug fixing; add support of surface diffuse reflectance for mmcl
2019-10-07 [3262326] resolve some compiling issues,e.g. missing argument in functions; missing fields in data struct
2019-10-07 [261fe69] manually resolve merge conflicts
2019-09-09 [9a2ad2a] download colin27 mesh from github instead
2019-08-31 [7e8ad7a] fix .mch file header due to wrong history data structure
2019-08-24 [4dc1228] fix memory crash due to wrong output data length for plucker, havel & badouel ray-tracer when basisorder is 0
2019-08-20 [ca4d675] allow photons that exit into 0-label elements to be detected
2019-08-20 [26477c7] add gpu parameter specifier to make RGA happy
2019-07-26*[9e800e0] fix output detected photon information for SSE-MMCL and GPU-MMCL
2019-07-25 [a3b7714] fix maximum time gate rounding bug
2019-07-24 [eb109e0] return detected photon info in mmclabcl,print progress bar
2019-07-18 [7995941] compile on new mac
2019-07-18 [9f50a7f] hacky workaround to avoid convert_float error for -1 returned by vectorized isgreater on Intel GPU
2019-07-16 [ef0ef3f] use mmclab('gpuinfo') to query gpu devices
2019-07-16 [a45f5a1] undo the revert
2019-07-16 [cfe52b4] fix rng bug on mac
2019-07-16 [72ce3a2] fix RNG error for SSE MMC on windows - long is 32bit on windows
2019-07-16 [c242112] long is only 32bit on windows, fix incorrect mmc results
2019-07-15 [c8c1cb9] Merge branch 'master' into mmcx
2019-07-12 [04565c2] compile for mac with static gcc and gomp
2019-07-12 [afcfda1] mac opencl does not accept more than 8 constant inputs
2019-07-12 [57880f1] allow to compile on windows
2019-07-12 [1e5455d] changes to compile on mac
2019-07-12 [3621fc9] make mmcl compile on mac
2019-07-12 [167ec74] output oct file with correct name
2019-07-12 [95f65bf] disable dref demo as mmcl has not fully merged with master
2019-07-08 [9e622d5] fix index issue for branchless ray-tracer 0-basisorder
2019-07-05 [23f1159] merge with master
2019-07-04 [a78760b] fix normalization indexing bug
2019-07-03 [522e21b] add matlab scripts to create plots for the paper, paper ready to submit
2019-07-02 [7779235] change line color
2019-07-02 [9fc9d0a] revert the mua change made yesterday for dmmc, thanks to Shijie
2019-07-01 [5f58a5f] update benchmark 4, correct alignment in benchmark 1
2019-07-01 [d3ebb41] change prefix in mmclab printing
2019-07-01 [a36c408] update run benchmark script
2019-07-01 [e1e567f] update mmcl bench mmclab script
2019-07-01 [f5aeac9] add benchmark scripts for mmcl
2019-07-01 [c1ec5f1] group 1/mua to normalization
2019-06-30 [5a29aa0] fix double summation and oldidx bug in method=elem
2019-06-30*[4d9013f] mmclabcl is working
2019-06-29 [acba8db] save nii for non-grid ray-tracers
2019-06-29 [fe9c83d] add b2 run_mmc script
2019-06-29 [727e89d] change b2 mesh
2019-06-28 [e537021] further update benchmark script
2019-06-28*[5b7e840] benchmark script to run on different host
2019-06-28 [3fb9a9c] fix param priority from command line
2019-06-28 [5325e53] fix cl build error
2019-06-28 [4fd89e3] make dual-mode mmc again, remove unneeded registers
2019-06-28 [ad13d2a] revert code back to 03/28 version
2019-06-28 [fba4b95] update benchmark script
2019-06-27 [420c0d4] b3 test script
2019-06-27 [d2f193f] update script
2019-06-27 [8b7e3ff] save output to bin
2019-06-27 [827b30d] add mmcl benchmark master script
2019-06-27 [d87e999] reduce colin27 photon
2019-06-27 [7f1e3ee] add spherical_shell demo
2019-06-27 [b971de3] add DMMC paper figure 1 mmclab demo script
2019-06-27 [f1876ba] add dmmc example mesh file
2019-06-27 [422a9da] add skinvessel mmc and dmmc example
2019-06-27 [02f208a] add mmc2json script to convert mmclab cfg to json
2019-06-27 [f0ff69b] fix the ray-tracer after dref related changes in the master branch
2019-06-26 [454de4f] update code variant name
2019-06-26*[d81fd9a] dualmode mmc - support both SSE4+cpu (-G -1) and CPU, rename to mmcl
2019-06-26 [4c65a21] merge with the latest master branch
2019-06-26 [90d0d20] fix outputtype=fluence and wp output, fix #36
2019-06-26 [45d3711] make mmc functions compatible with mcx output
2019-05-24 [0494f23] 2nd attempt to fix the reflection when mirror bc is used
2019-05-24 [a7ac195] allow internal reflections when mirror bc is set
2019-05-16*[24e2bb0] use isreflect=2 for total absorption on outer surf, 3 for perfect mirror
2019-04-30 [a2dda44] allow point sources to use initial elements
2019-04-30 [cbdcb1e] avoid initial elem search in cone/arcsine source launch
2019-04-24 [caa0a65] fix output format in both dmmc and mmc mode
2019-04-24 [cb93d6c] remove mac compilation error
2019-04-22 [565ad68] fix bugs and finally get diffuse reflectance output to work
2019-04-21 [5b79f08] save diffuse reflectance on surface
2019-04-21 [44d1865] copy initial test from plucker to other 3 ray tracers
2019-04-21 [ed8dbdd] saving dref on surface, support saveref option, feature incomplete
2019-04-17 [f3623ff] minor update to function parameter type
2019-04-16 [492aa3e] restore the capability to save mch files
2019-04-02 [6407e74] update loadmch to support user defined output
2019-03-28 [5312319] use DO_NOT_SAVE flag to remove memory operations
2019-03-26 [22fe07f] remove unused variables
2019-03-26 [eeccd6d] fix a bug found by Shijie
2019-03-25 [fb51a41] fix the missing energy loss for the first step in new voxel
2019-03-24 [78622ef] merge with master from github
2019-03-24 [2508642] first step to make mmc cl kernel cuda compatible
2019-03-23 [0e0f96f] fix bug in writing compression
2019-03-23 [752753d] use atomic call to return raytet counts
2019-03-23 [3676918] disable buffer fanning in the kernel
2019-03-23 [3f52cb2] only write to memory when moving out of a voxel
2019-03-23 [370e91d] use macro to dynamically select dmmc vs elem-mmc
2019-03-21 [0b647e7] fix export data length
2019-03-21 [109910e] use a better random number to distribute the writing location
2019-03-21 [1b9d645] use --buffer to set copy of memory to reduce racing
2019-03-21 [2f2aec3] matching the branchless badouel algorithm in mmc, thanks to Shijie
2019-03-21 [0477ade] avoid mac compiler error
2019-03-21 [9f5eaf9] fix a critical bug for dmmc ray-tracer
2019-03-21 [c6c0721] disable volume saving if --save2pt is set to 0
2019-03-21 [bc67cc8] fix incorrect results on AMD devices
2019-03-21 [4ec96ed] update printed program name
2019-03-20 [5dd80a1] no need to convert char lookup in string
2019-03-20 [cfaf43b] merge with mmc v2019.3 master branch
2019-03-20 [4efb7b7] fix OpenCL-precision-induced ray-tracing accuracy issue in Branchless-badouel ray-tracer
2019-03-05 [ae6de41] update change log and README for v2019.3 release
2019-03-05 [10f72ea] support mc2 and nii output for DMMC
2019-03-04 [f403303] disable linking with iomp5 to avoid crash in older matlab
2019-03-01 [996f765] add USC 19.5 atlas example, Fig9a in TranYan2019(submitted)
2019-02-11 [8acc8d7] really reduce register count, fix DMMC output crash
2019-02-10*[61ef773] fix dmmc, 5x speed increase from normal mmc
2019-02-10 [2417416] fix infinite loop, thanks Shijie!
2019-02-10 [5ef4499] return total ray-tet intersection counts
2019-02-09 [7fab7dd] moving node,elem,type,facenb,normal,srcelem to constant mem
2019-02-09 [ec0e183] optimized based on vtune profiling on intel cpu
2019-02-09 [84e87e4] add xorshift128+ RNG, seed each thread by host RNG
2019-02-06 [fe76503] convert output weight to double
2019-01-31*[5c571cf] now can run on cuda and cpu
2019-01-30 [29ae3e1] need debugging, but very close to bug free for the ray-tracing
2019-01-27 [75fbcde] mmcx now can compile, no error
2019-01-18 [53fda23] fix mmclab crash due to racing in multi-thread, similar to mcx issue #60
2019-01-17 [2f61bfe] a very rough draft of the cl kernel, converted ray-tracer from SIMD to float3
2019-01-14*[43aae8f] sync internal mmcx branch with master, mmcx branch was started in 2018

== MMC 1.4.8-2 (v2019.4, Pork Rinds - beta, update 2), Qianqian Fang <q.fang at neu.edu> ==

2019-04-24 [8270b96] fix #35 - incorrect mch file header in photon-sharing implementation
90 changes: 52 additions & 38 deletions README.txt
@@ -1,11 +1,11 @@
===============================================================================
= Mesh-based Monte Carlo (MMC) =
= Multi-threaded Edition with SSE4 =
= Supporting both OpenCL and Multi-threading with SSE4 =
===============================================================================

Author: Qianqian Fang <q.fang at neu.edu>
License: GNU General Public License version 3 (GPL v3), see License.txt
Version: 1.4.8-2 (v2019.4, Pork Rinds - beta, update 2)
Version: 1.9 (v2019.10, Moon Cake - alpha)
URL: http://mcx.space/mmc

-------------------------------------------------------------------------------
@@ -27,40 +27,40 @@ VIII.Reference

O. What's New

In MMC v2019.4 (1.4.8-2), the following features were added
MMC v2019.10 (1.9) is a major update to MMC. For the first time, MMC adds
GPU support via the newly implemented OpenCL version. The released package
simultaneously supports CPU-only multi-threading with SSE4 (standard MMC)
and OpenCL-based MMC on a wide variety of CPU/GPU devices across vendors.
Using up-to-date GPU hardware, the MMC simulation speed was increased by
100x to 400x compared to single-threaded SSE4-based MMC simulation. A detailed
description of the GPU-accelerated MMC can be found in the in-press paper
[Fang2019] listed below and in its online preprint.

* Support -X/--saveref to save diffuse reflectance/transmittance on mesh surface
* Speed up DMMC memory operations
One can choose between the SSE4 and OpenCL based simulation modes using
the -G or cfg.gpuid input options. A device ID of -1 enables SSE4 CPU based
MMC, and a number of 1 or above selects a supported OpenCL device (use
"mmc -L" or "mmclab('gpuinfo')" to list the available devices).

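The device-selection rules above, together with the -G '1101' mask and the -W workload option listed in the command line options, can be sketched in a few lines of Python. This is only an illustration of the documented semantics and is not part of MMC. The function names parse_gpu_option and normalize_workload are hypothetical, and the 1-based numbering of devices in the mask is an assumption.

```python
def parse_gpu_option(value):
    """Interpret a -G / cfg.gpuid value (hypothetical helper, not MMC code).

    Integers are treated as a single device ID; strings made only of
    0/1 characters (length > 1) are treated as a device mask.
    """
    if isinstance(value, str):
        text = value.strip()
        if len(text) > 1 and set(text) <= {"0", "1"}:
            # Mask form, e.g. '1101': 1 enables a device, 0 disables it.
            # Assumption: devices are numbered from 1, left to right.
            devices = [i + 1 for i, flag in enumerate(text) if flag == "1"]
            return {"mode": "opencl", "devices": devices}
        value = int(text)
    if value == -1:
        # -1 selects the SSE4 CPU-based MMC
        return {"mode": "sse4-cpu", "devices": []}
    if value == 0:
        # 0 lets the program pick a device automatically
        return {"mode": "opencl", "devices": "auto"}
    if value >= 1:
        # 1 or above selects one OpenCL device by ID
        return {"mode": "opencl", "devices": [value]}
    raise ValueError("unsupported device ID: %r" % (value,))


def normalize_workload(spec):
    """Turn a -W string such as '50,30,20' into fractions that sum to 1."""
    parts = [float(x) for x in spec.split(",")]
    total = sum(parts)
    if total <= 0:
        raise ValueError("workload values must have a positive sum")
    return [p / total for p in parts]


if __name__ == "__main__":
    print(parse_gpu_option(-1))
    print(parse_gpu_option("1101"))
    print(normalize_workload("50,30,20"))
```

Passing the mask as a string and the device ID as an integer avoids the ambiguity between, say, device 10 and the mask '10'. How the mmc binary itself resolves that case is not stated in this document.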
It also fixed the below critical bugs:
A detailed (long) list of updates can be found in ChangeLog.txt or
the GitHub commit history: https://github.com/fangq/mmc/commits/master

* fix #35 - incorrect mch file header in photon-sharing implementation
* restore the capability to save mch files without needing --saveexit 1
* for Win64, use a newer version of libgomp-1.dll to run mmclab without dependency errors
A few of the most important updates are highlighted below:

* Added GPU support using OpenCL in both the mmc binary and mmclab
* GPU MMC (or MMCL) has been rigorously validated across a range of benchmarks
* Characterized the speed improvement of MMCL simulations over standard MMC
* Created "mmc" and "octave-mmclab" official Fedora packages, disseminated via Fedora repositories
* Implemented an xorshift128+ RNG unit, used as the default for both CPU/GPU MMC
* Fixed a list of bugs in both SSE4/OpenCL MMC
* Created 6 standard benchmarks (B1:cube60, B1D:d-cube60, B2:sphshells, B2D:d-sphshells, B3:colin27, B4:skin-vessel) for comparisons
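
For readers unfamiliar with the xorshift128+ generator mentioned in the list above, the following Python sketch shows the general technique. It is not MMC's implementation: the shift constants (23, 18, 5) are those of the commonly published xorshift128+ variant and are assumed here, and the class and function names are hypothetical. The explicit 64-bit mask stands in for a fixed-width unsigned 64-bit integer type; the ChangeLog notes RNG errors on Windows because "long" is only 32 bits there. The seed_threads helper mirrors the ChangeLog entry "seed each thread by host RNG".

```python
import random

MASK64 = (1 << 64) - 1  # emulate unsigned 64-bit wrap-around


class XorShift128Plus:
    """Generic xorshift128+ sketch (assumed constants 23, 18, 5)."""

    def __init__(self, seed0, seed1):
        self.s0 = seed0 & MASK64
        self.s1 = seed1 & MASK64
        if self.s0 == 0 and self.s1 == 0:
            raise ValueError("the 128-bit state must not be all zero")

    def next_u64(self):
        s1 = self.s0
        s0 = self.s1
        self.s0 = s0
        s1 ^= (s1 << 23) & MASK64
        self.s1 = s1 ^ s0 ^ (s1 >> 18) ^ (s0 >> 5)
        return (self.s1 + s0) & MASK64

    def next_float(self):
        # Use the top 53 bits to build a float in [0, 1)
        return (self.next_u64() >> 11) / float(1 << 53)


def seed_threads(host_seed, nthreads):
    """Draw one (seed0, seed1) pair per thread from a host-side RNG."""
    host = random.Random(host_seed)
    seeds = []
    while len(seeds) < nthreads:
        pair = (host.getrandbits(64), host.getrandbits(64))
        if pair != (0, 0):  # skip the invalid all-zero state
            seeds.append(pair)
    return seeds


if __name__ == "__main__":
    for s0, s1 in seed_threads(host_seed=12345, nthreads=4):
        rng = XorShift128Plus(s0, s1)
        print([round(rng.next_float(), 6) for _ in range(3)])
```

Giving each thread its own state, seeded once on the host, keeps the per-thread streams independent of scheduling order, so a run with a fixed host seed is repeatable.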

Also, in MMC v2019.3 (1.4.8), we added a list of major new additions, including
Please file bug reports at https://github.com/fangq/mmc/issues

* Add 2 built-in complex domain examples - USC_19-5 brain atlas and mcxyz skin-vessel benchmark
* Initial support of "photon sharing" - a fast approach to simultaneously simulate multiple pattern src/det, as detailed in our Photonics West 2019 talk by Ruoyang Yao/Shijie Yan [Yao&Yan2019]
* Dual-grid MMC (DMMC) paper published [Yan2019], enabled by "-M G" or cfg.method='grid'
* Add clang compiler support, nightly build compilation script, colored command line output, and more
Reference:

In addition, we also fixed a number of critical bugs, such as

* fix mmclab gpuinfo output crash using multiple GPUs
* disable linking to Intel OMP library (libiomp5) to avoid MATLAB 2016-2017 crash
* fix a bug for doubling thread number every call to mmc, thanks to Shijie
* fix mmclab crash due to photo sharing update

'''[Yan2019]''' Shijie Yan, Anh Phong Tran, Qianqian Fang*, "A dual-grid mesh-based\
Monte Carlo algorithm for efficient photon transport simulations in complex 3-D media,"\
J. of Biomedical Optics, 24(2), 020503 (2019). URL: https://doi.org/10.1117/1.JBO.24.2.020503

'''[Yao&Yan2019]''' Ruoyang Yao, Shijie Yan, Xavier Intes, Qianqian Fang, \
"Accelerating Monte Carlo forward model with structured light illumination via 'photon sharing'," \
Photonics West 2019, paper#10874-11, San Francisco, CA, USA. \
[https://www.spiedigitallibrary.org/conference-presentations/10874/108740B/Accelerating-Monte-Carlo-forward-model-with-structured-light-illumination-via/10.1117/12.2510291?SSO=1 Full presentation for our invited talk]
'''[Fang2019]''' Qianqian Fang* and Shijie Yan, "GPU-accelerated mesh-based \
Monte Carlo photon transport simulations," J. of Biomedical Optics, in press, 2019. \
Preprint URL: https://www.biorxiv.org/content/10.1101/815977v1

-------------------------------------------------------------------------------

@@ -75,9 +75,10 @@ mesh to represent curved boundaries and complex structures, making it
even more accurate, flexible, and memory efficient. MMC uses the
state-of-the-art ray-tracing techniques to simulate photon propagation in
a mesh space. It has been extensively optimized for excellent computational
efficiency and portability. MMC currently supports both multi-threaded
parallel computing and Single Instruction Multiple Data (SIMD) parallelism
to maximize performance on a multi-core processor.
efficiency and portability. MMC currently supports multi-threaded
parallel computing via OpenMP, Single Instruction Multiple Data (SIMD)
parallelism via SSE and, starting from v2019.10, OpenCL to support a wide
range of CPUs/GPUs from nearly all vendors.

To run an MMC simulation, one has to prepare an FE mesh first to
discretize the problem domain. Image-based 3D mesh generation has been
@@ -92,6 +93,13 @@ or even thousand-fold acceleration in speed similar to what we
have observed in our GPU-accelerated Monte Carlo software (Monte Carlo
eXtreme, or MCX [2]).

The most relevant publication describing this work is the GPU-accelerated
MMC paper:

Qianqian Fang and Shijie Yan, "GPU-accelerated mesh-based Monte Carlo
photon transport simulations," J. of Biomedical Optics, in press, 2019.
Preprint URL: https://www.biorxiv.org/content/10.1101/815977v1

Please keep in mind that MMC is only a partial implementation of the
general Mesh-based Monte Carlo Method (MMCM). The limitations and issues
you observed in the current software will likely be removed in the future
@@ -195,16 +203,16 @@ and type

make

this will create a fully optimized, multi-threaded and SSE4 enabled
mmc executable, located under the mmc/src/bin/ folder.
this will create a fully optimized OpenCL based mmc executable,
located under the mmc/src/bin/ folder.

Other compilation options include

make ssemath # this uses SSE4 for both vector operations and math functions
make omp # this compiles a multi-threaded binary using OpenMP
make release # create a single-threaded optimized binary
make prof # this makes a binary to produce profiling info for gprof
make sse # this uses SSE4 for all vector operations (dot, cross), implies omp
make ssemath # this uses SSE4 for both vector operations and math functions

if you want to generate a portable binary that does not require external
library files, you may use (only works for Linux and Windows with gcc)
@@ -290,7 +298,7 @@ same direction. Otherwise, MMC will give incorrect results.
The full command line options of MMC include the following:
<pre>
###############################################################################
# Mesh-based Monte Carlo (MMC) #
# Mesh-based Monte Carlo (MMC) - OpenCL #
# Copyright (c) 2010-2019 Qianqian Fang <q.fang at neu.edu> #
# http://mcx.space/#mmc #
# #
@@ -299,7 +307,7 @@ The full command line options of MMC include the following:
# #
# Research funded by NIH/NIGMS grant R01-GM114365 #
###############################################################################
$Rev::8270b9$2019.4 $Date::2019-04-24 14:18:58 -04$ by $Author::Qianqian Fang $
$Rev::57e5d6$2019.10$Date::Qianqian Fang $ by $Author::Qianqian Fang $
###############################################################################

usage: mmc <param1> <param2> ...
@@ -321,7 +329,7 @@ where possible parameters include (the first item in [] is the default value)
to calculate the mua/mus Jacobian matrices
-P [0|int] (--replaydet) replay only the detected photons from a given
detector (det ID starts from 1), use with -E
-M [H|PHBSG] (--method) choose ray-tracing algorithm (only use 1 letter)
-M [G|SG] (--method) choose ray-tracing algorithm (only use 1 letter)
P - Plucker-coordinate ray-tracing algorithm
H - Havel's SSE4 ray-tracing algorithm
B - partial Badouel's method (used by TIM-OS)
@@ -330,6 +338,11 @@ where possible parameters include (the first item in [] is the default value)
-e [1e-6|float](--minenergy) minimum energy level to trigger Russian roulette
-V [0|1] (--specular) 1 source located in the background,0 inside mesh
-k [1|0] (--voidtime) when src is outside, 1 enables timer inside void
-A [0|int] (--autopilot) auto thread config:1 enable;0 disable
-G [0|int] (--gpu) specify which GPU to use, list GPU by -L; 0 auto
or
-G '1101' (--gpu) using multiple devices (1 enable, 0 disable)
-W '50,30,20' (--workload) workload for active devices; normalized by sum
--atomic [1|0] 1 use atomic operations, 0 use non-atomic ones

== Output options ==
@@ -338,6 +351,7 @@ where possible parameters include (the first item in [] is the default value)
J - Jacobian, L - weighted path length, P -
weighted scattering count (J,L,P: replay mode)
-d [0|1] (--savedet) 1 to save photon info at detectors,0 not to save
-H [1000000] (--maxdetphoton) max number of detected photons
-S [1|0] (--save2pt) 1 to save the fluence field, 0 do not save
-x [0|1] (--saveexit) 1 to save photon exit positions and directions
setting -x to 1 also implies setting '-d' to 1