Speed Up OpenFOAM Framework with PETSc Library (experimental)
There are two major distributions of OpenFOAM [1]: OpenFOAM.com (ESI-OpenCFD) and OpenFOAM.org (OpenFOAM Foundation). To avoid compatibility issues, we compare the two distributions and choose the one with the more stable API.
OpenFOAM is compiled with wmake, a custom wrapper around Make. Because wmake has a narrow scope of use, it adapts poorly to mainstream toolchains, which makes analyzing OpenFOAM API compatibility at the source-code level cumbersome. We therefore currently use tutorial compatibility as a proxy for OpenFOAM API compatibility.
Tutorial compatibility is based on the number of lines, we use
Figures 1.1 and 1.2 show the distribution of tutorial compatibility across OpenFOAM versions; the horizontal axis is the OpenFOAM version and the vertical axis is the tutorial compatibility [2]. The tutorial compatibility of OpenFOAM.com is better than that of OpenFOAM.org, so we use OpenFOAM.com to avoid compatibility issues as far as possible.
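The exact compatibility formula is not reproduced here. As a rough illustration only, one possible line-count-based measure is the fraction of lines that two versions of the same tutorial file have in common. The function name and the choice of `difflib` are assumptions of this sketch, not the metric used for the figures.

```python
import difflib


def line_compatibility(old_text: str, new_text: str) -> float:
    """Fraction of lines shared by two versions of a file (1.0 means identical).

    Hypothetical metric: matched lines divided by the longer file's line count.
    """
    old_lines = old_text.splitlines()
    new_lines = new_text.splitlines()
    if not old_lines and not new_lines:
        return 1.0
    matcher = difflib.SequenceMatcher(a=old_lines, b=new_lines, autojunk=False)
    # Sum the sizes of all blocks of lines that appear, in order, in both files.
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return matched / max(len(old_lines), len(new_lines))
```

A value close to 1 would mean a tutorial file barely changed between two releases; averaging such values per release pair would give one number per version.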
Figure 1.1: OpenFOAM.com
Figure 1.2: OpenFOAM.org
TL;DR: Our solution for managing PETSc installations [3][4]
Some matrices contain complex numbers, so PETSc has to be configured and compiled separately for real and for complex scalars. Likewise, local debugging needs a build with optimization disabled so that the code stays debuggable, while HPC runs need optimization enabled to speed up the target code.
The installation management described in the official PETSc documentation is relatively basic: every switch between builds requires setting several environment variables, which is cumbersome. In addition, compiling PETSc application code relies heavily on Make, with essentially all compile flags and search paths stored as Make variables. This makes it hard to switch to CMake or other standard, modern build systems, which we need in order to get features such as jump-to-definition in a local IDE.
We use Python to wrap the installation management steps from the official PETSc documentation. With a number of preset configuration options, a single command performs the installation, configuration, and other tedious steps, which reduces the mental burden.
$ ipetsc build \
--arch advance-real-optimize \
--arch advance-complex-optimize
$ source etc/bashrc.advance-real-optimize.sh
$ echo $PETSC_ARCH
advance-real-optimize
$ source etc/bashrc.advance-complex-optimize.sh
$ echo $PETSC_ARCH
advance-complex-optimize
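As a minimal sketch of how an arch name such as `advance-complex-optimize` could be translated into a PETSc configure command, the snippet below splits the name into three parts. `--with-scalar-type` and `--with-debugging` are real PETSc configure options; the preset tables, the `debug` mode name, and the function name are hypothetical, and the actual option sets behind `basic` and `advance` are not listed in this document.

```python
# Hypothetical presets: the real contents of "basic" and "advance" are unknown here.
LEVEL_FLAGS = {"basic": [], "advance": []}
SCALAR_FLAGS = {
    "real": ["--with-scalar-type=real"],
    "complex": ["--with-scalar-type=complex"],
}
MODE_FLAGS = {
    "optimize": ["--with-debugging=0"],
    "debug": ["--with-debugging=1"],  # assumed name for a debuggable build
}


def configure_command(arch, extra=()):
    """Build the argument list for PETSc's ./configure from an arch name."""
    level, scalar, mode = arch.split("-")[:3]
    return (
        ["./configure", f"PETSC_ARCH={arch}"]
        + LEVEL_FLAGS[level]
        + SCALAR_FLAGS[scalar]
        + MODE_FLAGS[mode]
        + list(extra)  # user-supplied options are passed through unchanged
    )
```

Keeping the mapping in plain dictionaries makes it easy to add a new preset without touching the command-building logic.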
Beyond the preset configuration options, users can add their own configuration options, which makes our solution more flexible. Finally, following OpenFOAM's use of etc/bashrc, we store compile flags and search paths as environment variables, so that we can switch between different build systems at will.
$ BASE=basic-real-optimize
$ ipetsc build \
--arch $BASE:$BASE-amgx \
-- \
--download-triangle=1 \
--download-amgx=1 \
--download-amgx-cmake-arguments="-DMPI_CXX_COMPILE_DEFINITIONS=-lmpi" \
--with-cuda-dir=/usr/local/cuda/
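The per-arch environment files sourced earlier (`etc/bashrc.<arch>.sh`) could be generated along the following lines. This is a sketch under assumptions: the function names are hypothetical, and the exported variables beyond `PETSC_DIR` and `PETSC_ARCH` are one plausible choice that relies on the `lib/pkgconfig` directory of an in-place PETSc build.

```python
from pathlib import Path


def render_bashrc(petsc_dir, arch):
    """Return the text of a shell file that selects one PETSc build."""
    prefix = f"{petsc_dir}/{arch}"
    lines = [
        f"export PETSC_DIR={petsc_dir}",
        f"export PETSC_ARCH={arch}",
        # Lets pkg-config (and therefore CMake) locate the PETSc build.
        f'export PKG_CONFIG_PATH="{prefix}/lib/pkgconfig'
        '${PKG_CONFIG_PATH:+:$PKG_CONFIG_PATH}"',
        f'export LD_LIBRARY_PATH="{prefix}/lib'
        '${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"',
    ]
    return "\n".join(lines) + "\n"


def write_bashrc(etc_dir, petsc_dir, arch):
    """Write etc/bashrc.<arch>.sh and return its path."""
    path = Path(etc_dir) / f"bashrc.{arch}.sh"
    path.write_text(render_bashrc(petsc_dir, arch))
    return path
```

Because the build system only reads environment variables, a Makefile and a CMakeLists.txt can consume the same file.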
Moreover, it is possible to add extra features to our solution, such as:
- Visualizing sparse matrices to show the distribution of their non-zero elements, which helps narrow down the range of hyper-parameters of the solving algorithm;
- Analyzing LogView output to quantify the performance of the PETSc application code, so that different hyper-parameters of the solving algorithm can be compared;
- Generating PETSc application code templates, or configuration files for the code build systems (CMakeLists.txt, etc.).
$ ipetsc cvt \
--input data/1/A.mtx \
--output cache/data/1/A.npz \
--type real --base 0
$ ipetsc spy \
--input data/1/A.mtx \
--output image/1.png \
--size 1024x1024
$ ipetsc new \
--path src/prob_real/ \
--update # Update cmake, makefile, vscode
$ ls --all src/prob_real/
. .. CMakeLists.txt main.c Makefile .vscode
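To make the conversion and visualization steps above more concrete, here is a dependency-free sketch, not the implementation behind `ipetsc cvt` or `ipetsc spy`. It reads Matrix Market coordinate text, shifts the 1-based indices to 0-based, and bins the non-zeros into a coarse grid that could be rendered as an image. It assumes real-valued, `general` (non-symmetric) files; both function names are hypothetical.

```python
def read_coordinate_mtx(text):
    """Parse Matrix Market coordinate text.

    Returns (n_rows, n_cols, entries) where entries are (i, j, value)
    tuples with 0-based indices. Symmetric storage is not expanded.
    """
    lines = [ln for ln in text.splitlines() if ln.strip() and not ln.startswith("%")]
    n_rows, n_cols, _nnz = (int(tok) for tok in lines[0].split())
    entries = []
    for ln in lines[1:]:
        i, j, *val = ln.split()
        value = float(val[0]) if val else 1.0  # "pattern" files carry no values
        entries.append((int(i) - 1, int(j) - 1, value))
    return n_rows, n_cols, entries


def spy_grid(n_rows, n_cols, entries, height, width):
    """Count non-zero entries per cell of a height x width grid."""
    grid = [[0] * width for _ in range(height)]
    for i, j, value in entries:
        if value != 0.0:
            grid[i * height // n_rows][j * width // n_cols] += 1
    return grid
```

Binning instead of drawing one pixel per entry keeps the output size fixed, which matters when the matrix is much larger than the requested image.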
PETSc4FOAM is a library that plugs PETSc into the OpenFOAM framework [5]. OpenFOAM.com provides an implementation with some limitations, such as not supporting certain boundary conditions [6].
We provide an example installation of PETSc4FOAM using Docker, with the configuration file petsc4foam.dockerfile. We have not provided a test case and do not guarantee that it still works.
We use OpenFOAM-v2106 to test the performance of the PETSc4FOAM implementation. The test set consists of tutorials that run locally in less than 300 seconds; each tutorial is run in parallel where possible and in serial otherwise. The PETSc solver hyper-parameters are left empty, i.e., the default values are used. The collapsed table below shows the results for the test set.
Test Results
application | tutorial | parallel | time_foam (s) | time_petsc (s) | petsc/foam |
---|---|---|---|---|---|
liquidFilmFoam | finiteArea/liquidFilmFoam/cylinder | 4 | 54.5962 | 74.5738 | 1.36591 |
PDRFoam | combustion/PDRFoam/flamePropagationWithObstacles | 1 | 126.312 | 208.068 | 1.64726 |
fireFoam | combustion/fireFoam/LES/flameSpreadWaterSuppressionPanel | 1 | 62.0199 | 82.2087 | 1.32552 |
fireFoam | combustion/fireFoam/LES/simplePMMApanel | 1 | 4.64767 | 4.54882 | 0.978731 |
compressibleInterDyMFoam | multiphase/compressibleInterDyMFoam/laminar/sloshingTank2D | 1 | 42.1959 | 38.6318 | 0.915535 |
reactingTwoPhaseEulerFoam | multiphase/reactingTwoPhaseEulerFoam/laminar/injection | 1 | 40.289 | 49.8847 | 1.23817 |
reactingTwoPhaseEulerFoam | multiphase/reactingTwoPhaseEulerFoam/laminar/steamInjection | 1 | 274.909 | 459.745 | 1.67235 |
icoReactingMultiphaseInterFoam | multiphase/icoReactingMultiPhaseInterFoam/evaporationMultiComponent | 1 | 232.445 | 751.688 | 3.23384 |
icoReactingMultiphaseInterFoam | multiphase/icoReactingMultiPhaseInterFoam/inertMultiphaseMultiComponent | 1 | 68.1879 | 104.234 | 1.52862 |
twoPhaseEulerFoam | multiphase/twoPhaseEulerFoam/laminar/injection | 1 | 35.6297 | 46.6352 | 1.30889 |
interIsoFoam | multiphase/interIsoFoam/notchedDiscInSolidBodyRotation | 1 | 7.07395 | 7.06355 | 0.998529 |
interIsoFoam | multiphase/interIsoFoam/weirOverflow | 1 | 13.9652 | 404.528 | 28.9669 |
interIsoFoam | multiphase/interIsoFoam/discInReversedVortexFlow | 1 | 83.1104 | 82.4318 | 0.991834 |
interIsoFoam | multiphase/interIsoFoam/discInConstantFlow | 1 | 0.796971 | 0.785794 | 0.985976 |
interIsoFoam | multiphase/interIsoFoam/discInConstantFlowCyclicBCs | 1 | 0.72161 | 0.719718 | 0.997378 |
interCondensatingEvaporatingFoam | multiphase/interCondensatingEvaporatingFoam/condensatingVessel | 1 | 75.4212 | 207.477 | 2.75091 |
multiphaseInterFoam | multiphase/multiphaseInterFoam/laminar/damBreak4phase | 1 | 32.8729 | 64.1778 | 1.9523 |
interFoam | multiphase/interFoam/laminar/damBreakPermeable | 1 | 2.37643 | 57.4663 | 24.1818 |
interFoam | multiphase/interFoam/laminar/testTubeMixer | 1 | 17.4256 | 45.6418 | 2.61924 |
interFoam | multiphase/interFoam/laminar/damBreak/damBreak | 1 | 3.30486 | 55.41 | 16.7662 |
interFoam | multiphase/interFoam/RAS/damBreak/damBreak | 1 | 2.25597 | 42.0251 | 18.6284 |
compressibleInterFoam | multiphase/compressibleInterFoam/laminar/depthCharge2D | 1 | 38.0536 | 39.2199 | 1.03065 |
compressibleMultiphaseInterFoam | multiphase/compressibleMultiphaseInterFoam/laminar/damBreak4phase | 1 | 46.114 | 48.1891 | 1.045 |
twoLiquidMixingFoam | multiphase/twoLiquidMixingFoam/lockExchange | 1 | 21.9684 | 343.478 | 15.6351 |
compressibleInterIsoFoam | multiphase/compressibleInterIsoFoam/laminar/depthCharge2D | 1 | 67.6197 | 57.8889 | 0.856095 |
multiphaseEulerFoam | multiphase/multiphaseEulerFoam/damBreak4phase | 1 | 170.796 | 174.722 | 1.02299 |
multiphaseEulerFoam | multiphase/multiphaseEulerFoam/bubbleColumn | 1 | 135.974 | 202.502 | 1.48927 |
potentialFreeSurfaceFoam | multiphase/potentialFreeSurfaceFoam/oscillatingBox | 1 | 18.892 | 49.3371 | 2.61154 |
rhoSimpleFoam | compressible/rhoSimpleFoam/angledDuctExplicitFixedCoeff | 1 | 19.6578 | 405.014 | 20.6032 |
rhoCentralFoam | compressible/rhoCentralFoam/shockTube | 1 | 0.0942714 | 0.303647 | 3.22099 |
rhoCentralFoam | compressible/rhoCentralFoam/LadenburgJet60psi | 1 | 23.3061 | 28.3209 | 1.21517 |
rhoPimpleFoam | compressible/rhoPimpleFoam/laminar/sineWaveDamping | 1 | 61.836 | 77.0334 | 1.24577 |
rhoPimpleFoam | compressible/rhoPimpleFoam/RAS/angledDuctLTS | 1 | 14.9777 | 27.1883 | 1.81526 |
rhoPimpleFoam | compressible/rhoPimpleFoam/RAS/mixerVessel2D | 1 | 8.79051 | 14.2535 | 1.62146 |
sonicFoam | compressible/sonicFoam/laminar/shockTube | 1 | 1.20951 | 1.77627 | 1.46859 |
coalChemistryFoam | lagrangian/coalChemistryFoam/simplifiedSiwek | 4 | 18.605 | 20.5423 | 1.10412 |
reactingParcelFoam | lagrangian/reactingParcelFoam/verticalChannelLTS | 1 | 169.547 | 273.891 | 1.61543 |
reactingParcelFoam | lagrangian/reactingParcelFoam/recycleParticles | 2 | 2.64601 | 3.0002 | 1.13386 |
reactingParcelFoam | lagrangian/reactingParcelFoam/parcelInBox | 1 | 0.945285 | 1.46008 | 1.54459 |
reactingParcelFoam | lagrangian/reactingParcelFoam/filter | 4 | 17.2056 | 22.1532 | 1.28756 |
simpleReactingParcelFoam | lagrangian/simpleReactingParcelFoam/verticalChannel | 4 | 79.7074 | 139.06 | 1.74463 |
shallowWaterFoam | incompressible/shallowWaterFoam/squareBump | 1 | 1.68878 | 2.65668 | 1.57314 |
pisoFoam | incompressible/pisoFoam/RAS/cavity | 4 | 9.86059 | 6.46813 | 0.655958 |
icoFoam | incompressible/icoFoam/cavityMappingTest | 4 | 0.512954 | 0.536315 | 1.04554 |
icoFoam | incompressible/icoFoam/elbow | 1 | 0.440706 | 11.3589 | 25.7743 |
simpleFoam | incompressible/simpleFoam/backwardFacingStep2D | 1 | 45.1615 | 1106.76 | 24.5067 |
simpleFoam | incompressible/simpleFoam/mixerVessel2D | 1 | 1.59308 | 30.9162 | 19.4065 |
simpleFoam | incompressible/simpleFoam/simpleCar | 1 | 2.63694 | 155.473 | 58.9596 |
SRFPimpleFoam | incompressible/SRFPimpleFoam/rotor2D | 1 | 52.5763 | 1445 | 27.4838 |
pimpleFoam | incompressible/pimpleFoam/RAS/TJunctionFan | 1 | 23.7639 | 326.105 | 13.7227 |
solidDisplacementFoam | stressAnalysis/solidDisplacementFoam/plateHole | 1 | 0.0917768 | 0.490052 | 5.33961 |
buoyantBoussinesqSimpleFoam | heatTransfer/buoyantBoussinesqSimpleFoam/hotRoom | 1 | 5.66248 | 20.2273 | 3.57216 |
buoyantPimpleFoam | heatTransfer/buoyantPimpleFoam/hotRoom | 1 | 9.91152 | 23.6746 | 2.38859 |
buoyantPimpleFoam | heatTransfer/buoyantPimpleFoam/thermocoupleTestCase | 1 | 44.9003 | 71.4351 | 1.59097 |
buoyantBoussinesqPimpleFoam | heatTransfer/buoyantBoussinesqPimpleFoam/hotRoom | 1 | 6.62563 | 200.328 | 30.2353 |
dsmcFoam | discreteMethods/dsmcFoam/freeSpacePeriodic | 1 | 33.9859 | 32.2858 | 0.949977 |
scalarTransportFoam | verificationAndValidation/schemes/divergenceExample | 1 | 4.3484 | 11.4358 | 2.62989 |
potentialFoam | basic/potentialFoam/pitzDaily | 1 | 0.104949 | 0.701211 | 6.68143 |
potentialFoam | basic/potentialFoam/cylinder | 1 | 0.0411849 | 0.584342 | 14.1882 |
laplacianFoam | basic/laplacianFoam/flange | 4 | 1.10576 | 1.85954 | 1.68169 |
dnsFoam | DNS/dnsFoam/boxTurb16 | 1 | 3.31527 | 96.3216 | 29.054 |
In most cases, PETSc with the default hyper-parameters does not perform as well as the native OpenFOAM implementation. The cases below are the exceptions in which PETSc was faster, although measurement error cannot be ruled out.
application | tutorial | parallel | time_foam (s) | time_petsc (s) | petsc/foam |
---|---|---|---|---|---|
pisoFoam | incompressible/pisoFoam/RAS/cavity | 4 | 9.86059 | 6.46813 | 0.655958 |
compressibleInterIsoFoam | multiphase/compressibleInterIsoFoam/laminar/depthCharge2D | 1 | 67.6197 | 57.8889 | 0.856095 |
compressibleInterDyMFoam | multiphase/compressibleInterDyMFoam/laminar/sloshingTank2D | 1 | 42.1959 | 38.6318 | 0.915535 |
dsmcFoam | discreteMethods/dsmcFoam/freeSpacePeriodic | 1 | 33.9859 | 32.2858 | 0.949977 |
fireFoam | combustion/fireFoam/LES/simplePMMApanel | 1 | 4.64767 | 4.54882 | 0.978731 |
interIsoFoam | multiphase/interIsoFoam/discInConstantFlow | 1 | 0.796971 | 0.785794 | 0.985976 |
interIsoFoam | multiphase/interIsoFoam/discInReversedVortexFlow | 1 | 83.1104 | 82.4318 | 0.991834 |
interIsoFoam | multiphase/interIsoFoam/discInConstantFlowCyclicBCs | 1 | 0.72161 | 0.719718 | 0.997378 |
interIsoFoam | multiphase/interIsoFoam/notchedDiscInSolidBodyRotation | 1 | 7.07395 | 7.06355 | 0.998529 |
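The petsc/foam column and the selection of exceptional cases can be reproduced with a few lines. The sketch below uses three rows copied from the results table; the function name is hypothetical.

```python
# (application, tutorial, time_foam, time_petsc), times in seconds
ROWS = [
    ("pisoFoam", "incompressible/pisoFoam/RAS/cavity", 9.86059, 6.46813),
    ("interFoam", "multiphase/interFoam/laminar/damBreak/damBreak", 3.30486, 55.41),
    ("dsmcFoam", "discreteMethods/dsmcFoam/freeSpacePeriodic", 33.9859, 32.2858),
]


def faster_with_petsc(rows):
    """Return (application, tutorial, ratio) for rows with ratio < 1, best first."""
    ratios = [(app, tut, t_petsc / t_foam) for app, tut, t_foam, t_petsc in rows]
    return sorted((r for r in ratios if r[2] < 1.0), key=lambda r: r[2])
```

A ratio below 1 means the PETSc run finished sooner than the native run; ratios very close to 1 are within the range where run-to-run noise could decide the outcome.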
Next, we demonstrate that PETSc4FOAM can be tuned through its hyper-parameters: optimizing the PETSc hyper-parameters shortens the program runtime. We choose the tutorial multiphaseInterFoam-mixerVessel2D; the tuning results are:
- Native tutorial: $48.04 \pm 0.45$ seconds
- PETSc solver with default hyper-parameters: $443.11 \pm 5.91$ seconds
- PETSc solver with tuned hyper-parameters: $48.03 \pm 0.42$ seconds
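Timings of this form can be produced by repeating a run and reporting the sample mean and standard deviation. The document does not state how the $\pm$ values were obtained, so treating them as a standard deviation over repeated runs is an assumption, as are the function names below. `run_case` stands for any callable that executes one complete simulation.

```python
import statistics
import time


def time_runs(run_case, repeats=5):
    """Call run_case() `repeats` times; return (mean, sample std) in seconds."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_case()
        samples.append(time.perf_counter() - start)
    return statistics.mean(samples), statistics.stdev(samples)


def format_timing(mean, std):
    """Format a timing in the same style as the list above."""
    return f"${mean:.2f} \\pm {std:.2f}$ seconds"
```

At least two repeats are needed, because the sample standard deviation is undefined for a single measurement.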
Tuned Hyper-parameters (fvSolution)
solvers {
    "alpha.*" {
        nAlphaCorr 4;
        nAlphaSubCycles 4;
        cAlpha 1;
    }
    "pcorr.*" {
        solver petsc;
        tolerance 1e-10;
        relTol 0;
        petsc {
            options {
                ksp_type bicg;
                pc_type bjacobi;
                sub_pc_type ilu;
            }
            use_petsc_residual_norm false;
            monitor_foam_residual_norm false;
            caching {
                matrix {
                    update always;
                }
                preconditioner {
                    update always;
                }
            }
        }
    }
    p_rgh {
        solver petsc;
        tolerance 1e-07;
        relTol 0.05;
        petsc {
            options {
                ksp_type cg;
                pc_type cholesky;
                sub_pc_type ilu;
            }
            use_petsc_residual_norm false;
            monitor_foam_residual_norm false;
            caching {
                matrix {
                    update always;
                }
                preconditioner {
                    update always;
                }
            }
        }
    }
    p_rghFinal {
        solver petsc;
        tolerance 1e-07;
        relTol 0;
        petsc {
            options {
                ksp_type bicg;
                pc_type bjacobi;
                sub_pc_type ilu;
            }
            use_petsc_residual_norm false;
            monitor_foam_residual_norm false;
            caching {
                matrix {
                    update always;
                }
                preconditioner {
                    update always;
                }
            }
        }
    }
    "(U|T).*" {
        solver petsc;
        tolerance 1e-08;
        relTol 0;
        petsc {
            options {
                ksp_type cg;
                pc_type bjacobi;
                sub_pc_type ilu;
            }
            use_petsc_residual_norm false;
            monitor_foam_residual_norm false;
            caching {
                matrix {
                    update always;
                }
                preconditioner {
                    update always;
                }
            }
        }
    }
}
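When searching over hyper-parameters, solver entries like the ones above can be generated rather than edited by hand. The following sketch renders a reduced entry (it omits the caching and residual-norm keys) and enumerates a small grid of Krylov solver and preconditioner combinations. The function names and the idea of a grid search are assumptions of this sketch, not a description of how the tuned values were found.

```python
from itertools import product


def petsc_solver_block(field, ksp_type, pc_type, sub_pc_type, tolerance, rel_tol):
    """Render one fvSolution solver entry that delegates to PETSc."""
    return "\n".join([
        f"{field} {{",
        "    solver petsc;",
        f"    tolerance {tolerance:g};",
        f"    relTol {rel_tol:g};",
        "    petsc {",
        "        options {",
        f"            ksp_type {ksp_type};",
        f"            pc_type {pc_type};",
        f"            sub_pc_type {sub_pc_type};",
        "        }",
        "    }",
        "}",
    ])


def candidate_blocks(field, ksp_types, pc_types, tolerance, rel_tol):
    """Map each (ksp_type, pc_type) pair to a rendered solver entry."""
    return {
        (ksp, pc): petsc_solver_block(field, ksp, pc, "ilu", tolerance, rel_tol)
        for ksp, pc in product(ksp_types, pc_types)
    }
```

Each candidate can then be written into fvSolution, the case timed, and the fastest converging combination kept.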
To optimize the hyper-parameters, we looked at the SuiteSparse Matrix Collection [7] as a sparse matrix dataset. Upon visualization, however, we found that many of its sparse matrices come from non-CFD domains and cannot be solved as linear systems. Figure 5.1.1 visualizes some of these sparse matrices, with black representing zero elements and white representing non-zero elements.
Figure 5.1.1: Sparsity Pattern of Certain Matrices in SuiteSparse Matrix Collection
Because the available training dataset is too small, we decided to dump sparse matrices from the OpenFOAM tutorials to construct the training dataset, and we wrote an OpenFOAM plug-in to implement this idea. We have implemented a function that dumps an OpenFOAM lduMatrix to Matrix Market format; it does not yet take certain boundary conditions into account, but it is sufficient for a training dataset. Figure 5.1.2 shows the mixing elbow case that comes with the icoFoam solver, and Figure 5.1.3 shows a visualization of the dumped
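The conversion from lduMatrix storage to Matrix Market can be illustrated in Python, independently of the actual plug-in code. An lduMatrix keeps the diagonal coefficients per cell and, per internal face, one upper and one lower coefficient addressed by the face's lower (owner) and upper (neighbour) cell. The sketch assumes these five arrays have already been extracted and, like the function described above, ignores boundary contributions; the function name is hypothetical.

```python
def ldu_to_matrix_market(diag, lower, upper, lower_addr, upper_addr):
    """Convert LDU-addressed coefficients to Matrix Market coordinate text."""
    n = len(diag)
    entries = [(i, i, d) for i, d in enumerate(diag)]
    for face, (l, u) in enumerate(zip(lower_addr, upper_addr)):
        entries.append((l, u, upper[face]))  # upper triangle: row is the owner cell
        entries.append((u, l, lower[face]))  # lower triangle: row is the neighbour cell
    entries.sort()
    lines = [
        "%%MatrixMarket matrix coordinate real general",
        f"{n} {n} {len(entries)}",
    ]
    # Matrix Market indices are 1-based.
    lines += [f"{i + 1} {j + 1} {v!r}" for i, j, v in entries]
    return "\n".join(lines) + "\n"
```

For a one-dimensional mesh of three cells with two internal faces, this produces seven entries: three on the diagonal and two on each side of it.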
Footnotes
- https://www.cfd-online.com/Forums/openfoam/197150-openfoam-com-versus-openfoam-org-version-use.html
- https://seaborn.pydata.org/generated/seaborn.boxenplot.html
- https://www.semanticscholar.org/paper/PETSc4FOAM%3A-a-library-to-plug-in-PETSc-into-the-Bn%C3%A0-Spisso/0234a490ba9a3647a5ed4f35bee9a70f07cb2e49