## Background

During testing with laser-cholera 0.12.1 we discovered that the `r-mosaic` Python environment now contains three different OpenMP runtimes simultaneously:
| Library | Source | How it gets loaded |
|---|---|---|
| `libomp` | Clang/LLVM | data.table R package (macOS ARM) |
| `libiomp5` | Intel KMP | numba's default `omp` threading layer |
| `libgomp` | GNU | scipy |
This combination was causing a SIGSEGV crash (in `__kmp_suspend_initialize_thread`) when `lc$run_model()` was first called in the main R process after parallel PSOCK workers had terminated. `KMP_DUPLICATE_LIB_OK=TRUE` was already set in `zzz.R` to handle the `libomp` conflict between data.table and PyTorch, but it does not prevent GNU + Intel cross-runtime conflicts.
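For anyone hitting a similar crash, one way to confirm which runtimes are co-loaded is to scan the process memory map for the three library basenames. A minimal sketch (the helper name and sample paths are illustrative; on Linux feed it `/proc/self/maps`, on macOS the output of `vmmap <pid>`):

```python
import re

# Known OpenMP runtime basenames: GNU, LLVM/Clang, Intel KMP
OPENMP_LIBS = ("libgomp", "libomp", "libiomp5")

def openmp_runtimes(maps_text: str) -> set:
    """Return the set of OpenMP runtimes named in a memory-map listing."""
    found = set()
    for line in maps_text.splitlines():
        for lib in OPENMP_LIBS:
            # Match e.g. /usr/lib/libgomp.so.1 or /usr/local/lib/libomp.dylib;
            # the leading "/" keeps libiomp5 from matching the libomp pattern.
            if re.search(rf"/{lib}\.(so|dylib)", line):
                found.add(lib)
    return found

# Linux usage:
#   with open("/proc/self/maps") as f:
#       print(openmp_runtimes(f.read()))
```

If the result contains more than one entry, the process is in the fragile multi-runtime state described above.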
## Current fix in MOSAIC-pkg

`zzz.R` now sets `NUMBA_THREADING_LAYER=workqueue` at package load (v0.19.13). This switches numba from its default `omp` backend (which loads `libiomp5`) to its own built-in thread pool, removing `libiomp5` from the equation entirely. Only `libgomp` and `libomp` remain, and they coexist without conflict. Performance impact is negligible for laser-cholera's SEIR kernels.
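The same guard can be applied from the Python side of any entry point that imports numba directly. A sketch, with the caveat that the variable is only honored if it is set before numba's first import (`NUMBA_THREADING_LAYER` is real numba configuration; the surrounding script is hypothetical):

```python
import os

# Must run before `import numba`: the threading layer is selected lazily at
# first use, and the env var is read from the process environment at that
# point. setdefault() preserves any value the user has already exported.
os.environ.setdefault("NUMBA_THREADING_LAYER", "workqueue")

# import numba  # safe now: workqueue is numba's built-in pool, no libiomp5
```

This mirrors what `zzz.R` does for R sessions, so either side of the bridge can enforce it.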
## Why this is worth flagging
The fix works but the underlying situation is worth being aware of. Three OpenMP runtimes coexisting is inherently fragile — future package updates could reintroduce conflicts in unexpected ways. The crash manifested as a hard R session crash with no R-level error message, making it difficult to diagnose without reading the macOS crash reporter.
## Recommendations for consideration
**Docker image (`mosaic-acr-workers`):** worth verifying the worker image does not have the same latent conflict. Linux containers will not have `libomp` from data.table, but `libgomp` and `libiomp5` may still coexist. Setting `NUMBA_THREADING_LAYER=workqueue` in the Docker entrypoint would make workers consistent with the local fix; this could be bundled with the bokeh update in #52.
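If the worker image is Dockerfile-based, the guard could be a one-line change rather than an entrypoint edit. A sketch (the env var name is real numba configuration; the placement in this image is an assumption):

```dockerfile
# Force numba's built-in thread pool so libiomp5 is never loaded in workers.
# Mirrors the zzz.R fix and applies to every process in the container,
# including any PSOCK workers the entrypoint spawns.
ENV NUMBA_THREADING_LAYER=workqueue
```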
**Python environment:** ensuring a consistent BLAS/OpenMP stack across all packages in `environment.yml` would be the cleanest long-term solution, though this is non-trivial given how pip wheels bundle their own libraries.
Relevant commit: `2846de5` (v0.19.13)
Relates to #30, #52