Memory and performance improvements #358

Merged
daleroberts merged 1 commit into 1.4.0 from dr-344 on Apr 22, 2026
@daleroberts
Member

The core of the change is a rework of the matrix_2d class: packed and symmetric storage modes, a faster 3x3 copyelements, bindings for the packed and symmetric BLAS/LAPACK routines (dpptrf, dpptri, dspmv, dsymm, dpotrs), and bounds-check asserts on every element accessor.
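For readers unfamiliar with packed storage, here is a minimal sketch of the lower-triangle, column-major layout that LAPACK's packed routines (such as dpptrf) expect, together with an assert-guarded accessor. The class `PackedSymmetric` and its members are hypothetical names used for illustration only; this is not the matrix_2d implementation from this PR.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical illustration of packed symmetric storage; not matrix_2d.
// Only the lower triangle is stored, column by column, so an n x n
// symmetric matrix needs n*(n+1)/2 doubles instead of n*n.
class PackedSymmetric {
public:
    explicit PackedSymmetric(std::size_t n)
        : n_(n), data_(n * (n + 1) / 2, 0.0) {}

    std::size_t order() const { return n_; }
    std::size_t stored() const { return data_.size(); }

    // Offset of element (i, j), 0-based, in the packed lower-triangle
    // array. Symmetry lets (i, j) with i < j reuse the slot of (j, i).
    std::size_t index(std::size_t i, std::size_t j) const {
        if (i < j) std::swap(i, j);
        return i + j * (2 * n_ - j - 1) / 2;
    }

    // Bounds-checked accessor; the assert compiles out under NDEBUG.
    double& at(std::size_t i, std::size_t j) {
        assert(i < n_ && j < n_);
        return data_[index(i, j)];
    }

    // Raw pointer suitable for passing as the AP argument of a packed
    // LAPACK routine called with uplo = 'L'.
    double* data() { return data_.data(); }

private:
    std::size_t n_;
    std::vector<double> data_;
};
```

For a 3x3 matrix this stores 6 values rather than 9, and writing `at(0, 2)` updates the same slot that `at(2, 0)` reads.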

There is also a way to tune threading with --max-threads, as testing has shown that using all available threads is not always optimal (e.g., on NCI).

Optimisations have also been made for --staged mode. For example, the matrix allocator path now supports attach-to-mmap and calloc-backed buffers. There is also a new --stage-path option that sets the directory where the mmap files are stored. This makes it possible, for example, to use the /iointensive path on NCI.
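As a rough sketch of what attaching a buffer to an mmap file involves on POSIX systems, the class below maps a file of doubles into memory and unmaps it on destruction. `MappedBuffer` and the file path in the usage note are hypothetical; the allocator in this PR may be structured quite differently.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

#include <fcntl.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical illustration of a file-backed buffer; not the PR's allocator.
// count must be greater than zero, because mmap rejects zero-length mappings.
class MappedBuffer {
public:
    MappedBuffer(const std::string& path, std::size_t count)
        : bytes_(count * sizeof(double)) {
        int fd = ::open(path.c_str(), O_RDWR | O_CREAT, 0600);
        if (fd < 0) throw std::runtime_error("open failed: " + path);

        // Size the file to hold the buffer. Bytes added by ftruncate read
        // as zero, so a fresh file behaves like calloc-initialised memory.
        if (::ftruncate(fd, static_cast<off_t>(bytes_)) != 0) {
            ::close(fd);
            throw std::runtime_error("ftruncate failed: " + path);
        }

        void* p = ::mmap(nullptr, bytes_, PROT_READ | PROT_WRITE,
                         MAP_SHARED, fd, 0);
        ::close(fd);  // the mapping remains valid after the descriptor closes
        if (p == MAP_FAILED) throw std::runtime_error("mmap failed: " + path);
        data_ = static_cast<double*>(p);
    }

    ~MappedBuffer() {
        if (data_ != nullptr) ::munmap(data_, bytes_);
    }

    MappedBuffer(const MappedBuffer&) = delete;
    MappedBuffer& operator=(const MappedBuffer&) = delete;

    double* data() { return data_; }
    std::size_t size() const { return bytes_ / sizeof(double); }

private:
    std::size_t bytes_;
    double* data_ = nullptr;
};
```

A caller could construct `MappedBuffer buf(stage_dir + "/normals.bin", n)` with `stage_dir` taken from an option like --stage-path, then hand `buf.data()` to the matrix code. Both names in that call are made up for the example.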

Finally, an optional Levenberg-Marquardt / trust-region solver has been included, along with an Anderson-acceleration stage that mixes the last few iterates to speed convergence. Both need more testing and are configured by the following flags: --lm-enabled, --lm-lambda-init, --lm-eta-good, --lm-eta-accept, --lm-gamma-up, --lm-gamma-down, --lm-max-rejects, --anderson-acceleration, --aa-depth.
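To make the LM flags easier to reason about, here is a sketch of a textbook gain-ratio damping update. The mapping from flag names to parameters is an assumption based only on the names (for instance, that the damping is multiplied by gamma-up on a rejected step and divided by gamma-down on a good one); the types `LmParams`, `LmState`, `LmDecision`, the function `lm_update`, and all default values are hypothetical and may not match the solver in this PR.

```cpp
// Hypothetical parameter block; field names mirror the CLI flags, but the
// defaults and exact semantics are assumptions made for this sketch.
struct LmParams {
    double lambda_init = 1.0;   // assumed meaning of --lm-lambda-init
    double eta_good    = 0.75;  // gain ratio above which damping is relaxed
    double eta_accept  = 0.25;  // gain ratio above which a step is accepted
    double gamma_up    = 4.0;   // damping growth factor after a rejection
    double gamma_down  = 2.0;   // damping shrink divisor after a good step
    int    max_rejects = 2;     // consecutive rejections tolerated
};

struct LmState {
    double lambda;
    int rejects;
};

enum class LmDecision { Accept, Reject, GiveUp };

// One damping update. rho compares the actual reduction in the objective
// with the reduction predicted by the linearised model.
LmDecision lm_update(LmState& s, const LmParams& p,
                     double actual_reduction, double predicted_reduction) {
    // A non-positive prediction means the model is unusable: force a reject.
    const double rho = predicted_reduction > 0.0
                           ? actual_reduction / predicted_reduction
                           : -1.0;

    if (rho >= p.eta_accept) {
        s.rejects = 0;
        // The model tracked the objective well, so trust it more.
        if (rho >= p.eta_good) s.lambda /= p.gamma_down;
        return LmDecision::Accept;
    }

    // Poor agreement: increase damping and retry from the same iterate.
    s.lambda *= p.gamma_up;
    ++s.rejects;
    return s.rejects > p.max_rejects ? LmDecision::GiveUp
                                     : LmDecision::Reject;
}
```

Under these assumptions, a larger damping value pulls the step toward a short gradient-descent step, while a smaller one approaches the Gauss-Newton step, which is why well-predicted steps reduce it.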

@daleroberts daleroberts requested a review from umma-zannat April 21, 2026 03:56
@daleroberts daleroberts force-pushed the dr-344 branch 2 times, most recently from 28c9f2e to 3b6fdd9 Compare April 21, 2026 04:37
- Matrix class: packed/symmetric storage, attach-to-mmap, calloc allocator
- LM solver + Anderson acceleration CLI
- Static build fixes (glibc 2.34+ pthread)
- Matrix test suite
@daleroberts daleroberts merged commit 6008a08 into 1.4.0 Apr 22, 2026
16 checks passed
@daleroberts daleroberts deleted the dr-344 branch April 22, 2026 00:54