Assorted optimizations for sparse dense matrix multiplication#666

Merged
dkarrasch merged 10 commits into JuliaSparse:main from yuyichao:yyc/spmul, Apr 20, 2026
Conversation

@yuyichao (Contributor) commented Jan 1, 2026

The changes are mainly based on testing of the _spmul!(C::StridedMatrix, X::DenseMatrixUnion, A::SparseMatrixCSCUnion2, α::Number, β::Number) function, which is also the function used for my performance numbers below. I've then applied the same improvements to other similar functions, though the gains for some of them may not be as big (since not all of the functions touched are amenable to vectorization).

The main improvement is to hoist the matrix size and pointer accesses out of the loop, to work around JuliaLang/julia#60409. This change has as much as a 2x performance impact for complex numbers (it can be even more on armv8.3-a and above, i.e. including all Apple processors, by better triggering LLVM's complex-number multiplication pattern matching with llvm/llvm-project#173818).
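The hoisting pattern can be sketched as below. This is an illustrative example, not the PR's actual code, and `dense_times_sparse!` is a hypothetical name; it reads the dense row count and the sparse structural arrays (colptr/rowval/nzval) into locals once, before the hot loop, so the compiler does not re-load them on every iteration (cf. JuliaLang/julia#60409):

```julia
using SparseArrays

# Illustrative sketch (hypothetical function, not the PR's code):
# compute C = X * A with all size/pointer accesses hoisted out of the loop.
# getcolptr is an internal SparseArrays accessor; rowvals/nonzeros are public.
function dense_times_sparse!(C::Matrix, X::Matrix, A::SparseMatrixCSC)
    mX = size(X, 1)                      # hoisted: dense row count
    colptr = SparseArrays.getcolptr(A)   # hoisted: column pointer array
    rv = rowvals(A)                      # hoisted: stored row indices
    nz = nonzeros(A)                     # hoisted: stored values
    fill!(C, zero(eltype(C)))
    @inbounds for k in 1:size(A, 2)
        for p in colptr[k]:(colptr[k+1] - 1)
            j = rv[p]
            a = nz[p]
            @simd for i in 1:mX
                C[i, k] = muladd(X[i, j], a, C[i, k])
            end
        end
    end
    return C
end
```

With everything loop-invariant in locals, the innermost loop is a plain strided axpy over a dense column, which vectorizes well.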

Adding muladd is the second most important change; it mostly affects complex numbers and BigFloat, since for those the cost of the operations saved is more significant compared to the bare memory accesses.
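As a minimal sketch of the muladd change (illustrative; `axpy_muladd!` is a hypothetical name, not the PR's code): rewriting an accumulation `c += x * a` as `c = muladd(x, a, c)` gives the compiler license to emit fused multiply-adds, which saves arithmetic in the inner loop:

```julia
# Accumulate c[i] += x[i] * a using muladd instead of separate * and +.
# Base defines muladd for Complex and BigFloat as well, so the same
# rewrite applies to those element types.
function axpy_muladd!(c::AbstractVector, x::AbstractVector, a)
    @inbounds @simd for i in eachindex(c, x)
        c[i] = muladd(x[i], a, c[i])
    end
    return c
end
```

For Float64 the result is bit-identical or differs only in the last bit (fma rounds once); for Complex and BigFloat the saved operations are where the measurable win comes from.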

And then there are other minor tweaks that are mostly useful for small matrices (~20% impact for a ~10x10 matrix). These were included mainly because I was working on another optimization that may not work well for small cases; I applied these optimizations to the small-matrix cases so that I could fairly compare the effect of that other change. I'm not done testing the other change yet, but these small fixes are ready, so I've included them here.

@codecov (bot) commented Jan 1, 2026

Codecov Report

❌ Patch coverage is 95.68966% with 5 lines in your changes missing coverage. Please review.
✅ Project coverage is 84.42%. Comparing base (7f8c2c6) to head (939568b).
⚠️ Report is 2 commits behind head on main.

Files with missing lines | Patch % | Lines
src/linalg.jl            | 95.28%  | 5 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #666      +/-   ##
==========================================
+ Coverage   84.36%   84.42%   +0.05%     
==========================================
  Files          13       13              
  Lines        9346     9400      +54     
==========================================
+ Hits         7885     7936      +51     
- Misses       1461     1464       +3     


@ViralBShah (Member)

Is it possible to update tests to increase the coverage?

@yuyichao (Contributor, Author) commented Jan 5, 2026

I've added more direct tests for both the multiplications and the error checking.

@ViralBShah (Member) commented Jan 6, 2026

Let's give this a couple more days and merge.

@dkarrasch added the labels backport 1.13 (Change should be backported to release-1.13) and backport 1.12 (Change should be backported to release-1.12), Apr 20, 2026
@dkarrasch dkarrasch merged commit 99a103b into JuliaSparse:main Apr 20, 2026
12 checks passed
@dkarrasch dkarrasch mentioned this pull request Apr 20, 2026
dkarrasch added a commit that referenced this pull request Apr 20, 2026