Add CUSOLVERRF.jl integration for GPU-accelerated sparse LU factorization #651
base: main
Conversation
Remaining comments which cannot be posted as a review comment to avoid GitHub Rate Limit
JuliaFormatter
[JuliaFormatter] reported by reviewdog 🐶
LinearSolve.jl/test/gpu/cusolverrf.jl
Line 84 in 35073d8
[JuliaFormatter] reported by reviewdog 🐶
LinearSolve.jl/test/gpu/cusolverrf.jl
Line 88 in 35073d8
end
@info "CUDA not available, skipping CUSOLVERRF tests"
return
end
[JuliaFormatter] reported by reviewdog 🐶
n = 100
A = sprand(n, n, 0.1) + I
b = rand(n)
[JuliaFormatter] reported by reviewdog 🐶
# Test with CPU sparse matrix (should auto-convert to GPU)
@testset "CPU Sparse Matrix" begin
    prob = LinearProblem(A, b)
[JuliaFormatter] reported by reviewdog 🐶
    # Test with default symbolic (:RF)
    sol = solve(prob, CUSOLVERRFFactorization())
    @test norm(A * sol.u - b) / norm(b) < 1e-10
[JuliaFormatter] reported by reviewdog 🐶
    sol_klu = solve(prob, CUSOLVERRFFactorization(symbolic = :KLU))
    @test norm(A * sol_klu.u - b) / norm(b) < 1e-10
end
[JuliaFormatter] reported by reviewdog 🐶
@testset "GPU Sparse Matrix" begin
    A_gpu = CUDA.CUSPARSE.CuSparseMatrixCSR(A)
    b_gpu = CuArray(b)
[JuliaFormatter] reported by reviewdog 🐶
    prob_gpu = LinearProblem(A_gpu, b_gpu)
    sol_gpu = solve(prob_gpu, CUSOLVERRFFactorization())
[JuliaFormatter] reported by reviewdog 🐶
    res_gpu = A_gpu * sol_gpu.u - b_gpu
    @test norm(res_gpu) / norm(b_gpu) < 1e-10
end
[JuliaFormatter] reported by reviewdog 🐶
# Create a new matrix with same pattern but different values
A2 = A + 0.1 * sprand(n, n, 0.01)
b2 = rand(n)
[JuliaFormatter] reported by reviewdog 🐶
    sol2 = solve(prob2, CUSOLVERRFFactorization(reuse_symbolic = true))
    @test norm(A2 * sol2.u - b2) / norm(b2) < 1e-10
end
[JuliaFormatter] reported by reviewdog 🐶
Should these be types rather than symbols for type stability?
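For illustration only, a hedged sketch (not part of this PR) of what a type-based option could look like; the names `TypedCUSOLVERRFFactorization`, `RFSymbolic`, `KLUSymbolic`, and `rf_symbol` are hypothetical. Encoding the symbolic backend as a singleton type makes it part of the algorithm's type, so it is known at compile time:

```julia
# Hypothetical sketch, not from this PR: encode the symbolic backend as a
# singleton type instead of a Symbol field for type stability.
abstract type AbstractRFSymbolic end
struct RFSymbolic <: AbstractRFSymbolic end
struct KLUSymbolic <: AbstractRFSymbolic end

struct TypedCUSOLVERRFFactorization{S <: AbstractRFSymbolic}
    symbolic::S
    reuse_symbolic::Bool
end

# The extension could translate the type back to the Symbol that RFLU expects.
rf_symbol(::RFSymbolic) = :RF
rf_symbol(::KLUSymbolic) = :KLU
```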
…tion

This PR adds support for NVIDIA's cusolverRF sparse LU factorization library through a package extension. CUSOLVERRF provides high-performance GPU-accelerated factorization for sparse matrices.

Key features:
- New `CUSOLVERRFFactorization` algorithm with configurable symbolic factorization (RF or KLU)
- Automatic CPU-to-GPU conversion for convenience
- Support for multiple right-hand sides
- Reusable symbolic factorization for matrices with same sparsity pattern
- Adjoint solve support
- Comprehensive test suite

The implementation follows LinearSolve.jl's extension pattern, similar to the existing CUDSS integration.

🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>

Include CUSOLVERRF tests in the GPU test suite when the package is available. The tests are conditionally included to avoid failures when CUSOLVERRF.jl is not installed.

🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>

- Added CUSOLVERRF to recommended methods for sparse matrices
- Added CUSOLVERRF section in the full list of solvers
- Added CUSOLVERRF examples in GPU tutorial documentation
- Documented supported options and limitations

🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>

- Updated sparse matrices recommendation to include both CUDSS.jl and CUSOLVERRF.jl
- Clarified that CUDSS provides interface to NVIDIA's cuDSS library
- Maintained that both offer high performance for GPU-accelerated sparse LU factorization

🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>

- Clarified that CUDSS works through LUFactorization() when CUDSS.jl is loaded
- Explained that it automatically uses cuDSS for CuSparseMatrixCSR arrays
- Removed incorrect reference to a separate CUDSS factorization type

🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
Force-pushed from 8e3ce1a to d7f1f8c
CUSOLVERRF provides access to NVIDIA's cusolverRF library, which offers significant
performance improvements for sparse LU factorization on GPUs. It supports both
`:RF` (default) and `:KLU` symbolic factorization methods, and can reuse symbolic
[JuliaFormatter] reported by reviewdog 🐶
Suggested change:
CUSOLVERRF provides access to NVIDIA's cusolverRF library, which offers significant
performance improvements for sparse LU factorization on GPUs. It supports both
`:RF` (default) and `:KLU` symbolic factorization methods, and can reuse symbolic
end

function LinearSolve.init_cacheval(alg::LinearSolve.CUSOLVERRFFactorization,
        A::Union{CuSparseMatrixCSR{Float64, Int32}, SparseMatrixCSC{Float64, <:Integer}},
[JuliaFormatter] reported by reviewdog 🐶
Suggested change:
        A::Union{CuSparseMatrixCSR{Float64, Int32}, SparseMatrixCSC{Float64, <:Integer}},
    symbolic = alg.symbolic
    # Convert to CuSparseMatrixCSR if needed
    A_gpu = A isa CuSparseMatrixCSR ? A : CuSparseMatrixCSR(A)
    RFLU(A_gpu; nrhs=nrhs, symbolic=symbolic)
[JuliaFormatter] reported by reviewdog 🐶
Suggested change:
-    RFLU(A_gpu; nrhs=nrhs, symbolic=symbolic)
+    RFLU(A_gpu; nrhs = nrhs, symbolic = symbolic)
function SciMLBase.solve!(cache::LinearSolve.LinearCache, alg::LinearSolve.CUSOLVERRFFactorization; kwargs...)
    A = cache.A
[JuliaFormatter] reported by reviewdog 🐶
    else
        error("CUSOLVERRFFactorization only supports SparseMatrixCSC or CuSparseMatrixCSR matrices")
    end
[JuliaFormatter] reported by reviewdog 🐶
    if cacheval === nothing
        # Create new factorization
        nrhs = cache.b isa AbstractMatrix ? size(cache.b, 2) : 1
        fact = RFLU(A_gpu; nrhs=nrhs, symbolic=alg.symbolic)
[JuliaFormatter] reported by reviewdog 🐶
Suggested change:
-        fact = RFLU(A_gpu; nrhs=nrhs, symbolic=alg.symbolic)
+        fact = RFLU(A_gpu; nrhs = nrhs, symbolic = alg.symbolic)
    else
        # Create new factorization if pattern changed
        nrhs = cache.b isa AbstractMatrix ? size(cache.b, 2) : 1
        fact = RFLU(A_gpu; nrhs=nrhs, symbolic=alg.symbolic)
[JuliaFormatter] reported by reviewdog 🐶
Suggested change:
-        fact = RFLU(A_gpu; nrhs=nrhs, symbolic=alg.symbolic)
+        fact = RFLU(A_gpu; nrhs = nrhs, symbolic = alg.symbolic)
        cache.cacheval = fact
        cache.isfresh = false
    end
[JuliaFormatter] reported by reviewdog 🐶
    end

    F = @get_cacheval(cache, :CUSOLVERRFFactorization)
[JuliaFormatter] reported by reviewdog 🐶
    # Ensure b and u are on GPU
    b_gpu = cache.b isa CUDA.CuArray ? cache.b : CUDA.CuArray(cache.b)
    u_gpu = cache.u isa CUDA.CuArray ? cache.u : CUDA.CuArray(cache.u)
[JuliaFormatter] reported by reviewdog 🐶
    # Solve
    copyto!(u_gpu, b_gpu)
    ldiv!(F, u_gpu)
[JuliaFormatter] reported by reviewdog 🐶
    SciMLBase.build_linear_solution(alg, cache.u, nothing, cache; retcode = ReturnCode.Success)
[JuliaFormatter] reported by reviewdog 🐶
Suggested change:
-    SciMLBase.build_linear_solution(alg, cache.u, nothing, cache; retcode = ReturnCode.Success)
+    SciMLBase.build_linear_solution(
+        alg, cache.u, nothing, cache; retcode = ReturnCode.Success)
end
[JuliaFormatter] reported by reviewdog 🐶
Suggested change:
end
A_f32 = Float32.(A)
b_f32 = Float32.(b)
prob_f32 = LinearProblem(A_f32, b_f32)
[JuliaFormatter] reported by reviewdog 🐶
# This should error since CUSOLVERRF only supports Float64
@test_throws Exception solve(prob_f32, CUSOLVERRFFactorization())
end
end
[JuliaFormatter] reported by reviewdog 🐶
Suggested change:
end
Summary
This PR adds support for NVIDIA's cusolverRF sparse LU factorization library through a package extension, providing high-performance GPU-accelerated solving for sparse linear systems.
Motivation
CUSOLVERRF.jl provides access to NVIDIA's cusolverRF library, which offers significant performance improvements for sparse LU factorization on GPUs. This integration makes it accessible through LinearSolve.jl's unified interface.
Key Features
- `CUSOLVERRFFactorization` algorithm with configurable options:
  - `symbolic`: Choose between `:RF` (default) or `:KLU` for symbolic factorization
  - `reuse_symbolic`: Reuse symbolic factorization for matrices with the same sparsity pattern
Implementation Details
The implementation follows LinearSolve.jl's extension pattern (a condensed sketch follows this list):
- `ext/LinearSolveCUSOLVERRFExt.jl`: the package extension
- `src/factorization.jl` and `src/LinearSolve.jl`: the `CUSOLVERRFFactorization` algorithm definition and export
- `Project.toml`: the weak-dependency and extension entries
- `test/gpu/cusolverrf.jl`: the test suite
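For orientation, a condensed sketch of how the extension fits together, pieced from the diff fragments quoted in the review comments above. The module name ending in `Sketch` and the helpers `_build_rflu` and `_solve_with_rflu!` are illustrative, not the actual names; the real extension overloads `LinearSolve.init_cacheval` and `SciMLBase.solve!` and also handles refactorization and cache reuse:

```julia
# Condensed sketch only; see the actual ext/LinearSolveCUSOLVERRFExt.jl for details.
module LinearSolveCUSOLVERRFExtSketch

using LinearAlgebra, SparseArrays
using CUDA, CUDA.CUSPARSE
using CUSOLVERRF: RFLU

# Factorization setup: move a CPU SparseMatrixCSC to the GPU if needed and
# build the RFLU object with the requested symbolic backend (:RF or :KLU).
function _build_rflu(A, b, symbolic::Symbol)
    A_gpu = A isa CuSparseMatrixCSR ? A : CuSparseMatrixCSR(A)
    nrhs = b isa AbstractMatrix ? size(b, 2) : 1
    return RFLU(A_gpu; nrhs = nrhs, symbolic = symbolic)
end

# Solve step: ensure b and u live on the GPU, then apply the cached
# factorization in place via ldiv!.
function _solve_with_rflu!(F, u, b)
    b_gpu = b isa CUDA.CuArray ? b : CUDA.CuArray(b)
    u_gpu = u isa CUDA.CuArray ? u : CUDA.CuArray(u)
    copyto!(u_gpu, b_gpu)
    ldiv!(F, u_gpu)
    return u_gpu
end

end # module
```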
Usage Example
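A hedged usage sketch that mirrors the test suite above (assumes a CUDA-capable GPU and that CUSOLVERRF.jl is installed and loaded alongside LinearSolve.jl):

```julia
using LinearSolve, CUSOLVERRF, CUDA, SparseArrays, LinearAlgebra

n = 100
A = sprand(n, n, 0.1) + I          # CPU sparse matrix, auto-converted to the GPU
b = rand(n)
prob = LinearProblem(A, b)

sol = solve(prob, CUSOLVERRFFactorization())                     # default :RF symbolic
sol_klu = solve(prob, CUSOLVERRFFactorization(symbolic = :KLU))  # KLU symbolic

# GPU arrays can be passed directly as well
A_gpu = CUDA.CUSPARSE.CuSparseMatrixCSR(A)
b_gpu = CuArray(b)
sol_gpu = solve(LinearProblem(A_gpu, b_gpu), CUSOLVERRFFactorization())
```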
Limitations
- Currently limited to `Float64` element types with `Int32` indices (a CUSOLVERRF limitation)
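A small workaround sketch for the `Float64`-only restriction (illustrative only; assumes a CUDA-capable GPU with CUSOLVERRF.jl loaded): promote single-precision data before solving.

```julia
using LinearSolve, CUSOLVERRF, SparseArrays, LinearAlgebra

A_f32 = Float32.(sprand(100, 100, 0.1) + I)
b_f32 = rand(Float32, 100)

# Solving in Float32 throws (see the test excerpt above); promoting to Float64 works.
prob64 = LinearProblem(Float64.(A_f32), Float64.(b_f32))
sol = solve(prob64, CUSOLVERRFFactorization())
```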
Testing
Tests have been added to the GPU test suite and can be run with appropriate hardware.
🤖 Generated with Claude Code