Commit c1945c4

shijiashuai committed
docs: add bilingual README and update configs
1 parent 90a64a3 commit c1945c4

3 files changed

Lines changed: 79 additions & 5 deletions

README.en.md

Lines changed: 71 additions & 0 deletions
@@ -0,0 +1,71 @@
+# GPU SpMV (Sparse Matrix-Vector Multiplication)
+
+[![CI](https://github.com/LessUp/gpu-spmv/actions/workflows/ci.yml/badge.svg)](https://github.com/LessUp/gpu-spmv/actions/workflows/ci.yml)
+
+[简体中文](README.md) | English
+
+High-performance CUDA sparse matrix-vector multiplication library supporting CSR and ELL formats with multiple load-balancing optimization strategies.
+
+## Features
+
+- **Multiple Sparse Formats** — CSR and ELL (ELLPACK)
+- **Optimized CUDA Kernels** — Scalar CSR, Vector CSR (warp-per-row), Merge Path, ELL
+- **Auto Kernel Selection** — Based on matrix characteristics
+- **Performance Metrics** — Bandwidth utilization, GFLOPS, benchmark framework
+- **Engineering Quality** — RAII (`CudaBuffer`, `CudaTimer`, `ScopedTexture`), semantic error codes, CMake Presets, CI
+
+## Quick Start
+
+```bash
+# Using CMake Presets
+cmake --preset release
+cmake --build --preset release
+
+# Run tests
+ctest --preset default
+
+# Run benchmarks
+./build/spmv_benchmark
+```
+
+## Requirements
+
+- CUDA Toolkit 11.0+, CMake 3.18+, C++17, GPU CC 7.0+
+
+## Usage
+
+```cpp
+#include "spmv/csr_matrix.h"
+#include "spmv/spmv.h"
+#include "spmv/cuda_buffer.h"
+
+using namespace spmv;
+
+CSRMatrix* csr = csr_create(0, 0, 0);
+csr_from_dense(csr, dense.data(), 3, 3);
+csr_to_gpu(csr);
+
+SpMVConfig config = spmv_auto_config(csr);
+SpMVResult result = spmv_csr(csr, d_x.get(), d_y.get(), &config, 3);
+```
+
+## Kernel Selection Strategy
+
+- **Short rows (avg_nnz < 4)**: Scalar CSR
+- **Uniform distribution (skewness < 10)**: Vector CSR
+- **Highly skewed (skewness >= 10)**: Merge Path
+
+## Project Structure
+
+```
+├── include/spmv/ # Headers (common, cuda_buffer, csr, ell, spmv, benchmark, pagerank)
+├── src/ # Source files
+├── tests/ # Property tests + unit tests
+├── benchmarks/ # Benchmark program
+├── CMakePresets.json # Build presets
+└── .github/workflows/ # CI
+```
+
+## License
+
+MIT License
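
The three rules under "Kernel Selection Strategy" in the README.en.md diff above can be read as a small decision function. The following is a minimal host-side sketch, not the library's implementation: `KernelChoice`, `RowStats`, `compute_row_stats`, and `choose_kernel` are hypothetical names, and the README does not define "skewness", so this sketch assumes it means the longest row's nonzero count divided by the mean nonzero count per row. Only the thresholds (4 and 10) and the rule order come from the README.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical names for illustration; not taken from the gpu-spmv headers.
enum class KernelChoice { ScalarCSR, VectorCSR, MergePath };

struct RowStats {
    double avg_nnz;   // mean number of nonzeros per row
    double skewness;  // ASSUMPTION: max row length / mean row length
};

// row_ptr is a standard CSR row-pointer array with num_rows + 1 entries.
inline RowStats compute_row_stats(const std::vector<int>& row_ptr) {
    assert(!row_ptr.empty() && "CSR row_ptr needs num_rows + 1 entries");
    RowStats stats{0.0, 0.0};
    const std::size_t rows = row_ptr.size() - 1;
    if (rows == 0) return stats;  // empty matrix: keep zeroed stats

    int max_nnz = 0;
    for (std::size_t r = 0; r < rows; ++r) {
        max_nnz = std::max(max_nnz, row_ptr[r + 1] - row_ptr[r]);
    }
    stats.avg_nnz = static_cast<double>(row_ptr[rows] - row_ptr[0]) / rows;
    stats.skewness = stats.avg_nnz > 0.0 ? max_nnz / stats.avg_nnz : 0.0;
    return stats;
}

// Applies the README's rules in the order they are listed.
inline KernelChoice choose_kernel(const RowStats& stats) {
    if (stats.avg_nnz < 4.0) return KernelChoice::ScalarCSR;   // short rows
    if (stats.skewness < 10.0) return KernelChoice::VectorCSR; // uniform rows
    return KernelChoice::MergePath;                            // highly skewed
}
```

Under these assumptions, a matrix with 8 nonzeros in every row would select Vector CSR, while a matrix with one 1000-entry row among many 5-entry rows would select Merge Path. The library's own `spmv_auto_config` may compute its statistics differently.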

README.md

Lines changed: 7 additions & 4 deletions
@@ -1,6 +1,9 @@
 # GPU SpMV (Sparse Matrix-Vector Multiplication)
 
-[![CI](https://github.com/<OWNER>/gpu-spmv/actions/workflows/ci.yml/badge.svg)](https://github.com/<OWNER>/gpu-spmv/actions/workflows/ci.yml)
+[![CI](https://github.com/LessUp/gpu-spmv/actions/workflows/ci.yml/badge.svg)](https://github.com/LessUp/gpu-spmv/actions/workflows/ci.yml)
+[![Docs](https://img.shields.io/badge/Docs-GitHub%20Pages-blue?logo=github)](https://lessup.github.io/gpu-spmv/)
+
+简体中文 | [English](README.en.md)
 
 A high-performance CUDA-based sparse matrix-vector multiplication library supporting CSR and ELL formats, with multiple load-balancing optimization strategies.
 
@@ -228,11 +231,11 @@ std::string json = benchmark_to_json(result);
 
 The project documentation is published via GitHub Pages:
 
-> **https://\<OWNER\>.github.io/gpu-spmv/**
+> **https://lessup.github.io/gpu-spmv/**
 
 It includes:
-- [API Reference](https://OWNER.github.io/gpu-spmv/api): header interfaces, data structures, and function descriptions
-- [Performance Optimization](https://OWNER.github.io/gpu-spmv/performance): kernel selection strategy, bandwidth optimization, and benchmarks
+- [API Reference](https://lessup.github.io/gpu-spmv/api): header interfaces, data structures, and function descriptions
+- [Performance Optimization](https://lessup.github.io/gpu-spmv/performance): kernel selection strategy, bandwidth optimization, and benchmarks
 
 ## License
 
docs/index.md

Lines changed: 1 addition & 1 deletion
@@ -90,4 +90,4 @@ csr_destroy(csr);
 
 ## License
 
-[MIT License](https://github.com/OWNER/gpu-spmv/blob/main/LICENSE)
+[MIT License](https://github.com/LessUp/gpu-spmv/blob/main/LICENSE)
