GitHub - gmarkall/cuda_cg: Conjugate Gradient solver written in CUDA

Branches Tags

Name		Name	Last commit message	Last commit date
Latest commit History 16 Commits
test		test
.gitignore		.gitignore
CHANGES		CHANGES
README		README
gpu_solve.cu		gpu_solve.cu
gpu_solve.h		gpu_solve.h

Repository files navigation

README for Conjugate Gradient GPU Solver - Graham Markall, 30/06/2009
=====================================================================

(Presentation referenced below is at http://www.doc.ic.ac.uk/~grm08/g_markall_fe_acceleration.ppt )

The included solver is not exactly the same as the one whose performance results are reported in the accompanying presentation - this version of the solver has been altered to make the code more straightforward, at the expense of some efficiency (more kernel launches are required).

The file gpu_solve.cu implements a jacobi-preconditioner and conjugate gradient solver using the Compressed Sparse Row matrix format. The CSR matrix is expected to contain FORTRAN indices in the row_ptr and col_idx - that is, the indices are expected to begin at 1, and not 0. The function csr_c_to_fortran function in test/sparse_conversions.h gives an example of how to convert to the 1-based indexing if you have 0-based indexing.

The solver should work on any G200-series GPU (280GTX, Tesla C1060 etc...).

To make use of the solver in your code use:

In FORTRAN:

  call gpucg_solve(row_ptr, size(row_ptr), col_idx, size(col_idx), val, size(val), rhs, size(rhs), x)

This assumes that a CSR matrix is is stored in the arrays row_ptr, col_idx and val. The known right-hand side vector is stored in rhs, and x is a vector in which the solution to the system will be placed. row_ptr and col_idx are arrays of integers, val, rhs and x are arrays of doubles.


In C:

  gpucg_solve_(*row_ptr, &size_row_ptr, *col_idx, &size_col_idx, *val, &size_val, *rhs, &size_rhs, *x)

This assumes that *row_ptr, *col_idx and *val contain pointers to the arrays representing the CSR matrix. *rhs is a pointer to the array storing the known right-hand side vector. *x is a pointer to an area of memory where the solution may be stored. These are all of type double*. size_row_ptr etc. all contain the the length of the respective array, and are integers.

To compile the solver, use:

  nvcc -arch sm_XX -c gpu_solve.cu -o gpu_solve.o

Then link this with your code in the usual way.

Testing
=======

There is a test for the CG solver in the test directory. You may need to adjust the Makefile to suit your system. The test solves a small system and checks that the solution is sufficiently close to a reference solution. To run the tests:

cd test
make
./test


Problems/feedback
=================

If you have any issues using this code please contact me at [email protected] and I will attempt to be of assistance.