Often, in applied mathematical and computer models there is a need to solve systems of linear algebraic equations (
SLAE ). As a rule, in practice, the matrix in such SLAEs is sparse. For example, sparse matrices are found in models with finite difference or finite element methods for solving differential equations. Highly sparse matrices of large dimension arise in the simulation of material and information flows in large technological networks (gas supply and gas distribution systems, sewer and heat supply systems, power grids and computer networks, etc.). Common to technological networks is the representation of their models in the form of a graph, in which the incidence matrix is almost always highly sparse.
The article will describe how your humble servant significantly increased the efficiency of the computer model for calculating unsteady gas flows in large gas supply systems of arbitrary configuration, thanks to the use of the ManagedCuda library and nVidia CUDA 7.0. However, the presentation will be conducted without reference to a specific subject area.
Formulation of the problem
The classical problem of solving SLAEs is considered:
Ax = bHere the matrix
A of size
n x
n ,
x is the vector of unknowns of size
n ,
b is the vector of known free members of size
n .
The author of the article is engaged in the development of a specialized software and computing complex (
PVK ) for modeling and optimization of non-stationary gas flows in gas supply systems. In the problems I solve,
A is a positive definite Jacobian with a diagonal predominance of a certain system of nonlinear equations.
A is obtained as a result of matrix transformations of the matrix of incidents of the graph of the gas supply system and other matrices. In my practical calculations,
A is usually filled by 6-10%, the rest is zero. The size of it depends on the size of a particular gas supply system and varies from 10 to 1000.
')
Our PVC is developed for .NET 4.0 in C #. The computational module and all mathematics are also being developed in C #. Initially, to solve the SLAU, I wrote my own, homegrown, library that does not use the technology of sparse matrices. To solve SLAE applied the
LU decomposition method. And for the time being, everything suited me, until I began to work on the task of optimizing non-stationary gas flow regimes, where it is necessary to perform a large number of searches of the control parameter values using dynamic programming and, accordingly, solve SLAEs many times. The standard Visual Studio profiler showed that during the program execution, about 40% of the costs are due to the SLAE solution.
It was at that moment that I realized that it was time to change something.
Mathematical Library Math.Net Numerics
After analyzing the existing mathematical libraries under .Net, I decided to dwell on the Math.Net Numerics library. You can familiarize yourself with it
here .
I was interested in its key features:
- Implemented sparse matrices and vectors technology;
- There are built-in all the necessary vector-matrix operations;
- There are embedded solvers;
- Intuitive API.
Let me give you a listing with an example of the solution of SLAH using Math.Net Numerics:
SparseMatrix matrix = new SparseMatrix(n); double[] rightSide = new double[n];
As you can see, everything looks very elegant, but almost immediately I was disappointed - the built-in solver Math.Net Numerics worked much slower than mine. This did not suit me at all, however, the claims to vector-matrix operations implemented in the library are not so significant. Therefore, I did not completely abandon Math.Net Numerics, leaving the code where we work with vectors and matrices.
But here I remembered the very successful experience of my postgraduate colleagues, who used graphic processors to solve problems of underground hydrodynamics. I have a laptop with a nVidia GeeForce GT 540M graphics card with 2 GB of memory and support for CUDA technology. It was decided to try this technology in practice, which I no longer regret.
ManagedCuda Library
There is a large amount of material on CUDA technology on this site, so a curious reader will easily find it. I will dwell on how I solved the task.
I was interested in the
cuSPARSE library. When I first met her, I came across problems:
- Direct work with the library is possible only through C / C ++. Of course, the problem will be solved if you use a quality wrapper, who really did not want to write. According to the search results, a CUDA wrapper for .Net called Cudafy was found .
- cuSPARCE allows you to solve SLAEs with triangular matrices, and in my case the matrices are not. However, cuSPARSE contains matrix factorization methods that bring them to a triangular form ( LU factorization method, Cholesky decomposition). However, there were no corresponding methods in the Cudafy API, so Cudafy had to be abandoned.
After continuing the search, I discovered the
ManagedCuda library, which is a high-level wrapper for the CUDA API. I will not list all its capabilities - they can be found on the official website, but I’ll dwell on how you can, using Math.Net Numerics and ManagedCuda, elegantly and effectively solve SLAE.
The idea is to use SparseMatrix from Math.Net Numerics to form and store a sparse matrix in
CSR format, which is taken as input to cuSPARSE and ManagedCuda. The following is a listing of the relevant program:
SparseMatrix matrix = new SparseMatrix(n); double[] rightSide = new double[n]; // //... var storage = matrix.Storage as SparseCompressedRowMatrixStorage<double>; // // CSR var nonZeroValues = storage.EnumerateNonZero().ToArray(); // double[] x = new double[matrix.RowCount]; // CudaSolveSparse sp = new CudaSolveSparse(); // ManagedCuda CudaSparseMatrixDescriptor matrixDescriptor = new CudaSparseMatrixDescriptor(); // double tolerance = 0.00001; // . sp.CsrlsvluHost(matrix.RowCount, nonZeroValues.Length, matrixDescriptor, nonZeroValues, storage.RowPointers, storage.ColumnIndices, rightSide, tolerance, 0, x); // LU
Computational experiment: analysis of time spent on the solution of SLAE by various technologies
In order not to bore the reader with dry text and program listings, consider the results of calculations of SLAEs of dimensions from 50 to 500 using various technologies:
- Homegrown solver, written by the author. Let's call it Mani.Net.
- Standard solver library Math.Net Numerics.
- LU decomposition method of the ManagedCuda library.
Matrix
A is filled at 10% randomly.

The figure clearly demonstrates why I had to abandon Math.Net Numerics to solve SLAE.
Note, with the dimension of the matrix equal to 500, my own solver (Mani.Net) took 1170 ms, Math.Net Numerics — 12968 ms, ManagedCuda — 70 ms.
Conclusion
We should expect comments in the comments, saying that not all machines have nVidia GPUs with CUDA support. Indeed it is. Therefore, our application is configured in two compilation configurations: with our own SLAE solution library and with ManagedCuda.
As for Mani.Net, I note that this is not an advertisement for my own library. It is impossible to find it anywhere and I will not pass it on to anyone. No, I'm not greedy. I am ashamed of the code.
Thanks for reading the article! I will be glad to know about your opinions and comments in the comments.