CUDA 5.5 was released a few days ago. Unfortunately, most of the new features and conveniences apply only to owners of video cards with Compute Capability 3.5.
But there is something for all users of the major Linux distributions: package repositories are now available!
For the complete list, see the Release Notes [pdf]. Below is a list of what seemed most interesting to me.
MPS (Multi-Process Service) allows several processes of an MPI program to transparently share the same GPU, provided it has Compute Capability 3.5
Support for ARMv7 has been added (but not everything has been fully tested, see the Release Notes)
The CUFFT, CUPTI, CURAND, CUSPARSE and Thrust libraries have been updated.
On Mac OS, Clang is used as the compiler.
You can now debug programs with cuda-gdb on a single video card if it has Compute Capability 3.5 (previously two cards were required)
Remote debugging is supported
MPI applications can now be debugged
Visual Profiler supports profiling of programs that use dynamic parallelism (those in which kernels launch other kernels)
All of the debugging, compiling and profiling features above are available in the updated Nsight Eclipse Edition
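As a rough illustration of the dynamic parallelism mentioned in the list above, here is a minimal sketch of a kernel that launches another kernel. The kernel names, sizes and the `run_demo` helper are made up for this example and do not come from the Release Notes; it assumes a device with Compute Capability 3.5 or higher.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

#define CHILD_THREADS 4   // illustrative size, not from the Release Notes
#define PARENT_BLOCKS 2   // illustrative size, not from the Release Notes

// Child kernel: each thread records which parent block launched it.
__global__ void child(int *out, int parent_block) {
    out[parent_block * CHILD_THREADS + threadIdx.x] = parent_block;
}

// Parent kernel: one thread per block launches a child grid from the device.
__global__ void parent(int *out) {
    if (threadIdx.x == 0) {
        child<<<1, CHILD_THREADS>>>(out, blockIdx.x);
    }
}

// Hypothetical helper: runs the demo and returns true if every slot
// holds the index of the parent block that launched its child grid.
bool run_demo() {
    const int n = PARENT_BLOCKS * CHILD_THREADS;
    int host[n];
    int *dev = NULL;
    if (cudaMalloc(&dev, n * sizeof(int)) != cudaSuccess) return false;

    parent<<<PARENT_BLOCKS, 32>>>(dev);
    // The parent grid counts as finished only after its child grids finish,
    // so one host-side synchronization covers both levels.
    bool ok = cudaGetLastError() == cudaSuccess &&
              cudaDeviceSynchronize() == cudaSuccess &&
              cudaMemcpy(host, dev, n * sizeof(int),
                         cudaMemcpyDeviceToHost) == cudaSuccess;
    cudaFree(dev);

    for (int i = 0; ok && i < n; ++i) {
        ok = (host[i] == i / CHILD_THREADS);
    }
    return ok;
}

int main() {
    bool ok = run_demo();
    std::printf("dynamic parallelism demo: %s\n", ok ? "ok" : "failed");
    return ok ? 0 : 1;
}
```

Device-side launches need relocatable device code and the device runtime library, so with the toolkit discussed here the build line would look something like `nvcc -arch=sm_35 -rdc=true demo.cu -lcudadevrt` (the file name is arbitrary).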
At the moment, according to Wikipedia, CC 3.5 is available on the Tesla K20X, Tesla K20, GeForce GTX TITAN and GTX 780 (all based on the GK110). The Tesla cards are designed exclusively for computation (they have no video outputs), while the last two are gaming cards.
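If you are not sure which Compute Capability your own card reports, it can be queried through the CUDA runtime API. The following is a small sketch; the `has_cc35` helper is a name invented for this example, while `cudaGetDeviceCount` and `cudaGetDeviceProperties` are standard runtime calls.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: true if compute capability major.minor is at least 3.5.
static bool has_cc35(int major, int minor) {
    return major > 3 || (major == 3 && minor >= 5);
}

int main() {
    int count = 0;
    cudaError_t err = cudaGetDeviceCount(&count);
    if (err != cudaSuccess) {
        std::fprintf(stderr, "cudaGetDeviceCount failed: %s\n",
                     cudaGetErrorString(err));
        return 1;
    }

    // Print the name and compute capability of every visible device.
    for (int dev = 0; dev < count; ++dev) {
        cudaDeviceProp prop;
        if (cudaGetDeviceProperties(&prop, dev) != cudaSuccess) continue;
        std::printf("Device %d: %s, CC %d.%d%s\n", dev, prop.name,
                    prop.major, prop.minor,
                    has_cc35(prop.major, prop.minor) ? " (3.5 or newer)" : "");
    }
    return 0;
}
```

The `deviceQuery` sample shipped with the toolkit prints the same information along with many other device properties.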