Hi, Habr. I thought someone could use parallel sorting with a relatively simple implementation and high performance on the CUDA platform. This is the bubble sorting method. Under the cat, an explanation is given and a code that may be useful (or maybe not ...). At once I will say that the presented program is a benchmark in comparison of performance on the GPU and CPU. If you do not mind, reader, then compile it, please, and put the calculation results in the comments of this article. This is not for science. Just interesting =)
I will remind a little
The essence of the bubble sorting method is pair-wise comparison of elements and replacing them with places if a certain condition is met, depending on the sorting direction. The following shows the iterations of the algorithm in single-threaded mode, when it sorts an increasing array of integers in descending order (each line is the state of the array at the moment) - this is Table 1.
Table 2 shows the iterations of bubble sorting in multithreaded mode, which is extremely simple: while one element moves to the side, after 2 movements the next element can also be sorted by the bubble method and starts moving, and after 2 more one and so on Thus, the entire array starts to “float up”, which significantly reduces the time for calculation.
Profit
Already by the number of expended actions, you can see that parallel sorting is faster. But this is not always: with small sizes of arrays, single-threaded mode runs faster due to the lack of problems with data transfer. I spent here a little test calculations, and here is their result:

The abscissa axis is the number of elements, the ordinate is the execution time in seconds.
Implementation
I present the implementation code in CUDA , which contains the code that performs sorting on DEPLOYMENT on CUDA and CPU (in single-threaded mode). In fact, this is a benchmark for your company on how smart sorting works on the video card and on the process. Part of this program is the sorting function on the video card on the CUDA platform, which you can use to your health in your masterpieces by elementary copy'n'past. The code is presented below the tables.
How to compile it ???
Well, for this you need to download drivers for CUDA from the Nvidia website, install them, download the Nvidia Toolkit, install and the Nvidia SDK, also install. Fortunately, now everything is already in one package. How to install CUDA I will write if there is a demand, and so just wish good luck if you are a beginner.
When everything is ready and nvidia CUDA works on your computer, then it is enough just to compile the main.cu file (or whatever you called it) and run the program, get the result and put it here.
Wat and all on now, thanks for reading. See you later.
PS I would be super grateful if you tell me how to speed up the algorithm =)
Table 1: Bubble Sorting in Single-threaded Mode
0 | one | 2 | 3 | four | five | 6 | 7 | eight | 9 | ten | eleven | 12 | 13 | 14 |
one | 0 | 2 | 3 | four | five | 6 | 7 | eight | 9 | ten | eleven | 12 | 13 | 14 |
one | 2 | 0 | 3 | four | five | 6 | 7 | eight | 9 | ten | eleven | 12 | 13 | 14 |
one | 2 | 3 | 0 | four | five | 6 | 7 | eight | 9 | ten | eleven | 12 | 13 | 14 |
one | 2 | 3 | four | 0 | five | 6 | 7 | eight | 9 | ten | eleven | 12 | 13 | 14 |
one | 2 | 3 | four | five | 0 | 6 | 7 | eight | 9 | ten | eleven | 12 | 13 | 14 |
one | 2 | 3 | four | five | 6 | 0 | 7 | eight | 9 | ten | eleven | 12 | 13 | 14 |
one | 2 | 3 | four | five | 6 | 7 | 0 | eight | 9 | ten | eleven | 12 | 13 | 14 |
one | 2 | 3 | four | five | 6 | 7 | eight | 0 | 9 | ten | eleven | 12 | 13 | 14 |
one | 2 | 3 | four | five | 6 | 7 | eight | 9 | 0 | ten | eleven | 12 | 13 | 14 |
one | 2 | 3 | four | five | 6 | 7 | eight | 9 | ten | 0 | eleven | 12 | 13 | 14 |
one | 2 | 3 | four | five | 6 | 7 | eight | 9 | ten | eleven | 0 | 12 | 13 | 14 |
one | 2 | 3 | four | five | 6 | 7 | eight | 9 | ten | eleven | 12 | 0 | 13 | 14 |
one | 2 | 3 | four | five | 6 | 7 | eight | 9 | ten | eleven | 12 | 13 | 0 |
one | 2 | 3 | four | five | 6 | 7 | eight | 9 | ten | eleven | 12 | 13 | 14 | 0 |
one | 2 | 3 | four | five | 6 | 7 | eight | 9 | ten | eleven | 12 | 13 | 14 |
2 | one | 3 | four | five | 6 | 7 | eight | 9 | ten | eleven | 12 | 13 | 14 | 0 |
2 | 3 | one | four | five | 6 | 7 | eight | 9 | ten | eleven | 12 | 13 | 14 | 0 |
2 | 3 | four | one | five | 6 | 7 | eight | 9 | ten | eleven | 12 | 13 | 14 | 0 |
2 | 3 | four | five | one | 6 | 7 | eight | 9 | ten | eleven | 12 | 13 | 14 | 0 |
2 | 3 | four | five | 6 | one | 7 | eight | 9 | ten | eleven | 12 | 13 | 14 | 0 |
2 | 3 | four | five | 6 | 7 | one | eight | 9 | ten | eleven | 12 | 13 | 14 | 0 |
2 | 3 | four | five | 6 | 7 | eight | one | 9 | ten | eleven | 12 | 13 | 14 | 0 |
2 | 3 | four | five | 6 | 7 | eight | 9 | one | ten | eleven | 12 | 13 | 14 | 0 |
2 | 3 | four | five | 6 | 7 | eight | 9 | ten | one | eleven | 12 | 13 | 14 | 0 |
2 | 3 | four | five | 6 | 7 | eight | 9 | ten | eleven | one | 12 | 13 | 14 | 0 |
2 | 3 | four | five | 6 | 7 | eight | 9 | ten | eleven | 12 | one | 13 | 14 | 0 |
2 | 3 | four | five | 6 | 7 | eight | 9 | ten | eleven | 12 | 13 | one | 14 | 0 |
2 | 3 | four | five | 6 | 7 | eight | 9 | ten | eleven | 12 | 13 | 14 | one |
2 | 3 | four | five | 6 | 7 | eight | 9 | ten | eleven | 12 | 13 | 14 |
3 | 2 | four | five | 6 | 7 | eight | 9 | ten | eleven | 12 | 13 | 14 | one | 0 |
3 | four | 2 | five | 6 | 7 | eight | 9 | ten | eleven | 12 | 13 | 14 | one | 0 |
3 | four | five | 2 | 6 | 7 | eight | 9 | ten | eleven | 12 | 13 | 14 | one | 0 |
3 | four | five | 6 | 2 | 7 | eight | 9 | ten | eleven | 12 | 13 | 14 | one | 0 |
3 | four | five | 6 | 7 | 2 | eight | 9 | ten | eleven | 12 | 13 | 14 | one | 0 |
3 | four | five | 6 | 7 | eight | 2 | 9 | ten | eleven | 12 | 13 | 14 | one | 0 |
3 | four | five | 6 | 7 | eight | 9 | 2 | ten | eleven | 12 | 13 | 14 | one | 0 |
3 | four | five | 6 | 7 | eight | 9 | ten | 2 | eleven | 12 | 13 | 14 | one | 0 |
3 | four | five | 6 | 7 | eight | 9 | ten | eleven | 2 | 12 | 13 | 14 | one | 0 |
3 | four | five | 6 | 7 | eight | 9 | ten | eleven | 12 | 2 | 13 | 14 | one | 0 |
3 | four | five | 6 | 7 | eight | 9 | ten | eleven | 12 | 13 | 2 | 14 | one | 0 |
3 | four | five | 6 | 7 | eight | 9 | ten | eleven | 12 | 13 | 14 | 2 | one | 0 |
3 | four | five | 6 | 7 | eight | 9 | ten | eleven | 12 | 13 | 14 |
four | 3 | five | 6 | 7 | eight | 9 | ten | eleven | 12 | 13 | 14 | 2 | one | 0 |
four | five | 3 | 6 | 7 | eight | 9 | ten | eleven | 12 | 13 | 14 | 2 | one | 0 |
four | five | 6 | 3 | 7 | eight | 9 | ten | eleven | 12 | 13 | 14 | 2 | one | 0 |
four | five | 6 | 7 | 3 | eight | 9 | ten | eleven | 12 | 13 | 14 | 2 | one | 0 |
four | five | 6 | 7 | eight | 3 | 9 | ten | eleven | 12 | 13 | 14 | 2 | one | 0 |
four | five | 6 | 7 | eight | 9 | 3 | ten | eleven | 12 | 13 | 14 | 2 | one | 0 |
four | five | 6 | 7 | eight | 9 | ten | 3 | eleven | 12 | 13 | 14 | 2 | one | 0 |
four | five | 6 | 7 | eight | 9 | ten | eleven | 3 | 12 | 13 | 14 | 2 | one | 0 |
four | five | 6 | 7 | eight | 9 | ten | eleven | 12 | 3 | 13 | 14 | 2 | one | 0 |
four | five | 6 | 7 | eight | 9 | ten | eleven | 12 | 13 | 3 | 14 | 2 | one | 0 |
four | five | 6 | 7 | eight | 9 | ten | eleven | 12 | 13 | 14 | 3 | 2 | one | 0 |
four | five | 6 | 7 | eight | 9 | ten | eleven | 12 | 13 | 14 | one | 0 |
five | four | 6 | 7 | eight | 9 | ten | eleven | 12 | 13 | 14 | 3 | 2 | one | 0 |
five | 6 | four | 7 | eight | 9 | ten | eleven | 12 | 13 | 14 | 3 | 2 | one | 0 |
five | 6 | 7 | four | eight | 9 | ten | eleven | 12 | 13 | 14 | 3 | 2 | one | 0 |
five | 6 | 7 | eight | four | 9 | ten | eleven | 12 | 13 | 14 | 3 | 2 | one | 0 |
five | 6 | 7 | eight | 9 | four | ten | eleven | 12 | 13 | 14 | 3 | 2 | one | 0 |
five | 6 | 7 | eight | 9 | ten | four | eleven | 12 | 13 | 14 | 3 | 2 | one | 0 |
five | 6 | 7 | eight | 9 | ten | eleven | four | 12 | 13 | 14 | 3 | 2 | one | 0 |
five | 6 | 7 | eight | 9 | ten | eleven | 12 | four | 13 | 14 | 3 | 2 | one | 0 |
five | 6 | 7 | eight | 9 | ten | eleven | 12 | 13 | four | 14 | 3 | 2 | one | 0 |
five | 6 | 7 | eight | 9 | ten | eleven | 12 | 13 | 14 | four | 3 | 2 | one | 0 |
five | 6 | 7 | eight | 9 | ten | eleven | 12 | 13 | 14 | 2 | one | 0 |
6 | five | 7 | eight | 9 | ten | eleven | 12 | 13 | 14 | four | 3 | 2 | one | 0 |
6 | 7 | five | eight | 9 | ten | eleven | 12 | 13 | 14 | four | 3 | 2 | one | 0 |
6 | 7 | eight | five | 9 | ten | eleven | 12 | 13 | 14 | four | 3 | 2 | one | 0 |
6 | 7 | eight | 9 | five | ten | eleven | 12 | 13 | 14 | four | 3 | 2 | one | 0 |
6 | 7 | eight | 9 | ten | five | eleven | 12 | 13 | 14 | four | 3 | 2 | one | 0 |
6 | 7 | eight | 9 | ten | eleven | five | 12 | 13 | 14 | four | 3 | 2 | one | 0 |
6 | 7 | eight | 9 | ten | eleven | 12 | five | 13 | 14 | four | 3 | 2 | one | 0 |
6 | 7 | eight | 9 | ten | eleven | 12 | 13 | five | 14 | four | 3 | 2 | one | 0 |
6 | 7 | eight | 9 | ten | eleven | 12 | 13 | 14 | five | four | 3 | 2 | one | 0 |
6 | 7 | eight | 9 | ten | eleven | 12 | 13 | 14 | 3 | 2 | one | 0 |
7 | 6 | eight | 9 | ten | eleven | 12 | 13 | 14 | five | four | 3 | 2 | one | 0 |
7 | eight | 6 | 9 | ten | eleven | 12 | 13 | 14 | five | four | 3 | 2 | one | 0 |
7 | eight | 9 | 6 | ten | eleven | 12 | 13 | 14 | five | four | 3 | 2 | one | 0 |
7 | eight | 9 | ten | 6 | eleven | 12 | 13 | 14 | five | four | 3 | 2 | one | 0 |
7 | eight | 9 | ten | eleven | 6 | 12 | 13 | 14 | five | four | 3 | 2 | one | 0 |
7 | eight | 9 | ten | eleven | 12 | 6 | 13 | 14 | five | four | 3 | 2 | one | 0 |
7 | eight | 9 | ten | eleven | 12 | 13 | 6 | 14 | five | four | 3 | 2 | one | 0 |
7 | eight | 9 | ten | eleven | 12 | 13 | 14 | 6 | five | four | 3 | 2 | one | 0 |
7 | eight | 9 | ten | eleven | 12 | 13 | 14 | four | 3 | 2 | one | 0 |
eight | 7 | 9 | ten | eleven | 12 | 13 | 14 | 6 | five | four | 3 | 2 | one | 0 |
eight | 9 | 7 | ten | eleven | 12 | 13 | 14 | 6 | five | four | 3 | 2 | one | 0 |
eight | 9 | ten | 7 | eleven | 12 | 13 | 14 | 6 | five | four | 3 | 2 | one | 0 |
eight | 9 | ten | eleven | 7 | 12 | 13 | 14 | 6 | five | four | 3 | 2 | one | 0 |
eight | 9 | ten | eleven | 12 | 7 | 13 | 14 | 6 | five | four | 3 | 2 | one | 0 |
eight | 9 | ten | eleven | 12 | 13 | 7 | 14 | 6 | five | four | 3 | 2 | one | 0 |
eight | 9 | ten | eleven | 12 | 13 | 14 | 7 | 6 | five | four | 3 | 2 | one | 0 |
eight | 9 | ten | eleven | 12 | 13 | 14 | five | four | 3 | 2 | one | 0 |
9 | eight | ten | eleven | 12 | 13 | 14 | 7 | 6 | five | four | 3 | 2 | one | 0 |
9 | ten | eight | eleven | 12 | 13 | 14 | 7 | 6 | five | four | 3 | 2 | one | 0 |
9 | ten | eleven | eight | 12 | 13 | 14 | 7 | 6 | five | four | 3 | 2 | one | 0 |
9 | ten | eleven | 12 | eight | 13 | 14 | 7 | 6 | five | four | 3 | 2 | one | 0 |
9 | ten | eleven | 12 | 13 | eight | 14 | 7 | 6 | five | four | 3 | 2 | one | 0 |
9 | ten | eleven | 12 | 13 | 14 | eight | 7 | 6 | five | four | 3 | 2 | one | 0 |
9 | ten | eleven | 12 | 13 | 14 | 6 | five | four | 3 | 2 | one | 0 |
ten | 9 | eleven | 12 | 13 | 14 | eight | 7 | 6 | five | four | 3 | 2 | one | 0 |
ten | eleven | 9 | 12 | 13 | 14 | eight | 7 | 6 | five | four | 3 | 2 | one | 0 |
ten | eleven | 12 | 9 | 13 | 14 | eight | 7 | 6 | five | four | 3 | 2 | one | 0 |
ten | eleven | 12 | 13 | 9 | 14 | eight | 7 | 6 | five | four | 3 | 2 | one | 0 |
ten | eleven | 12 | 13 | 14 | 9 | eight | 7 | 6 | five | four | 3 | 2 | one | 0 |
ten | eleven | 12 | 13 | 14 | 7 | 6 | five | four | 3 | 2 | one | 0 |
eleven | ten | 12 | 13 | 14 | 9 | eight | 7 | 6 | five | four | 3 | 2 | one | 0 |
eleven | 12 | ten | 13 | 14 | 9 | eight | 7 | 6 | five | four | 3 | 2 | one | 0 |
eleven | 12 | 13 | ten | 14 | 9 | eight | 7 | 6 | five | four | 3 | 2 | one | 0 |
eleven | 12 | 13 | 14 | ten | 9 | eight | 7 | 6 | five | four | 3 | 2 | one | 0 |
eleven | 12 | 13 | 14 | eight | 7 | 6 | five | four | 3 | 2 | one | 0 |
12 | eleven | 13 | 14 | ten | 9 | eight | 7 | 6 | five | four | 3 | 2 | one | 0 |
12 | 13 | eleven | 14 | ten | 9 | eight | 7 | 6 | five | four | 3 | 2 | one | 0 |
12 | 13 | 14 | eleven | ten | 9 | eight | 7 | 6 | five | four | 3 | 2 | one | 0 |
12 | 13 | 14 | 9 | eight | 7 | 6 | five | four | 3 | 2 | one | 0 |
13 | 12 | 14 | eleven | ten | 9 | eight | 7 | 6 | five | four | 3 | 2 | one | 0 |
13 | 14 | 12 | eleven | ten | 9 | eight | 7 | 6 | five | four | 3 | 2 | one | 0 |
13 | 14 | ten | 9 | eight | 7 | 6 | five | four | 3 | 2 | one | 0 |
14 | 13 | 12 | eleven | ten | 9 | eight | 7 | 6 | five | four | 3 | 2 | one | 0 |
')
Table 2: Bubble Sorting in Multi-threaded Mode
0 | one | 2 | 3 | four | five | 6 | 7 | eight | 9 | ten | eleven | 12 | 13 | 14 |
0 | one | 2 | 3 | four | five | 6 | 7 | eight | 9 | ten | eleven | 12 | 13 | 14 |
0 | one | 2 | 3 | four | five | 6 | 7 | eight | 9 | ten | eleven | 12 | 13 | 14 |
one | 0 | 2 | 3 | four | five | 6 | 7 | eight | 9 | ten | eleven | 12 | 13 | 14 |
one | 2 | 0 | 3 | four | five | 6 | 7 | eight | 9 | ten | eleven | 12 | 13 | 14 |
2 | one | 3 | 0 | four | five | 6 | 7 | eight | 9 | ten | eleven | 12 | 13 | 14 |
2 | 3 | one | four | 0 | five | 6 | 7 | eight | 9 | ten | eleven | 12 | 13 | 14 |
3 | 2 | four | one | five | 0 | 6 | 7 | eight | 9 | ten | eleven | 12 | 13 | 14 |
3 | four | 2 | five | one | 6 | 0 | 7 | eight | 9 | ten | eleven | 12 | 13 | 14 |
four | 3 | five | 2 | 6 | one | 7 | 0 | eight | 9 | ten | eleven | 12 | 13 | 14 |
four | five | 3 | 6 | 2 | 7 | one | eight | 0 | 9 | ten | eleven | 12 | 13 | 14 |
five | four | 6 | 3 | 7 | 2 | eight | one | 9 | 0 | ten | eleven | 12 | 13 | 14 |
five | 6 | four | 7 | 3 | eight | 2 | 9 | one | ten | 0 | eleven | 12 | 13 | 14 |
6 | five | 7 | four | eight | 3 | 9 | 2 | ten | one | eleven | 0 | 12 | 13 | 14 |
6 | 7 | five | eight | four | 9 | 3 | ten | 2 | eleven | one | 12 | 0 | 13 | 14 |
7 | 6 | eight | five | 9 | four | ten | 3 | eleven | 2 | 12 | one | 13 | 0 | 14 |
7 | eight | 6 | 9 | five | ten | four | eleven | 3 | 12 | 2 | 13 | one | 14 | 0 |
eight | 7 | 9 | 6 | ten | five | eleven | four | 12 | 3 | 13 | 2 | 14 | one | 0 |
eight | 9 | 7 | ten | 6 | eleven | five | 12 | four | 13 | 3 | 14 | 2 | one | 0 |
9 | eight | ten | 7 | eleven | 6 | 12 | five | 13 | four | 14 | 3 | 2 | one | 0 |
9 | ten | eight | eleven | 7 | 12 | 6 | 13 | five | 14 | four | 3 | 2 | one | 0 |
ten | 9 | eleven | eight | 12 | 7 | 13 | 6 | 14 | five | four | 3 | 2 | one | 0 |
ten | eleven | 9 | 12 | eight | 13 | 7 | 14 | 6 | five | four | 3 | 2 | one | 0 |
eleven | ten | 12 | 9 | 13 | eight | 14 | 7 | 6 | five | four | 3 | 2 | one | 0 |
eleven | 12 | ten | 13 | 9 | 14 | eight | 7 | 6 | five | four | 3 | 2 | one | 0 |
12 | eleven | 13 | ten | 14 | 9 | eight | 7 | 6 | five | four | 3 | 2 | one | 0 |
12 | 13 | eleven | 14 | ten | 9 | eight | 7 | 6 | five | four | 3 | 2 | one | 0 |
13 | 12 | 14 | eleven | ten | 9 | eight | 7 | 6 | five | four | 3 | 2 | one | 0 |
13 | 14 | 12 | eleven | ten | 9 | eight | 7 | 6 | five | four | 3 | 2 | one | 0 |
14 | 13 | 12 | eleven | ten | 9 | eight | 7 | 6 | five | four | 3 | 2 | one | 0 |
14 | 13 | 12 | eleven | ten | 9 | eight | 7 | 6 | five | four | 3 | 2 | one | 0 |
Code:
#include <stdio.h> #include <time.h> __global__ void bubbleMove(int *array_device, int N, int step){ int idx = blockDim.x * blockIdx.x + threadIdx.x; if (idx<(N-1)) { if (step-2>=idx){ if (array_device[idx]<array_device[idx+1]){ int buffer = array_device[idx]; array_device[idx]=array_device[idx+1]; array_device[idx+1] = buffer; } } } } void bubleSortCUDA(int *array_host,int N, int blockSize){ int *array_device; cudaMalloc((void **)&array_device, N * sizeof(int)); for (int i = 0; i < N; i++) array_host[i]=i; cudaMemcpy(array_device,array_host,N*sizeof(int),cudaMemcpyHostToDevice); int nblocks = N / blockSize+1; for (int step = 0; step <= N+N; step++) { bubbleMove<<<nblocks,blockSize>>>(array_device,N,step); cudaThreadSynchronize(); } cudaMemcpy(array_host,array_device,N*sizeof(int),cudaMemcpyDeviceToHost); cudaFree(array_device); } void bubleSortCPU(int *array_host, int N){ for (int i = 0; i < N; i++){ for (int j = 0; j < Ni-1; j++) { if (array_host[j]<array_host[j+1]){ int buffer = array_host[j]; array_host[j]=array_host[j+1]; array_host[j+1] = buffer; } } } } int checkArray(int *array_host, int N){ int good=1; for (int i = 0; i < N-1; i++) if (array_host[i]<array_host[i+1]) {good=0; printf("i=%da=%d\n", i,array_host[i] );} return good; } float measureCUDA(int N,int blockSize){ int *array_host = (int *)malloc(N * sizeof(int)); for (int i = 0; i < N; i++) array_host[i]=i; clock_t start = clock(); bubleSortCUDA(array_host,N,blockSize); clock_t end = clock(); if (checkArray(array_host,N)==1){ free(array_host); return (float)(end - start) / CLOCKS_PER_SEC; } else { free(array_host); return -1; } } float measureCPU(int N) { int *array_host = (int *)malloc(N * sizeof(int)); for (int i = 0; i < N; i++) array_host[i]=i; clock_t start = clock(); bubleSortCPU(array_host,N); clock_t end = clock(); if (checkArray(array_host,N)==1){ free(array_host); return (float)(end - start) / CLOCKS_PER_SEC; } else { free(array_host); return -1; } } int main(int argc, char const *argv[]) { for (int i = 1; i < 10000000; i*=2) printf("%d %f\t%f\n",i, measureCUDA(i,256),measureCPU (i )); return 0; }