R.h and Rmath.h (if R's mathematical functions are used):

    #include <R.h>

    /* Inner product of v1 and v2 (both of length *n); the result is written to *s. */
    void iprod(double *v1, double *v2, int *n, double *s) {
        double ret = 0;
        int len = *n;
        for (int i = 0; i < len; i++) {
            ret += v1[i] * v2[i];
        }
        *s = ret;
    }
R CMD SHLIB inner_prod.c
dyn.load()
function. To call the C function itself from R, use .C() (there are also .Call() and .External(), which work slightly differently, and heated arguments sometimes break out between supporters of .C() and .Call()). I will only note that C code written to be called through .C() turns out cleaner and more readable. Pay special attention to the correspondence between variable types in R and C (the documentation for .C() covers this in detail). Wrapper function in R:

    iprod <- function(a, b) {
        # Load the shared library on first use
        if (!is.loaded('iprod')) {
            dyn.load("inner_prod.so")
        }
        n <- length(a)
        v <- 0
        # .C() returns a list of its arguments after the call; the 4th holds the result
        return(.C("iprod", as.double(a), as.double(b), as.integer(n), as.double(v))[[4]])
    }
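The type correspondence mentioned above comes down to one rule of the .C() interface: every argument arrives in C as a pointer (an R double vector as double *, an R integer vector as int *), and results are returned by writing through those pointers. The sketch below illustrates this with a hypothetical function vec_mean, which is not part of the original example; it is plain C, so it does not need R.h.

```c
/* Hypothetical example (not from the article): mean of a vector,
 * written in the calling convention expected by .C().
 *   x   - data, passed from R as as.double(x)
 *   n   - length, passed from R as as.integer(length(x))
 *   out - one-element double vector that receives the result
 */
void vec_mean(double *x, int *n, double *out) {
    int len = *n;          /* scalars also arrive as pointers */
    double sum = 0.0;

    if (len <= 0) {        /* nothing to average: report 0 */
        *out = 0.0;
        return;
    }
    for (int i = 0; i < len; i++) {
        sum += x[i];
    }
    *out = sum / len;      /* the result goes back through the pointer */
}
```

If n were passed without as.integer(), R would hand over a double and the C side would read its bytes as an int, which is why the wrapper coerces every argument explicitly.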
    > n <- 1e7; a <- rnorm(n); b <- rnorm(n);
    > iprod(a, b)
    [1] 3482.183
    > sum(a * b)
    [1] 3482.183
nvidia-cuda-toolkit
package. CUDA, of course, deserves a huge article of its own, and since I am still a beginner in it, I will not frighten anyone with my clumsy, unfinished code and will instead allow myself to copy a few lines from the manual. The example relies on Thrust, which lets us abstract away from low-level CUDA / C operations. The data are represented as vectors, to which standard algorithms are applied (elementwise operations, reductions, prefix sums, sorting).

    #include <thrust/transform_reduce.h>
    #include <thrust/device_vector.h>
    #include <thrust/functional.h>
    #include <cmath>

    // Functor that raises its argument to the 6th power on the GPU (device)
    template <typename T>
    struct power {
        __device__
        T operator()(const T& x) const {
            return std::pow(x, 6);
        }
    };

    extern "C" void nrm(double *v, int *n, double *vnorm) {
        // Vector stored in GPU memory, initialized with the contents of *v
        thrust::device_vector<double> dv(v, v + *n);
        // Transform-reduce: apply power to every element,
        // then sum the results.
        *vnorm = std::sqrt(
            thrust::transform_reduce(dv.begin(), dv.end(), power<double>(), 0.0, thrust::plus<double>())
        );
    }
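To make the transform-reduce step concrete, here is a minimal CPU sketch in plain C of what the Thrust call computes before the square root is taken: each element is transformed (raised to the 6th power) and the results are summed starting from 0.0. The name sum_pow6 is hypothetical, and the final square root is deliberately left to the caller, so this is a reference for checking results rather than a replacement for the GPU code.

```c
/* Hypothetical CPU reference (not from the article) for the
 * transform-reduce step: out = sum over i of v[i]^6.
 * Uses the same pointer-only signature style as nrm(). */
void sum_pow6(double *v, int *n, double *out) {
    double acc = 0.0;                 /* initial value of the reduction */
    for (int i = 0; i < *n; i++) {
        double x2 = v[i] * v[i];      /* transform, step 1: x^2 */
        acc += x2 * x2 * x2;          /* (x^2)^3 = x^6, then reduce with + */
    }
    *out = acc;
}
```

Taking sqrt() of this sum in R should agree with the GPU result up to floating-point rounding, since the GPU may add the terms in a different order.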
The extern "C" specifier is necessary here; otherwise the C++ compiler mangles the name and R will not see the function nrm(). To compile the code we now use nvcc. Remember the output of the command R CMD SHLIB...? Here we adapt it slightly so that a library using CUDA / Thrust can be called from R without any problems:

    nvcc -g -G -O2 -arch sm_30 -I/usr/share/R/include -Xcompiler "-Wall -fpic" -c thr.cu -o thr.o
    nvcc -shared -lm thr.o -o thr.so -L/usr/lib/R/lib -lR
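The effect of extern "C" can also be obtained with a common idiom that keeps a source file valid for both C and C++ compilers: the linkage specifier is wrapped in an #ifdef __cplusplus guard. The sketch below uses a hypothetical function sum_sq (not part of the article's code). Compiled as C, the guard disappears; compiled by nvcc or another C++ compiler, it exports the unmangled symbol name that .C() looks up.

```c
/* Hypothetical example: export an unmangled symbol from code that may be
 * compiled either as C or as C++ (for instance by nvcc). */
#ifdef __cplusplus
extern "C" {
#endif

/* Declaration visible to the linker under the plain C name "sum_sq". */
void sum_sq(double *v, int *n, double *out);

#ifdef __cplusplus
}
#endif

/* Sum of squares of v[0..*n-1], written to *out. */
void sum_sq(double *v, int *n, double *out) {
    double acc = 0.0;
    for (int i = 0; i < *n; i++) {
        acc += v[i] * v[i];
    }
    *out = acc;
}
```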
    gpunrm <- function(v) {
        if (!is.loaded('nrm'))
            dyn.load("thr.so")
        n <- length(v)
        vnorm <- 0
        return(.C("nrm", as.double(v), as.integer(n), as.double(vnorm))[[3]])
    }
    gpu_time <- c()
    cpu_time <- c()
    n <- seq.int(1e4, 1e8, length.out=30)
    for (i in n) {
        v <- rnorm(i, 1000)
        gpu_time <- c(gpu_time, system.time({p1 <- gpunrm(v)})[1])
        cpu_time <- c(cpu_time, system.time({p2 <- sqrt(sum(v^6))})[1])
    }
gputools
) adapted for working with the GPU (here you can read more about this).

Source: https://habr.com/ru/post/220927/