     method1    method2     method3
[1,] 0.05517714 0.014054038 0.017260447
[2,] 0.08367678 0.003570883 0.004289079
[3,] 0.05274706 0.028629661 0.071323030
[4,] 0.06769936 0.048446559 0.057432519
[5,] 0.06875188 0.019782518 0.080564474
[6,] 0.04913779 0.100062929 0.102208706
Let's use rnorm to create three sets: the first with a mean of 0, the second with a mean of 2, and the third with a mean of 5, each with 30 rows.

m <- matrix(data=cbind(rnorm(30, 0), rnorm(30, 2), rnorm(30, 5)), nrow=30, ncol=3)
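Because rnorm draws random numbers, the matrix differs on every run, so the outputs shown in this article will not be reproduced exactly. A minimal sketch of a reproducible variant follows, assuming an arbitrary seed (42 is my choice, not the author's) and passing the mean by name for readability:

```r
# Fix the random seed so that the same matrix is generated on every run.
# The seed value 42 is an arbitrary assumption, not taken from the article.
set.seed(42)

# cbind() already returns a 30 x 3 matrix, so the matrix() wrapper is optional.
m <- cbind(rnorm(30, mean = 0), rnorm(30, mean = 2), rnorm(30, mean = 5))

dim(m)        # 30 rows, 3 columns
is.matrix(m)  # TRUE
```

With a fixed seed, every later call in the walkthrough returns the same numbers from run to run, which makes the examples easier to compare.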
Let's use apply and the base mean function to verify this. The second argument tells apply which dimension to work along: rows (1) or columns (2). In this case we want three numbers at the end, one mean per column, so we should tell apply to work with columns by passing 2 as the second argument. But let's get it wrong first, to illustrate:

apply(m, 1, mean)
# [1] 2.408150 2.709325 1.718529 0.822519 2.693614 2.259044 1.849530 2.544685 2.957950 2.219874
#[11] 2.582011 2.471938 2.015625 2.101832 2.189781 2.319142 2.504821 2.203066 2.280550 2.401297
#[21] 2.312254 1.833903 1.900122 2.427002 2.426869 1.890895 2.515842 2.363085 3.049760 2.027570
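Passing 1 as the second argument returns 30 values, the mean of each row, rather than the three numbers we wanted. Let's try again with 2:

apply(m, 2, mean)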
#[1] -0.02664418 1.95812458 4.86857792
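The second argument is easy to mix up, so here is a small sketch of how the margin determines the shape of the result. The seed is an arbitrary assumption made for reproducibility; rowMeans and colMeans are base R functions that compute the same values directly.

```r
set.seed(1)  # arbitrary seed, assumed here only for reproducibility
m <- cbind(rnorm(30, 0), rnorm(30, 2), rnorm(30, 5))

row_means <- apply(m, 1, mean)  # margin 1: one value per row
col_means <- apply(m, 2, mean)  # margin 2: one value per column

length(row_means)  # 30
length(col_means)  # 3

# Base R has dedicated functions that give the same numbers
all.equal(row_means, rowMeans(m))  # TRUE
all.equal(col_means, colMeans(m))  # TRUE
```

For plain means, rowMeans and colMeans are the shorter option; apply becomes useful once the function is something they do not cover.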
Now let's count how many negative values each column has, using apply again:

apply(m, 2, function(x) length(x[x<0]))
#[1] 14 1 0
Here we used a simple function defined right in the call to apply, and not some built-in one. Note that we did not specify a return value in the function: R returns the value of the last evaluated expression. The function uses subsetting to select all elements of x that are less than 0, and then counts them with length. The function takes one argument, which I arbitrarily named x. In this case, x is one of the columns of the matrix. Is it a single-column matrix or just a vector? Let's take a look:

apply(m, 2, function(x) is.matrix(x))
#[1] FALSE FALSE FALSE
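Not a matrix. The function definition was not actually needed here: we could simply pass is.matrix, since it takes one argument and is already defined as a function. Let's make sure the columns are vectors, as expected:

apply(m, 2, is.vector)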
#[1] TRUE TRUE TRUE
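Why, then, did we have to wrap length in a function? When we write our own handler for apply, we must at least give a name to the incoming data so that we can refer to it:

apply(m, 2, length(x[x<0]))
#Error in match.fun(FUN) : object 'x' not found

We refer to some value x, but R knows nothing about it, and therefore gives an error. Other factors also play a role here, but for simplicity, remember to wrap any code in a function. For example, let's look at the mean of only the positive values:

apply(m, 2, function(x) mean(x[x>0]))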
#[1] 0.4466368 2.0415736 4.8685779
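To make the point about wrapping code in a function concrete, the sketch below uses a named helper instead of an anonymous function and captures the error from the unwrapped call. The helper name count_negative and the small fixed matrix are hypothetical, chosen so that the results are easy to check by hand. The sketch assumes no variable called x exists in the workspace.

```r
# A small fixed matrix (filled column by column), so results can be checked by eye:
# column 1 = -1, 2, -3; column 2 = 4, 5, -6; column 3 = 7, 8, 9
m <- matrix(c(-1, 2, -3, 4, 5, -6, 7, 8, 9), nrow = 3, ncol = 3)

# Named helper (hypothetical name); the last evaluated expression is returned.
# Without missing values, sum(x < 0) gives the same count as length(x[x < 0]).
count_negative <- function(x) {
  sum(x < 0)
}

neg <- apply(m, 2, count_negative)                   # 2 1 0
pos_mean <- apply(m, 2, function(x) mean(x[x > 0]))  # 2.0 4.5 8.0

# Passing an expression instead of a function fails: R tries to evaluate
# x[x < 0] before any column is available. tryCatch() captures the message.
err <- tryCatch(apply(m, 2, length(x[x < 0])),
                error = function(e) conditionMessage(e))
```

Giving the helper a name is a matter of taste; for one-liners an anonymous function is usually enough, while a named function is easier to reuse and test.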
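Sometimes the data has to be traversed in a less straightforward way. You can probably use rollapply for this, but a quick, if not very elegant, way is to run sapply or lapply, passing a set of index values. Here we will use sapply, which works with a list or a vector of data:

sapply(1:3, function(x) x^2)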
#[1] 1 4 9
lapply is very similar, but returns a list, not a vector:

lapply(1:3, function(x) x^2)
#[[1]]
#[1] 1
#
#[[2]]
#[1] 4
#
#[[3]]
#[1] 9
Passing simplify=FALSE to sapply also gives a list:

sapply(1:3, function(x) x^2, simplify=FALSE)
#[[1]]
#[1] 1
#
#[[2]]
#[1] 4
#
#[[3]]
#[1] 9
And you can use unlist with lapply to get a vector:

unlist(lapply(1:3, function(x) x^2))
#[1] 1 4 9
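The return types can be checked directly. The sketch below repeats the calls above and adds vapply, a base R relative that the text does not cover: it requires declaring the expected type and length of each result, and is included here only as an optional, stricter alternative. The helper name square is arbitrary.

```r
square <- function(x) x^2  # helper name is arbitrary

as_vector <- sapply(1:3, square)                    # simplified to a numeric vector
as_list   <- lapply(1:3, square)                    # always a list
also_list <- sapply(1:3, square, simplify = FALSE)  # a list, like lapply
flattened <- unlist(lapply(1:3, square))            # list flattened to a vector
checked   <- vapply(1:3, square, numeric(1))        # vector with a declared result type

is.list(as_list)                 # TRUE
is.list(as_vector)               # FALSE
identical(as_list, also_list)    # TRUE
identical(as_vector, flattened)  # TRUE
identical(as_vector, checked)    # TRUE
```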
Either way, use lapply or sapply according to what makes sense for your data and the expected result. If you want a list, use lapply. If you want a vector, use sapply.

A quick trick is to pass sapply a vector of indices and write your function with an assumption about the structure of the underlying data. Let's take another look at the mean example:

sapply(1:3, function(x) mean(m[,x]))
#[1] -0.02664418 1.95812458 4.86857792
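We pass the column indices to our function, which assumes that there is some variable m with our data. That is fine as a quick solution, but it is not very clean in general and will very likely turn into a maintenance problem later. We can tidy this up by passing the data to our function as an argument, using the ... argument through which sapply forwards extra arguments to the function:

sapply(1:3, function(x, y) mean(y[,x]), y=m)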
#[1] -0.02664418 1.95812458 4.86857792
This time our function has two arguments, x and y. The variable x, as before, will denote whatever item sapply is currently going through. The variable y will be passed using the optional arguments of sapply.

In this case we passed in m, explicitly naming the y argument when calling sapply. This is not strictly necessary, but it makes the code easier to read and maintain. The value of y will be the same every time sapply calls our function.

Source: https://habr.com/ru/post/274611/
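As a sketch of the index-based approach without hard-coded indices, the version below derives them from the matrix with seq_len(ncol(...)) and passes the data through the extra arguments of sapply. The names col_mean, i and mat, as well as the seed, are my own assumptions.

```r
set.seed(1)  # arbitrary seed, assumed here only for reproducibility
m <- cbind(rnorm(30, 0), rnorm(30, 2), rnorm(30, 5))

# i receives the current index; mat is forwarded unchanged on every call
col_mean <- function(i, mat) mean(mat[, i])

by_index <- sapply(seq_len(ncol(m)), col_mean, mat = m)

# The result matches the direct column-wise apply() call
all.equal(by_index, apply(m, 2, mean))  # TRUE
```

Deriving the indices from the data keeps the call correct if the number of columns changes, which is one of the maintenance risks of the hard-coded 1:3.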