R: Extract the row and column of the element in use when using the apply function

How can I extract the row and column of the element in use when using the apply function? For example, say I want to apply a function to each element of a matrix, where the row and column numbers of the selected element are also arguments to the function. A simple reproducible example is given below.

mymatrix <- matrix(1:12, nrow=3, ncol=4)

I want a function which does the following

apply(mymatrix, c(1,2), function (x) sum(x, row_number, col_number))

where row_number and col_number are the row and column number of the selected element in mymatrix. Note that my function is more complicated than sum, so a robust solution is appreciated.

I'm not entirely sure what you're trying to do, but I would use a for loop here.

Pre-allocate the return matrix, and this will be very fast:

ret <- mymatrix
for (i in 1:nrow(mymatrix))
    for (j in 1:ncol(mymatrix))
        ret[i, j] <- sum(mymatrix[i, j], i, j)
#     [,1] [,2] [,3] [,4]
#[1,]    3    7   11   15
#[2,]    5    9   13   17
#[3,]    7   11   15   19

Benchmark analysis 1

I was curious so I ran a microbenchmark analysis to compare methods; I used a bigger 200x300 matrix.

mymatrix <- matrix(1:600, nrow = 200, ncol = 300)
library(microbenchmark)
res <- microbenchmark(
    for_loop = {
        ret <- mymatrix
        for (i in 1:nrow(mymatrix))
            for (j in 1:ncol(mymatrix))
                ret[i, j] <- sum(mymatrix[i, j], i, j)
    },
    expand_grid_mapply = {
        newResult<- mymatrix
        grid1 <- expand.grid(1:nrow(mymatrix),1:ncol(mymatrix))
        newResult[]<-
        mapply(function(row_number, col_number){ sum(mymatrix[row_number, col_number], row_number, col_number) },row_number = grid1$Var1, col_number = grid1$Var2 )
    },
    expand_grid_apply = {
        newResult<- mymatrix
        grid1 <- expand.grid(1:nrow(mymatrix),1:ncol(mymatrix))
        newResult[]<-
        apply(grid1, 1, function(x){ sum(mymatrix[x[1], x[2]], x[1], x[2]) })
    },
    double_sapply = {
        sapply(1:ncol(mymatrix), function (x) sapply(1:nrow(mymatrix), function (y) sum(mymatrix[y,x],x,y)))
    }
)

res
#Unit: milliseconds
#               expr       min        lq      mean    median       uq       max
#           for_loop  41.42098  52.72281  56.86675  56.38992  59.1444  82.89455
# expand_grid_mapply 126.98982 161.79123 183.04251 182.80331 196.1476 332.94854
#  expand_grid_apply 295.73234 354.11661 375.39308 375.39932 391.6888 562.59317
#      double_sapply  91.80607 111.29787 120.66075 120.37219 126.0292 230.85411

library(ggplot2)
autoplot(res)

Benchmark analysis 2 (with expand.grid outside of microbenchmark)

grid1 <- expand.grid(1:nrow(mymatrix),1:ncol(mymatrix))
res <- microbenchmark(
    for_loop = {
        ret <- mymatrix
        for (i in 1:nrow(mymatrix))
            for (j in 1:ncol(mymatrix))
                ret[i, j] <- sum(mymatrix[i, j], i, j)
    },
    expand_grid_mapply = {
        newResult<- mymatrix
        newResult[]<-
        mapply(function(row_number, col_number){ sum(mymatrix[row_number, col_number], row_number, col_number) },row_number = grid1$Var1, col_number = grid1$Var2 )
    },
    expand_grid_apply = {
        newResult<- mymatrix
        newResult[]<-
        apply(grid1, 1, function(x){ sum(mymatrix[x[1], x[2]], x[1], x[2]) })
    }
)

res
#Unit: milliseconds
#               expr       min        lq      mean    median        uq      max
#           for_loop  39.65599  54.52077  60.87034  59.19354  66.64983  95.7890
# expand_grid_mapply 130.33573 167.68201 194.39764 186.82411 209.33490 400.9273
#  expand_grid_apply 296.51983 373.41923 405.19549 403.36825 427.41728 597.6937
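
Before comparing timings, it may be worth confirming that the benchmarked approaches return the same matrix. The following is a minimal sketch on the small 3x4 example; the names m, grid, by_loop, by_mapply, by_apply and by_sapply are illustrative and do not appear in the benchmark code above.

```r
m <- matrix(1:12, nrow = 3, ncol = 4)
grid <- expand.grid(seq_len(nrow(m)), seq_len(ncol(m)))

# Approach 1: double for loop over a pre-allocated copy
by_loop <- m
for (i in seq_len(nrow(m)))
    for (j in seq_len(ncol(m)))
        by_loop[i, j] <- sum(m[i, j], i, j)

# Approach 2: mapply over the expand.grid index columns
by_mapply <- m
by_mapply[] <- mapply(function(r, k) sum(m[r, k], r, k), grid$Var1, grid$Var2)

# Approach 3: apply over the rows of the index grid
by_apply <- m
by_apply[] <- apply(grid, 1, function(x) sum(m[x[1], x[2]], x[1], x[2]))

# Approach 4: nested sapply (outer loop over columns, inner loop over rows)
by_sapply <- sapply(seq_len(ncol(m)), function(x)
    sapply(seq_len(nrow(m)), function(y) sum(m[y, x], x, y)))

# TRUE only if every approach agrees with the for loop
all_agree <- isTRUE(all.equal(by_loop, by_mapply)) &&
    isTRUE(all.equal(by_loop, by_apply)) &&
    isTRUE(all.equal(by_loop, by_sapply))
all_agree
```

If all_agree is FALSE, the timings compare different computations and should not be read as a like-for-like benchmark.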

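That's not how apply works: you cannot access the current index (row or column) from inside the function called by any member of the apply family (apply, lapply, sapply, vapply, mapply).

You have to create the row and column indices before applying, for example with expand.grid (see ?expand.grid).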

mymatrix <- matrix(1:12, nrow=3, ncol=4)
newResult<- mymatrix

grid1 <- expand.grid(1:nrow(mymatrix),1:ncol(mymatrix))

newResult[]<-
mapply(function(row_number, col_number){ sum(mymatrix[row_number, col_number], row_number, col_number) },row_number = grid1$Var1, col_number = grid1$Var2 )

newResult

#     [,1] [,2] [,3] [,4]
#[1,]    3    7   11   15
#[2,]    5    9   13   17
#[3,]    7   11   15   19

If you want to use apply instead:

newResult[]<-    
apply(grid1, 1, function(x){ sum(mymatrix[x[1], x[2]], x[1], x[2]) })
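
As a possible alternative to building the index grid with expand.grid, base R's row() and col() return matrices of row and column numbers with the same shape as their argument. The sketch below assumes a hypothetical function f that takes the element value and its two indices; f is only a stand-in for the more complicated function mentioned in the question.

```r
mymatrix <- matrix(1:12, nrow = 3, ncol = 4)

# Hypothetical stand-in for a more complicated, non-vectorized function
f <- function(value, row_number, col_number) {
    sum(value, row_number, col_number)
}

newResult <- mymatrix
# mapply walks the three matrices in parallel, element by element,
# in column-major order, so each value is paired with its own indices
newResult[] <- mapply(f, mymatrix, row(mymatrix), col(mymatrix))
newResult
#     [,1] [,2] [,3] [,4]
#[1,]    3    7   11   15
#[2,]    5    9   13   17
#[3,]    7   11   15   19
```

Because the value is passed in directly, f does not need to index mymatrix itself.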

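Here is my approach, using the outer() function.

The third argument FUN can be any two-argument function, as long as it is vectorized: outer() calls it once with two expanded index vectors, not once per element.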

mymatrix <- matrix(1:12, nrow = 3, ncol = 4)
nr <- nrow(mymatrix)
nc <- ncol(mymatrix)
mymatrix + outer(1:nr, 1:nc, FUN = "+")

     [,1] [,2] [,3] [,4]
[1,]    3    7   11   15
[2,]    5    9   13   17
[3,]    7   11   15   19
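
When the function is not vectorized, one option is to wrap it with Vectorize() before handing it to outer(). This is a sketch under the assumption that the function can look up the element itself from the two indices; g is a hypothetical name used only for illustration.

```r
mymatrix <- matrix(1:12, nrow = 3, ncol = 4)

# Hypothetical scalar-only function: it expects a single row index i and a
# single column index j. Passed to outer() directly it would fail, because
# sum() collapses the whole index vectors into one number.
g <- function(i, j) sum(mymatrix[i, j], i, j)

# Vectorize() wraps g in mapply, so it is called once per (i, j) pair
result <- outer(seq_len(nrow(mymatrix)), seq_len(ncol(mymatrix)),
                FUN = Vectorize(g))
result
#     [,1] [,2] [,3] [,4]
#[1,]    3    7   11   15
#[2,]    5    9   13   17
#[3,]    7   11   15   19
```

This keeps the outer() interface but gives up its speed advantage, since the work is again done one element at a time.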

With @Maurits Evers' benchmark code:

Unit: microseconds
     expr       min         lq      mean    median        uq        max
 for_loop 19963.203 22427.1630 25308.168 23811.855 25017.031 158341.678
    outer   848.247   949.3515  1054.944  1011.457  1059.217   1463.956

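In addition, here is an attempt to complete your original idea with apply(X, c(1,2), function (x)). It keeps a counter n outside the function and relies on apply visiting the elements in column-major order:

(It's a little slower than the other answers.)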

mymatrix <- matrix(1:12, nrow = 3, ncol = 4)
n <- 1                                        # n = index of data
nr <- nrow(mymatrix)
apply(mymatrix, c(1,2), function (x) {
  row_number <- (n-1) %% nr + 1               # convert n to row number
  col_number <- (n-1) %/% nr + 1              # convert n to column number
  res <- sum(x, row_number, col_number)
  n <<- n + 1
  return(res)
})

     [,1] [,2] [,3] [,4]
[1,]    3    7   11   15
[2,]    5    9   13   17
[3,]    7   11   15   19
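
The same index conversion can be sketched without a global counter by iterating over linear indices and letting base R's arrayInd() translate them into row and column numbers. The names idx and res below are illustrative.

```r
mymatrix <- matrix(1:12, nrow = 3, ncol = 4)

# One row per element: column 1 holds the row number, column 2 the column number
idx <- arrayInd(seq_along(mymatrix), dim(mymatrix))

res <- mymatrix
# n is the linear (column-major) index of the current element;
# sum() of integers returns an integer, hence integer(1)
res[] <- vapply(seq_along(mymatrix), function(n) {
    sum(mymatrix[n], idx[n, 1], idx[n, 2])
}, integer(1))
res
#     [,1] [,2] [,3] [,4]
#[1,]    3    7   11   15
#[2,]    5    9   13   17
#[3,]    7   11   15   19
```

Since nothing is modified outside the function, the result does not depend on n being reset before each run.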

Comments
  • I think this approach is much clearer for a beginner.
  • @AndreElrico Perhaps, I went for the mapply/Map approach first as well though;-) I was curious how both would compare so added a microbenchmark comparison.
  • Thanks for the benchmark, I learned something. I would be curious to see what it would look like if the expand.grid call were outside of the benchmark.
  • double for-loop. Still a thing in 2018
  • @MaMu Yes, I'm not so sure why this is, to be honest. I thought at first it might be because in double_sapply you're iterating through columns first and then rows, whereas in the for loop I iterate through rows first then columns. But even swapping the order doesn't really change the result. It must be the overhead that sapply brings, perhaps the implicit simplify = TRUE.