How to use lapply to transform specific values in a list of dataframes

I'm looking for help to transform a for loop into an lapply or similar function.

I have a list of similar data.frames, each containing

  • an indicator column ('a')
  • a value column ('b')

I want to invert the values in column b for each data frame, but only for specific indicators. For example, invert all values in 'b' that have an indicator of 2 in column a.

Here are some sample data:

x = data.frame(a = c(1, 2, 3, 2),  b = (seq(from = .1, to = 1, by = .25)))
y = data.frame(a = c(1, 2, 3, 2),  b = (seq(from = 1, to = .1, by = -.25)))
my_list <- list(x = x, y = y)

my_list
$x
  a    b
1 1 0.10
2 2 0.35
3 3 0.60
4 2 0.85

$y
  a    b
1 1 1.00
2 2 0.75
3 3 0.50
4 2 0.25

My desired output looks like this:

my_list
$x
  a    b
1 1 0.10
2 2 0.65
3 3 0.60
4 2 0.15

$y
  a    b
1 1 1.00
2 2 0.25
3 3 0.50
4 2 0.75

I can achieve the desired output with the following for loop.

for(i in 1:length(my_list)){
    my_list[[i]][my_list[[i]]['a'] == 2, 'b'] <-
        1 - my_list[[i]][my_list[[i]]['a'] == 2, 'b']
}

BUT. When I try to roll this into lapply form like so:

invertfun <- function(inputDF){
    inputDF[inputDF['a'] == 2, 'b'] <- 1 - inputDF[inputDF['a'] == 2, 'b']
}
resultList <- lapply(X = my_list, FUN = invertfun)

I get a list with only the inverted values:

resultList
$x
[1] 0.65 0.15

$y
[1] 0.25 0.75

What am I missing here? I've tried to apply (pun intended) the insights from:

how to use lapply instead of a for loop, to perform a calculation on a list of dataframes in R

I'd appreciate any insights or alternative solutions. I'm trying to take my R skills to the next level and apply and similar functions seem to be the key.


We could use lapply to loop over each data frame in the list and change the b column based on the value in the a column.

my_list[] <- lapply(my_list, function(x) transform(x, b = ifelse(a==2, 1-b, b)))

my_list
#$x
#  a    b
#1 1 0.10
#2 2 0.65
#3 3 0.60
#4 2 0.15

#$y
#  a    b
#1 1 1.00
#2 2 0.25
#3 3 0.50
#4 2 0.75
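If the indicator value should not be hard-coded, one option is to pass it through lapply as an extra argument. The sketch below assumes a hypothetical helper named invert_where (not part of the original answer) and uses plain column assignment instead of transform(); lapply forwards any additional named arguments to the function.

```r
# Sample data from the question
x <- data.frame(a = c(1, 2, 3, 2), b = seq(from = .1, to = 1, by = .25))
y <- data.frame(a = c(1, 2, 3, 2), b = seq(from = 1, to = .1, by = -.25))
my_list <- list(x = x, y = y)

# Hypothetical helper: invert b wherever a equals `indicator`,
# then return the whole data frame
invert_where <- function(df, indicator) {
  df$b <- ifelse(df$a == indicator, 1 - df$b, df$b)
  df
}

# Extra arguments after FUN are passed on to invert_where
inverted <- lapply(my_list, invert_where, indicator = 2)
inverted
```

Because ifelse() works on the whole column, a data frame with no matching indicator simply comes back unchanged.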

The same can be done with map() from purrr, which returns a new list rather than modifying my_list:

library(purrr)
map(my_list, function(x) transform(x, b = ifelse(a==2, 1-b, b)))



See Ronak's answer above for a fairly elegant solution using transform() or map(), but for those who are following in my footsteps, my original solution would work if I added a line in the custom function to return the full data frame like so:

invertfun <- function(inputDF){
    inputDF[inputDF['a'] == 2, 'b'] <- 1 - inputDF[inputDF['a'] == 2, 'b']
    return(inputDF)
}

resultList <- lapply(X = my_list, FUN = invertfun)
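As far as I understand it, the reason the version without the return line gave back only the inverted values is that an R function returns the value of its last evaluated expression, and the value of an assignment is its right-hand side. A minimal sketch of the difference, using two hypothetical helper names (no_return and with_return) that are only for illustration:

```r
df <- data.frame(a = c(1, 2, 3, 2), b = c(0.10, 0.35, 0.60, 0.85))

# The assignment is the last expression, so its value
# (the inverted numbers on the right-hand side) is what gets returned
no_return <- function(d) {
  d[d$a == 2, "b"] <- 1 - d[d$a == 2, "b"]
}

# Naming the data frame last returns the modified data frame instead
with_return <- function(d) {
  d[d$a == 2, "b"] <- 1 - d[d$a == 2, "b"]
  d
}

res1 <- no_return(df)    # a numeric vector
res2 <- with_return(df)  # a data frame

is.data.frame(res1)
is.data.frame(res2)
```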

UPDATE: On further testing, this solution throws "Error in x[[jj]][iseq] <- vjj : replacement has length zero" when the desired 'a' value doesn't exist in one of the data frames. So it is best not to go down this road; use the accepted answer above instead.
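For anyone who still prefers the explicit function, one possible workaround (a sketch, not the accepted answer's approach) is to index the b column as a plain vector rather than indexing the data frame by row and column. Assigning a zero-length value to a zero-length selection of a vector is allowed, so a data frame with no matching rows passes through unchanged. The names invert_safe and z below are hypothetical.

```r
# Hypothetical variant of invertfun that tolerates data frames
# with no rows where a == 2
invert_safe <- function(inputDF) {
  idx <- !is.na(inputDF$a) & inputDF$a == 2   # logical vector, NA-safe
  inputDF$b[idx] <- 1 - inputDF$b[idx]        # no-op when idx is all FALSE
  inputDF
}

x <- data.frame(a = c(1, 2, 3, 2), b = seq(from = .1, to = 1, by = .25))
z <- data.frame(a = c(1, 3), b = c(0.2, 0.4))  # no indicator 2 here

safe_result <- lapply(list(x = x, z = z), invert_safe)
safe_result
```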



lapply is typically not the best way to iteratively modify a list. lapply is going to run a loop internally in any case, so it is usually easier to read if you do something more explicit:

for (i in seq_along(my_list)) {
    my_list[[i]] <- within(my_list[[i]], {
        b[a==2] <- 1 - b[a==2]
    })
}

If we replace within() with with() in the example above, we get the same output as your initial solution, i.e. lapply(X = my_list, FUN = invertfun).

That is, within() returns the modified data frame, whereas with() returns only the value of the expression, so each list element is replaced by a vector of the inverted values.
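The difference between the two is easy to check on a single data frame. This is a small sketch using the question's x data; the names res_within and res_with are just for illustration.

```r
df <- data.frame(a = c(1, 2, 3, 2), b = c(0.10, 0.35, 0.60, 0.85))

# within() evaluates the expression and returns the modified data frame
res_within <- within(df, {
  b[a == 2] <- 1 - b[a == 2]
})

# with() evaluates the same expression but returns only its value,
# which for an assignment is the right-hand side
res_with <- with(df, {
  b[a == 2] <- 1 - b[a == 2]
})

str(res_within)
str(res_with)
```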
