How to subset a list containing several lists in R, based on the column name, and merge into a single list/dataframe?

I have several lists whose elements ("columns") have the same names. I am trying to write a function that extracts one element by name from all the lists, merges the results into one data frame, and assigns new column names. For example:

list1 <- list(a = 1:6,
              b = 4:9,
              c = 3:8)
list2 <- list(a = 12:17,
              b = 10:15,
              c = 11:16)
list3 <- list(a = 2:7,
              b = 14:19,
              c = 9:14)
all <- list(list1, list2, list3)
new_column_names <- c("Block1", "Block2", "Block3")

I would like to extract element "a" from every list and merge the results into a single data frame, with new_column_names as the column names. Any suggestions? Thanks!

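If you want a data.frame, the "a" vectors must all have the same length first: pad the shorter ones with NA, then cbind the results. (In the output below, the first list's "a" has five values and the others have six.)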

res <- lapply(all, `[[`, "a")
n <- max(sapply(res, length))
res <- lapply(res, function(x) if(length(x) < n) c(x, rep(NA, n - length(x))) else x)
res <- do.call(cbind, res)
res <- as.data.frame(res)
res <- setNames(res, new_column_names)
res
#  Block1 Block2 Block3
#1      1     12      2
#2      2     13      3
#3      3     14      4
#4      4     15      5
#5      5     16      6
#6     NA     17      7
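
The padding step above can also be written with the replacement function `length<-`, which extends a vector with NA. This is a minimal sketch, assuming (as in the output above) that the first list's "a" is one element shorter than the others; `pad_to` is a hypothetical helper name, not part of the original answer.

```r
# Example data; list1$a is deliberately one element shorter (assumption)
list1 <- list(a = 1:5,   b = 4:9,   c = 3:8)
list2 <- list(a = 12:17, b = 10:15, c = 11:16)
list3 <- list(a = 2:7,   b = 14:19, c = 9:14)
all <- list(list1, list2, list3)
new_column_names <- c("Block1", "Block2", "Block3")

# Hypothetical helper: pad a vector with NA up to length n
pad_to <- function(x, n) {
  length(x) <- n
  x
}

cols <- lapply(all, `[[`, "a")        # pull element "a" from each list
n <- max(lengths(cols))               # the longest vector sets the row count
cols <- lapply(cols, pad_to, n = n)   # pad the shorter vectors with NA
res <- as.data.frame(setNames(cols, new_column_names))
res
```

Building the data frame from a named list keeps each column's type, whereas cbind goes through a matrix first.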

If the a-vectors have different lengths (in the output below, Block1 has five values and the others have six), you cannot turn them directly into a regular data.frame, because a data.frame requires all columns to have equal length.

Instead, you can turn them into a long-format data.frame using:

stack(setNames(lapply(all, `[[`, "a"), new_column_names))
#    values    ind
# 1       1 Block1
# 2       2 Block1
# 3       3 Block1
# 4       4 Block1
# 5       5 Block1
# 6      12 Block2
# 7      13 Block2
# 8      14 Block2
# 9      15 Block2
# 10     16 Block2
# 11     17 Block2
# 12      2 Block3
# 13      3 Block3
# 14      4 Block3
# 15      5 Block3
# 16      6 Block3
# 17      7 Block3

You can try

library(tidyverse)
all %>% 
  flatten() %>%
  keep(names(.) =="a") %>%
  set_names(new_column_names) %>% 
  map(~tibble(a=.x, n=seq_along(.x))) %>% 
  bind_rows(.id = "ind") 
# # A tibble: 17 x 3
#    ind        a     n
#    <chr>  <int> <int>
#  1 Block1     1     1
#  2 Block1     2     2
#  3 Block1     3     3
#  4 Block1     4     4
#  5 Block1     5     5
#  6 Block2    12     1
#  7 Block2    13     2
#  8 Block2    14     3
#  9 Block2    15     4
# 10 Block2    16     5
# 11 Block2    17     6
# 12 Block3     2     1
# 13 Block3     3     2
# 14 Block3     4     3
# 15 Block3     5     4
# 16 Block3     6     5
# 17 Block3     7     6

Then you can, for instance, use spread() to get a wide data.frame:

.Last.value %>% 
  spread(ind, a)
# # A tibble: 6 x 4
#       n Block1 Block2 Block3
#   <int>  <int>  <int>  <int>
# 1     1      1     12      2
# 2     2      2     13      3
# 3     3      3     14      4
# 4     4      4     15      5
# 5     5      5     16      6
# 6     6     NA     17      7

Here is a modified option using tidyverse

library(tidyverse)
map(all, ~ as_tibble(.x) %>%
               select(a)) %>%
      set_names(new_column_names) %>% 
      bind_rows(.id = 'ind')
# A tibble: 18 x 2
#   ind        a
#   <chr>  <int>
# 1 Block1     1
# 2 Block1     2
# 3 Block1     3
# 4 Block1     4
# 5 Block1     5
# 6 Block1     6
# 7 Block2    12
# 8 Block2    13
# 9 Block2    14
#10 Block2    15
#11 Block2    16
#12 Block2    17
#13 Block3     2
#14 Block3     3
#15 Block3     4
#16 Block3     5
#17 Block3     6
#18 Block3     7

Or using map2

map2_df(all, new_column_names,
                 ~ as_tibble(.x) %>% 
                        mutate(ind = .y) %>%
                        select(ind, a))
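
For comparison, the same long-format result can be built without any packages. This is a rough sketch using the data from the question; `long` is just an illustrative variable name.

```r
# Data from the question
list1 <- list(a = 1:6,   b = 4:9,   c = 3:8)
list2 <- list(a = 12:17, b = 10:15, c = 11:16)
list3 <- list(a = 2:7,   b = 14:19, c = 9:14)
all <- list(list1, list2, list3)
new_column_names <- c("Block1", "Block2", "Block3")

# Build one small data.frame per list, labelled with its new name,
# then stack them row-wise
pieces <- Map(function(x, nm) data.frame(ind = nm, a = x[["a"]]),
              all, new_column_names)
long <- do.call(rbind, pieces)
rownames(long) <- NULL   # make sure row names are plain 1..n
long
```

Because each piece is built separately, this also works when the "a" vectors differ in length.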

In base R:

as.data.frame(setNames(lapply(all,`[[`,"a"),new_column_names))
#   Block1 Block2 Block3
# 1      1     12      2
# 2      2     13      3
# 3      3     14      4
# 4      4     15      5
# 5      5     16      6
# 6      6     17      7
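
Since the question asks for a function, the one-liner above could be wrapped as follows. This is only a sketch: `extract_column` and its argument names are hypothetical, and it assumes the extracted vectors all have the same length, stopping with an error otherwise.

```r
# Data from the question
list1 <- list(a = 1:6,   b = 4:9,   c = 3:8)
list2 <- list(a = 12:17, b = 10:15, c = 11:16)
list3 <- list(a = 2:7,   b = 14:19, c = 9:14)
all <- list(list1, list2, list3)
new_column_names <- c("Block1", "Block2", "Block3")

# Hypothetical wrapper: pull element `name` from every list in `lists`
# and return a data.frame whose columns are named `new_names`
extract_column <- function(lists, name, new_names) {
  cols <- lapply(lists, `[[`, name)
  # A data.frame needs equal-length columns, so fail early otherwise
  stopifnot(length(unique(lengths(cols))) == 1,
            length(new_names) == length(cols))
  as.data.frame(setNames(cols, new_names))
}

res_a <- extract_column(all, "a", new_column_names)
res_b <- extract_column(all, "b", new_column_names)
```

The same call then works for any of the shared element names, for example "b" or "c".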

Comments
  • There seem to be some typos. I guess you mean list1 <- list(...) rather than list1 <- c(...).
  • Thanks @mt1022! I corrected the error.
  • Sorry, I meant to have the same row numbers. Any other ideas?
  • @AdelaIliescu What does that mean? Please be more precise. Also, you should provide the desired result in your question.