Using a vector of logical expressions to subset a data frame in R

I am trying to use a vector of logical expressions to subset a data frame. As a simple example, here is a data frame that I will subset using a logical expression. First, I'll type in the logical expression manually:

> dat <- data.frame(x = c(0,0,1,1), y = c(0,0,0,1), z = c(0,1,1,1))
> dat
  x y z
1 0 0 0
2 0 0 1
3 1 0 1
4 1 1 1
> subset(dat, x == 1)
  x y z
3 1 0 1
4 1 1 1

If I have a vector of logical expressions, how can I pick one from that vector and apply it in a subsetting call? Here is one way that doesn't work:

> criteria <- as.factor(c("x == 1", "y == 1", "y == 1 & z == 1"))
> subset(dat, criteria[1])
Error in subset.data.frame(dat, criteria[1]) : 'subset' must be logical

Any suggestions?

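We can use parse and eval to turn a condition stored as a string into a logical vector. Store the criteria as a plain character vector rather than a factor. Note that when parse is given several strings, eval evaluates every expression but returns only the value of the last one, so the call below applies only "y == 1 & z == 1".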

criteria <- c("x == 1", "y == 1", "y == 1 & z == 1")

subset(dat, eval(parse(text = criteria)))
#   x y z
# 4 1 1 1

To apply a single criterion, index the criteria vector to select the element you want:

subset(dat, eval(parse(text = criteria[1])))
#   x y z
# 3 1 0 1
# 4 1 1 1
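
To apply every criterion in turn rather than only one, the same parse/eval idea can be wrapped in a small function. This is a minimal sketch, not part of the original answer: subset_by_string is a hypothetical helper name, and it assumes each string is a valid R expression that refers only to columns of the data frame.

```r
dat <- data.frame(x = c(0, 0, 1, 1), y = c(0, 0, 0, 1), z = c(0, 1, 1, 1))
criteria <- c("x == 1", "y == 1", "y == 1 & z == 1")

# Hypothetical helper: parse one condition string and evaluate it with
# the columns of `data` visible as variables.
subset_by_string <- function(data, condition) {
  keep <- eval(parse(text = condition), envir = data)
  # Treat NA as "do not keep", as subset() does.
  data[keep & !is.na(keep), , drop = FALSE]
}

# One subsetted data frame per criterion, named by the criterion string.
results <- lapply(criteria, subset_by_string, data = dat)
names(results) <- criteria
results
```

Because the strings are evaluated as code, this is only appropriate when the criteria come from a trusted source.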

The documentation for subset describes how it works. For ordinary vectors, the result is simply x[subset & !is.na(subset)]. For data frames, the subset argument works on the rows. Note that subset will be evaluated in the data frame, so columns can be referred to (by name) as variables in the expression (see the examples). The select argument exists only for the methods for data frames and matrices. It works by first replacing column names in the selection expression with the corresponding column numbers in the data frame and then using the resulting integer vector to index the columns.

You can't make an atomic vector of vectors, but you can store them in a list. You can then subset with [, which is easier to program with than subset:

dat <- data.frame(x = c(0,0,1,1), 
                  y = c(0,0,0,1), 
                  z = c(0,1,1,1))

indices <- list(dat$x == 1, 
                dat$y == 1, 
                dat$y == 1 & dat$z == 1)

str(indices)
#> List of 3
#>  $ : logi [1:4] FALSE FALSE TRUE TRUE
#>  $ : logi [1:4] FALSE FALSE FALSE TRUE
#>  $ : logi [1:4] FALSE FALSE FALSE TRUE

dat[indices[[1]], ]
#>   x y z
#> 3 1 0 1
#> 4 1 1 1

lapply(indices, function(i) dat[i, ])
#> [[1]]
#>   x y z
#> 3 1 0 1
#> 4 1 1 1
#> 
#> [[2]]
#>   x y z
#> 4 1 1 1
#> 
#> [[3]]
#>   x y z
#> 4 1 1 1
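
A list of logical vectors like the one above can also be collapsed into a single index when the goal is "rows matching all criteria" or "rows matching any criterion". A minimal sketch using Reduce from base R; the three single-column conditions and the names all_true and any_true are chosen here for illustration only.

```r
dat <- data.frame(x = c(0, 0, 1, 1), y = c(0, 0, 0, 1), z = c(0, 1, 1, 1))

# Illustrative conditions, one logical vector per criterion.
indices <- list(dat$x == 1, dat$y == 1, dat$z == 1)

# Combine element-wise: TRUE where every (or at least one) criterion holds.
all_true <- Reduce(`&`, indices)
any_true <- Reduce(`|`, indices)

dat[all_true, ]  # rows meeting all three criteria
dat[any_true, ]  # rows meeting at least one criterion
```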

If you want to subset rows and keep all columns, use the form object[index_rows, index_columns]. index_columns can be left blank, which selects all columns by default. However, you still need to include the comma to indicate that you want a subset of rows rather than a subset of columns.
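
The effect of the comma can be seen in a short sketch on the example data frame from the question; the variable names (rows, kept_rows, kept_cols, both) are illustrative.

```r
dat <- data.frame(x = c(0, 0, 1, 1), y = c(0, 0, 0, 1), z = c(0, 1, 1, 1))

rows <- dat$x == 1

# With a comma and a blank column index: selected rows, all columns.
kept_rows <- dat[rows, ]

# Without a comma the index is applied to columns, as for a list.
kept_cols <- dat[c("x", "z")]

# Both indices given: selected rows and selected columns.
both <- dat[rows, c("x", "z")]
```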

With a vector of logicals:

criteria <- dat$x == 1 & dat$y == 1 & dat$z == 1
subset(dat, criteria)

Directly:

subset(dat, x == 1 & y == 1 & z == 1)

With data.table:

library(data.table)
dat  <- as.data.table(dat)
dat[x == 1 & y == 1 & z == 1]

When you index a vector with a logical vector, R returns the values of the vector for which the logical vector is TRUE. You can create logical vectors from existing vectors using comparison operators. The most basic way of subsetting a data frame in R is with square brackets, as in example[x, y], where example is the data frame we want to subset, x specifies the rows we want returned, and y specifies the columns we want returned.

Atomic vectors can also be subset with a logical vector. Note that the $ operator is invalid for atomic vectors. The most common way of subsetting matrices (2d) and arrays (>2d) is a simple generalisation of 1d subsetting: you supply a 1d index for each dimension, separated by a comma. Blank subsetting is useful here because it lets you keep all rows or all columns.
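
As a small illustration of one index per dimension and of blank subsetting, here is a sketch on a made-up 2 x 3 matrix; the matrix m and its dimnames are assumptions for the example, not data from the question.

```r
# Hypothetical matrix with named rows and columns.
m <- matrix(1:6, nrow = 2,
            dimnames = list(c("a", "b"), c("x", "y", "z")))

first_row <- m["a", ]      # blank column index: keep all columns
y_column  <- m[, "y"]      # blank row index: keep all rows

# Logical row index; drop = FALSE keeps the result a matrix
# even when only one row is selected.
kept <- m[m[, "x"] > 1, , drop = FALSE]
```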

R has many powerful subset operators. For example, columns of a data frame can be selected like elements of a list, by passing a character vector of column names to [.

Comments
  • Can you provide an example of exactly what you want output to look like? Having a hard time understanding exactly what you are trying to do. Thanks :)
  • A slightly better way if evaluation needs to be delayed is to store expressions instead of strings: criteria <- list(quote(x == 1), quote(y == 1), quote(y == 1 & z == 1)); lapply(criteria, function(c) subset(dat, eval(c))) That way syntax errors will be caught earlier instead of causing headaches far from where they originated.
  • We can also evaluate a criterion separately with eval(parse(text = as.character(criteria[1])), envir = dat), and filter afterward.
  • @allistaire and then subset(dat, eval(criteria[[1]])), right? Can we get from the OP's character vector to your list of quoted expressions?
  • @Moody_Mudskipper lapply(criteria, function(c) parse(text = c)) is simplest, though there's a difference in the class of language object returned.
  • This is what I am looking for. Thank you.
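
The suggestion in the comments to store expressions rather than strings can be sketched as follows. This is an illustration under stated assumptions rather than the commenters' exact code: the variable names are made up, and taking the first element of the parse result is one way to get a call object instead of an expression vector.

```r
dat <- data.frame(x = c(0, 0, 1, 1), y = c(0, 0, 0, 1), z = c(0, 1, 1, 1))

# Criteria stored as unevaluated calls; a syntax error would surface here,
# when the list is built, rather than later at subsetting time.
criteria <- list(quote(x == 1), quote(y == 1), quote(y == 1 & z == 1))

# Converting an existing character vector: parse() returns an expression
# vector, so [[1]] extracts the single call it contains.
criteria_text <- c("x == 1", "y == 1", "y == 1 & z == 1")
criteria_from_text <- lapply(criteria_text, function(s) parse(text = s)[[1]])

# Evaluate each call with the data frame's columns in scope, then index rows.
results <- lapply(criteria, function(cond) {
  keep <- eval(cond, envir = dat)
  dat[keep, , drop = FALSE]
})
```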