using one_of/vars inside of a function in mutate_all

I need to subtract one column of a data set from every column of that data set. The name of the column to subtract is dynamic: it is stored outside of the data set and is represented as c("a") below.

dataset <- data.frame(a = c(0.021, 0.011, -0.031, -0.021, -0.041, 0.061), 
                      b = c(0.022, 0.012, -0.032, -0.022, -0.042, 0.062), 
                      c = c(0.010, 0.000, -0.020, 0.010, -0.030, 0.070))
dataset %>% mutate_all(funs( (. - one_of(c("a"))) ))

When I run this the resulting error is Evaluation error: Variable context not set. I know this must be related to calling one_of() inside of funs(). A slightly less elegant solution is:

dataset - dataset %>% select(one_of("a")) %>% pull

Nevertheless, I'm curious why I can't do the former.

In order to dynamically choose the column to subtract, you'll need to use tidyeval.

One way to write such a function: first capture the subtraction column as a quosure with enquo(), then use it inside mutate_all() to pick the column to subtract. The expression .[[quo_name(col_quo)]] turns the quosure into a column name and extracts that column from the piped data frame; it is the dynamic equivalent of .$a, which you might have used if the column were fixed.

library(dplyr)

dataset <- data.frame(a = c(0.021, 0.011, -0.031, -0.021, -0.041, 0.061), 
                      b = c(0.022, 0.012, -0.032, -0.022, -0.042, 0.062), 
                      c = c(0.010, 0.000, -0.020, 0.010, -0.030, 0.070))

subtract_col <- function(data, col) {
  col_quo <- enquo(col)

  data %>%
    mutate_all(function(x) x - .[[quo_name(col_quo)]])
}

subtract_col(dataset, a)
#>   a      b      c
#> 1 0  0.001 -0.011
#> 2 0  0.001 -0.011
#> 3 0 -0.001  0.011
#> 4 0 -0.001  0.031
#> 5 0 -0.001  0.011
#> 6 0  0.001  0.009

subtract_col(dataset, c)
#>        a      b c
#> 1  0.011  0.012 0
#> 2  0.011  0.012 0
#> 3 -0.011 -0.012 0
#> 4 -0.031 -0.032 0
#> 5 -0.011 -0.012 0
#> 6 -0.009 -0.008 0

Created on 2018-07-31 by the reprex package (v0.2.0).
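
Since the question stores the column name as a string, here is a minimal sketch of a string-based variant. It assumes dplyr 1.0.0 or later (for across()); the helper name subtract_col_chr and the variable mySelection are hypothetical and not part of the original answer. The reference column is copied before any column is modified, so the result does not depend on the order in which columns are processed.

```r
library(dplyr)

dataset <- data.frame(a = c(0.021, 0.011, -0.031, -0.021, -0.041, 0.061),
                      b = c(0.022, 0.012, -0.032, -0.022, -0.042, 0.062),
                      c = c(0.010, 0.000, -0.020, 0.010, -0.030, 0.070))

# Hypothetical helper: `col` is a single column name held in a string.
subtract_col_chr <- function(data, col) {
  stopifnot(is.character(col), length(col) == 1, col %in% names(data))
  ref <- data[[col]]  # snapshot of the reference column, taken up front
  # A plain closure finds `ref` lexically, so no tidyeval is needed here.
  data %>% mutate(across(everything(), function(x) x - ref))
}

mySelection <- "a"
result <- subtract_col_chr(dataset, mySelection)
result_c <- subtract_col_chr(dataset, "c")
```

Passing a different string, as in the last line, subtracts that column instead.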

You could do this:

dataset %>% mutate_all(`-`,.$a)
#   a      b      c
# 1 0  0.001 -0.011
# 2 0  0.001 -0.011
# 3 0 -0.001  0.011
# 4 0 -0.001  0.031
# 5 0 -0.001  0.011
# 6 0  0.001  0.009

Or similar to @Miha's comment:

dataset %>% transmute_all( funs(new=. - a))
#   a_new  b_new  c_new
# 1     0  0.001 -0.011
# 2     0  0.001 -0.011
# 3     0 -0.001  0.011
# 4     0 -0.001  0.031
# 5     0 -0.001  0.011
# 6     0  0.001  0.009

If we skip the new=, a is first subtracted from itself, so it becomes all zeros and can no longer be used for the other variables (thanks @aosmith).
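
The overwrite can be reproduced with a plain mutate() call, which evaluates its arguments in order. Treating the unnamed scoped call as equivalent to this expansion is an assumption made for illustration; the variable name overwritten is hypothetical.

```r
library(dplyr)

dataset <- data.frame(a = c(0.021, 0.011, -0.031, -0.021, -0.041, 0.061),
                      b = c(0.022, 0.012, -0.032, -0.022, -0.042, 0.062),
                      c = c(0.010, 0.000, -0.020, 0.010, -0.030, 0.070))

# `a` is replaced by zeros first, so `b - a` and `c - a` then subtract zeros
# and leave b and c unchanged.
overwritten <- dataset %>% mutate(a = a - a, b = b - a, c = c - a)

# Compare with the original columns to see that b and c did not change.
b_unchanged <- identical(overwritten$b, dataset$b)
c_unchanged <- identical(overwritten$c, dataset$c)
```

Writing the results to new names, as transmute_all(funs(new = . - a)) does, keeps the original a available for every subtraction.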

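Camille's answer has all the pieces to get there. If you are looking for a one-liner down the road, you can unquote a symbol built from the string. As with the unnamed funs() call in the previous answer, the chosen column is itself overwritten when its turn comes, so columns processed after it may be computed against the modified values; naming the result, as in funs(new = ...), avoids this.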

dataset %>% mutate_all(funs((. - !!rlang::sym(c("a")))))

Comments
  • You could also try this: dataset %>% mutate_at(vars(-one_of(c("a"))), funs(new = . - a))
  • Thanks Miha, but "a" will change, so I can't refer to it directly in the code. Thus funs(new = . - a) will get me in trouble.
  • Thank you, very helpful.
  • It's because a is the first variable subtracted from itself and so this mutated version (which is all 0) replaces the original a for the rest of the variables.
  • Thanks Moody, but my problem is that I can't refer to 'a' in the code since it will change. In fact, maybe it would be clearer if I changed c("a") to mySelection <- c("a").