Conditional mutating with regex in dplyr using rowSum


Let's say I have the following data frame:

test = read.table(text = "total_score_1 total_score_2 partner_total_score_1 total_score_3 total_score_4 letter
                  1 -1 1 1 -1 B
                  1 1 1 -1 1 C
                  -1 -1 -1 -1 1 A", header = T)

I'd like to match all columns that contain "total_score" but NOT the word "partner" and then create a new measure that sums across all "total_score" columns, treating the -1 as 0.

I'm able to take a basic rowSum like so

test %>%
  mutate(net_correct = rowSums(select(., grep("total_score", names(.)))))

Note, however, this does not exclude the possibility of matching the word "partner", which I wasn't able to find out how to do within a single grep command.
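
For what it's worth, a single pattern can handle the exclusion if Perl-style lookaheads are acceptable. The following is a minimal sketch, assuming the test data frame above and base R's grep() with perl = TRUE; the variable name keep is only illustrative.

```r
# Rebuild the example data so the snippet runs on its own.
test <- read.table(text = "total_score_1 total_score_2 partner_total_score_1 total_score_3 total_score_4 letter
1 -1 1 1 -1 B
1 1 1 -1 1 C
-1 -1 -1 -1 1 A", header = TRUE)

# (?!.*partner) is a negative lookahead anchored at the start of the name:
# it rejects any name that contains "partner" anywhere, after which the
# rest of the pattern requires "total_score" to appear.
keep <- grep("^(?!.*partner).*total_score", names(test), perl = TRUE, value = TRUE)

# Plain row sums over the matching columns (the -1s still count as -1 here).
net_correct <- rowSums(test[keep])
```

Because the pattern does not anchor "total_score" to the start of the name, it should also cope with arbitrary prefixes in front of it.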

However, I'd now like to create a total_correct value which is a rowSum on the same columns except the -1s are treated as 0s.

This would result in a data.frame like so:

  total_score_1 total_score_2 partner_total_score_1 total_score_3 total_score_4 letter total_sum
1             1            -1                     1             1            -1      B         2
2             1             1                     1            -1             1      C         3
3            -1            -1                    -1            -1             1      A         1

One way might be to just count the total number of 1s (rather than actually doing a sum), but I could not figure out how to do so within a mutate command.
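
Counting the 1s inside mutate() can be sketched by comparing the selected columns to 1 and summing the resulting logical matrix. This assumes dplyr is attached (for select(), contains() and the %>% pipe) and uses the example data from above; the result name counted is hypothetical.

```r
library(dplyr)

# Rebuild the example data so the snippet runs on its own.
test <- read.table(text = "total_score_1 total_score_2 partner_total_score_1 total_score_3 total_score_4 letter
1 -1 1 1 -1 B
1 1 1 -1 1 C
-1 -1 -1 -1 1 A", header = TRUE)

# select(...) == 1 yields a logical matrix; rowSums() then counts the TRUEs
# per row, so every -1 contributes 0 to the total.
counted <- test %>%
  mutate(total_correct = rowSums(
    select(., contains("total_score"), -contains("partner")) == 1
  ))
```

For the three example rows, total_correct comes out as 2, 3 and 1, matching the desired output.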

You could do:

test %>%
  mutate(net_correct = select(., setdiff(contains("total_score"), contains("partner"))) %>%
           replace(., . == -1, 0) %>%
           rowSums())

#  total_score_1 total_score_2 partner_total_score_1 total_score_3 total_score_4 letter net_correct
#1             1            -1                     1             1            -1      B           2
#2             1             1                     1            -1             1      C           3
#3            -1            -1                    -1            -1             1      A           1


You could simply modify your regex to capture only columns that start with 'total_score' using the caret character:

mutate(net_correct = rowSums(select(., grep("^total_score", names(.)))))

To treat the negative numbers as zero, you can use mutate_all():

test %>%
  mutate(total_correct = rowSums(select(., grep("^total_score", names(.))) %>% 
                                 mutate_all(function(x){as.numeric(x>0)})
                              )
  )


Another possibility could be:

test %>%
 mutate(net_correct = rowSums(select(., contains("total"), -contains("partner")) %>% 
                               replace(., . == -1, 0)))

  total_score_1 total_score_2 partner_total_score_1 total_score_3 total_score_4
1             1            -1                     1             1            -1
2             1             1                     1            -1             1
3            -1            -1                    -1            -1             1
  letter net_correct
1      B           2
2      C           3
3      A           1
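
The same idea can also be written without the nested pipe. This is a sketch assuming dplyr 1.0.0 or later, where across() is available and selection helpers can be combined with & and !. Using pmax() to clamp the negatives is a choice made here, not part of the answers above, and the result name res is hypothetical.

```r
library(dplyr)

# Rebuild the example data so the snippet runs on its own.
test <- read.table(text = "total_score_1 total_score_2 partner_total_score_1 total_score_3 total_score_4 letter
1 -1 1 1 -1 B
1 1 1 -1 1 C
-1 -1 -1 -1 1 A", header = TRUE)

res <- test %>%
  mutate(net_correct = rowSums(
    # Pick columns containing "total_score" but not "partner", and clamp
    # negative values to 0 before summing; the original columns are untouched.
    across(contains("total_score") & !contains("partner"), ~ pmax(.x, 0))
  ))
```

Here pmax(.x, 0) turns every negative score into 0, which for data limited to -1 and 1 is equivalent to the replace(., . == -1, 0) step.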


Comments
  • Woah. News to me that you can have pipes within pipes. That's amazing
  • Thank you! I think you missed the more important part of the question, which is to create a total_sum variable in which the negative values are treated as 0s. Also, in my actual df there are random prefixes in front of "total_score", so that grep won't actually do the trick. In the past, I was using something like index <- grepl("total_score", names(data)) & !grepl('partner', names(data)), but it was clumsy.
  • I updated it relative to your test data. You need mutate_all. For selecting the right columns, the strategy really depends on the structure of your data - if you have a few columns, it might be least hassle just to select them with a range - either to positively select the correct columns or to remove the partner columns, e.g. using select(.,-c(3,6))