Remove specific rows according to a text in R

I have a small issue with removing specific rows. In this example, I would like to remove the rows from the one containing "5055" in the column "power" through the one containing "Exer" in the column "fr". Importantly, I would like to apply this within each id (here, LM01-PRD-S1 and LB02-PRD-S1).

                   time   power        hr     fr          id
 1                  <NA>  5055       Zoti      E LM01-PRD-S1
 2              747 mmHg  <NA>       09/0   2016 LM01-PRD-S1
 3 9.7222222222222224E-3     0         76     20 LM01-PRD-S1
 4  2.013888888888889E-2     0         77     16 LM01-PRD-S1
 5 2.9861111111111113E-2     0         77     17 LM01-PRD-S1
 6                  <NA>  <NA>       <NA>   Exer LM01-PRD-S1
 7 1.0416666666666666E-2    25         90     24 LM01-PRD-S1
 8 1.9444444444444445E-2    25         92     23 LM01-PRD-S1
 9 3.0555555555555555E-2    25         93     22 LM01-PRD-S1
10                  <NA>  5055       Zoti      E LB02-PRD-S1
11              750 mmHg  <NA>       11/0   2016 LB02-PRD-S1
12 8.3333333333333332E-3     0         81     14 LB02-PRD-S1
13 1.6666666666666666E-2     0         96     15 LB02-PRD-S1
14 2.8472222222222222E-2     0         71     14 LB02-PRD-S1
15                  <NA>  <NA>       <NA>   Exer LB02-PRD-S1
16 1.0416666666666666E-2    35        102     16 LB02-PRD-S1
17 1.9444444444444445E-2    35        101     17 LB02-PRD-S1
18 3.0555555555555555E-2    35        105     15 LB02-PRD-S1

I tried the following, but it removed rows 1 to 15, whereas I would like to remove only rows 1 to 6 and 10 to 15.

df[-c(min(grep("5055", df[, "power"])):max(grep("Exer", df[, "fr"]))), ]

Here is the final result I would like to obtain.

                   time power    hr    fr          id
1 1.0416666666666666E-2    25    90    24 LM01-PRD-S1
2 1.9444444444444445E-2    25    92    23 LM01-PRD-S1
3 3.0555555555555555E-2    25    93    22 LM01-PRD-S1
4 1.0416666666666666E-2    35   102    16 LB02-PRD-S1
5 1.9444444444444445E-2    35   101    17 LB02-PRD-S1
6 3.0555555555555555E-2    35   105    15 LB02-PRD-S1

I hope I explained well. Thank you for your help!

Assuming each id has at least one "5055" in power and one "Exer" in fr, we can build the sequence of row numbers between the first occurrence of each within the id and keep only the rows that fall outside it.

library(dplyr)

df %>%
  group_by(id) %>%
  filter(!row_number() %in% (which.max(power == "5055"):which.max(fr == "Exer")))

#  time                  power hr    fr    id         
#  <fct>                 <fct> <fct> <fct> <fct>      
#1 1.0416666666666666E-2 25    90    24    LM01-PRD-S1
#2 1.9444444444444445E-2 25    92    23    LM01-PRD-S1
#3 3.0555555555555555E-2 25    93    22    LM01-PRD-S1
#4 1.0416666666666666E-2 35    102   16    LB02-PRD-S1
#5 1.9444444444444445E-2 35    101   17    LB02-PRD-S1
#6 3.0555555555555555E-2 35    105   15    LB02-PRD-S1
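The assumption matters because of how which.max() behaves on a logical vector: it returns the position of the first TRUE, but it also returns 1 when there is no TRUE at all. Here is a minimal sketch on made-up vectors (has_match, no_match and safe_first are hypothetical names, not part of the question's data):

```r
# Toy logical vectors (hypothetical values, not the question's data)
has_match <- c(FALSE, TRUE, FALSE, TRUE)
no_match  <- c(FALSE, FALSE, FALSE)

first_hit <- which.max(has_match)   # position of the first TRUE
silent    <- which.max(no_match)    # no TRUE at all, yet this still returns 1

# A guarded variant: which() returns integer(0) when nothing matches,
# so indexing with [1] yields NA instead of a misleading 1
safe_first <- which(no_match)[1]
```

So if some id might lack one of the two markers, checking for NA with something like which(...)[1] before building the sequence is probably safer than relying on which.max().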

data

df <- structure(list(time = structure(c(1L, 9L, 12L, 5L, 7L, 1L, 2L, 
4L, 8L, 1L, 10L, 11L, 3L, 6L, 1L, 2L, 4L, 8L), .Label = c("<NA>", 
"1.0416666666666666E-2", "1.6666666666666666E-2", "1.9444444444444445E-2", 
"2.013888888888889E-2", "2.8472222222222222E-2", "2.9861111111111113E-2", 
"3.0555555555555555E-2", "747mmHg", "750mmHg", "8.3333333333333332E-3", 
"9.7222222222222224E-3"), class = "factor"), power = structure(c(5L, 
1L, 2L, 2L, 2L, 1L, 3L, 3L, 3L, 5L, 1L, 2L, 2L, 2L, 1L, 4L, 4L, 
4L), .Label = c("<NA>", "0", "25", "35", "5055"), class = "factor"), 
hr = structure(c(15L, 2L, 8L, 9L, 9L, 1L, 11L, 12L, 13L, 
15L, 6L, 10L, 14L, 7L, 1L, 4L, 3L, 5L), .Label = c("<NA>", 
"09/0", "101", "102", "105", "11/0", "71", "76", "77", "81", 
"90", "92", "93", "96", "Zoti"), class = "factor"), fr = structure(c(10L, 
6L, 5L, 3L, 4L, 11L, 9L, 8L, 7L, 10L, 6L, 1L, 2L, 1L, 11L, 
3L, 4L, 2L), .Label = c("14", "15", "16", "17", "20", "2016", 
"22", "23", "24", "E", "Exer"), class = "factor"), id = structure(c(2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L), .Label = c("LB02-PRD-S1", "LM01-PRD-S1"), class = "factor")),
class = "data.frame", row.names = c("1", "2", "3", "4", "5", "6", "7", "8", 
"9", "10", "11", "12", "13", "14", "15", "16", "17", "18"))

If you are using base R, the following code may help:

r <- df[-unlist(lapply(data.frame(rbind(which(df$power == "5055"), which(df$fr == "Exer"))),function(v) seq(v[1],v[2]))),]

which gives

> r
                    time power  hr fr          id
7  1.0416666666666666E-2    25  90 24 LM01-PRD-S1
8  1.9444444444444445E-2    25  92 23 LM01-PRD-S1
9  3.0555555555555555E-2    25  93 22 LM01-PRD-S1
16 1.0416666666666666E-2    35 102 16 LB02-PRD-S1
17 1.9444444444444445E-2    35 101 17 LB02-PRD-S1
18 3.0555555555555555E-2    35 105 15 LB02-PRD-S1

Note: the row names remain the same as in the original data frame df. If you want to renumber the rows starting from 1, you can append rownames(r) <- seq(nrow(r)) to the code above, i.e.,

> rownames(r) <- seq(nrow(r))
> r
                   time power  hr fr          id
1 1.0416666666666666E-2    25  90 24 LM01-PRD-S1
2 1.9444444444444445E-2    25  92 23 LM01-PRD-S1
3 3.0555555555555555E-2    25  93 22 LM01-PRD-S1
4 1.0416666666666666E-2    35 102 16 LB02-PRD-S1
5 1.9444444444444445E-2    35 101 17 LB02-PRD-S1
6 3.0555555555555555E-2    35 105 15 LB02-PRD-S1
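The same base R idea can be spelled out in steps, which may be easier to read. This is only a sketch on a small stand-in data frame (toy, starts, ends, drop_rows and res are hypothetical names), and it assumes every "5055" row is followed by a matching "Exer" row, so that both which() calls return vectors of equal length in the same order:

```r
# Small stand-in data frame (hypothetical values) with two marker pairs
toy <- data.frame(
  power = c("5055", "0", "<NA>", "25", "5055", "<NA>", "35"),
  fr    = c("E", "20", "Exer", "24", "E", "Exer", "16"),
  stringsAsFactors = FALSE
)

starts <- which(toy$power == "5055")   # first row of each block to drop
ends   <- which(toy$fr == "Exer")      # last row of each block to drop

# Pair each start with its end and expand to the full run of row numbers
drop_rows <- unlist(Map(seq, starts, ends))

# setdiff() keeps every row when drop_rows is empty, whereas
# toy[-integer(0), ] would return zero rows
res <- toy[setdiff(seq_len(nrow(toy)), drop_rows), ]
rownames(res) <- NULL                  # renumber rows from 1
```

Because the start/end pairs are matched by position over the whole data frame, no explicit grouping by id is needed as long as the markers alternate.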

Using data.table:

library(data.table)
setDT(df)  # convert df to a data.table by reference
df[, .SD[cumsum(power == "5055" | shift(fr == "Exer")) %% 2 == 0], by = id]

            id                  time power  hr fr
1: LM01-PRD-S1 1.0416666666666666E-2    25  90 24
2: LM01-PRD-S1 1.9444444444444445E-2    25  92 23
3: LM01-PRD-S1 3.0555555555555555E-2    25  93 22
4: LB02-PRD-S1 1.0416666666666666E-2    35 102 16
5: LB02-PRD-S1 1.9444444444444445E-2    35 101 17
6: LB02-PRD-S1 3.0555555555555555E-2    35 105 15
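To see why the cumsum(...) %% 2 == 0 test works, it may help to trace it on a single group. The sketch below uses plain base R on made-up vectors and stands in for shift() with a manual lag (end_lagged is a hypothetical name; filling the first position with FALSE is my assumption, whereas shift() puts NA there by default):

```r
# One group's worth of values (hypothetical, modelled on the question)
power <- c("5055", "<NA>", "0", "0", "<NA>", "25", "25")
fr    <- c("E", "2016", "20", "16", "Exer", "24", "23")

start <- power == "5055"               # TRUE on the first row to drop
end   <- fr == "Exer"                  # TRUE on the last row to drop

# Lag the end marker by one row so the counter flips on the row AFTER "Exer"
end_lagged <- c(FALSE, head(end, -1))

# The counter is odd inside the block to drop and even outside it
toggle <- cumsum(start | end_lagged)
keep   <- toggle %% 2 == 0

kept_rows <- which(keep)               # rows 6 and 7 survive
```

The counter goes 1, 1, 1, 1, 1, 2, 2 here, so only the rows after "Exer" have an even value and are kept.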

Reproducible data:

df <- data.frame(
  time = c(
    "<NA>", "747mmHg", "9.7222222222222224E-3", "2.013888888888889E-2", 
    "2.9861111111111113E-2", "<NA>", "1.0416666666666666E-2", 
    "1.9444444444444445E-2", "3.0555555555555555E-2", "<NA>", "750mmHg", 
    "8.3333333333333332E-3", "1.6666666666666666E-2", "2.8472222222222222E-2", 
    "<NA>", "1.0416666666666666E-2", "1.9444444444444445E-2", 
    "3.0555555555555555E-2"
  ), 
  power = c(
    "5055", "<NA>", "0", "0", "0", "<NA>", "25", "25", "25", "5055", "<NA>", 
    "0", "0", "0", "<NA>", "35", "35", "35"
  ), 
  hr = c(
    "Zoti", "09/0", "76", "77", "77", "<NA>", "90", "92", "93", "Zoti", "11/0", 
    "81", "96", "71", "<NA>", "102", "101", "105"
  ), 
  fr = c(
    "E", "2016", "20", "16", "17", "Exer", "24", "23", "22", "E", "2016", "14", 
    "15", "14", "Exer", "16", "17", "15"
  ), 
  id = c(
    "LM01-PRD-S1", "LM01-PRD-S1", "LM01-PRD-S1", "LM01-PRD-S1", "LM01-PRD-S1", 
    "LM01-PRD-S1", "LM01-PRD-S1", "LM01-PRD-S1", "LM01-PRD-S1", "LB02-PRD-S1", 
    "LB02-PRD-S1", "LB02-PRD-S1", "LB02-PRD-S1", "LB02-PRD-S1", "LB02-PRD-S1", 
    "LB02-PRD-S1", "LB02-PRD-S1", "LB02-PRD-S1"
  ),
  stringsAsFactors = FALSE
)

Comments
  • Thank you. The code you proposed also deletes the rows for LM01-PRD-S1. The idea is to apply the code for each "id".
  • Hi @MaxStudent, as you can see in the results shown above, the rows for LM01-PRD-S1 are kept.
  • Exactly, I got that too. Thanks! I also tried it with more than 2 "id"s, but then it kept only the last id.
  • Do you mean this code does not work with more than 2 "id"s? That's weird, since it only filters out the rows between 5055 and Exer, so it should work for any number of "id"s. Could you paste your test data in the post? I want to check it. Thanks @MaxStudent