Filter rows with rowwise max value more than threshold value

I have a data frame such as:

x <- data.frame("Names"= c("name1","name2","name3"), "A" = c(0.1,0.1,0.8), "B" = c(0.3,0.4,0.3), "C" = c(0.05,0.9,0.05),"D" =c(0.6,0.1,0.3))

> x
  Names   A   B    C   D
1 name1 0.1 0.3 0.05 0.6
2 name2 0.1 0.4 0.90 0.1
3 name3 0.8 0.3 0.05 0.3

I would like to remove all rows where the maximum value across A, B, C and D is below 0.8, and get:

> x
  Names   A   B    C   D
2 name2 0.1 0.4 0.90 0.1
3 name3 0.8 0.3 0.05 0.3

The row name1 was removed because its maximum value was 0.6.

I would then like to write a file that lists, for each remaining name, the column holding the maximum and that maximum value. In this example it would be:

name2 : C with value 0.9
name3 : A with value 0.8

Thank you for your help.

You can use pmax, i.e.

x[do.call(pmax, x[-1]) >= 0.8, ]
#  Names   A   B    C   D
#2 name2 0.1 0.4 0.90 0.1
#3 name3 0.8 0.3 0.05 0.3
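
For readers new to the pmax idiom, here is a minimal sketch of what it computes, step by step, on the data from the question. The names row_max and keep are illustrative only and do not appear in the original answer.

```r
x <- data.frame(Names = c("name1", "name2", "name3"),
                A = c(0.1, 0.1, 0.8),
                B = c(0.3, 0.4, 0.3),
                C = c(0.05, 0.9, 0.05),
                D = c(0.6, 0.1, 0.3),
                stringsAsFactors = FALSE)

# x[-1] drops the Names column; do.call() then passes the remaining
# columns to pmax() as separate arguments, i.e. pmax(x$A, x$B, x$C, x$D),
# which returns the element-wise (here: row-wise) maximum.
row_max <- do.call(pmax, x[-1])
row_max
# [1] 0.6 0.9 0.8

# Logical index of the rows to keep (illustrative name)
keep <- row_max >= 0.8
x[keep, ]
```

Because pmax is vectorised over its arguments, this avoids looping over rows with apply.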

To filter the rows you could use any inside apply:

df <- x[apply(x[, -1], 1, function(x) any(x >= 0.8)), ]
#  Names   A   B    C   D
#2 name2 0.1 0.4 0.90 0.1
#3 name3 0.8 0.3 0.05 0.3

As for your second question, I'm not sure what you're trying to do. If this is about generating a vector of "result" strings you could do

apply(df, 1, function(x) {
    idx <- which.max(x[-1])
    sprintf("%s: %s with value %s", x[1], colnames(df)[idx + 1], x[-1][idx]) })
#                         2                          3
#"name2: C with value 0.90"  "name3: A with value 0.8"

Or if you prefer a data.frame perhaps something like this

ret <- data.frame(result = rep("", nrow(df)), stringsAsFactors = FALSE)
for (i in seq_len(nrow(df))) {
    # position of the largest value among the numeric columns of row i
    idx <- which.max(df[i, -1])
    ret$result[i] <- sprintf(
        "%s: %s with value %s",
        df[i, 1], colnames(df)[idx + 1], df[i, -1][idx])
}
ret
#                   result
#1 name2: C with value 0.9
#2 name3: A with value 0.8
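
Since the question asks for a file, the following is a hedged sketch of one way to build the same strings without a loop and write them out. It swaps in base R's max.col, which is not used in the answer above, and the output path is a temporary file chosen purely for illustration. The filtered data frame df is rebuilt here so the snippet runs on its own.

```r
# Filtered data from the question (rows whose maximum is >= 0.8)
df <- data.frame(Names = c("name2", "name3"),
                 A = c(0.1, 0.8),
                 B = c(0.4, 0.3),
                 C = c(0.9, 0.05),
                 D = c(0.1, 0.3),
                 stringsAsFactors = FALSE)

vals <- as.matrix(df[-1])                      # numeric matrix of columns A..D
idx  <- max.col(vals, ties.method = "first")   # column index of each row's maximum
top  <- vals[cbind(seq_len(nrow(vals)), idx)]  # the maximum value of each row

result <- sprintf("%s : %s with value %s", df$Names, colnames(vals)[idx], top)

# Hypothetical output location; replace with a real path as needed
out_file <- tempfile(fileext = ".txt")
writeLines(result, out_file)
```

With ties.method = "first", a row whose maximum occurs in several columns reports the leftmost one, which matches the behaviour of which.max.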

x[rowSums(x[-1] >= 0.8) != 0, ]

  Names   A   B    C   D
2 name2 0.1 0.4 0.90 0.1
3 name3 0.8 0.3 0.05 0.3
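
The rowSums approach can be read as counting, per row, how many values reach the threshold. A small sketch with illustrative intermediate names (hits and n_hits are not part of the original answer):

```r
x <- data.frame(Names = c("name1", "name2", "name3"),
                A = c(0.1, 0.1, 0.8),
                B = c(0.3, 0.4, 0.3),
                C = c(0.05, 0.9, 0.05),
                D = c(0.6, 0.1, 0.3),
                stringsAsFactors = FALSE)

hits   <- x[-1] >= 0.8    # logical matrix: TRUE where a value is at least 0.8
n_hits <- rowSums(hits)   # TRUE counts as 1, so this is the number of hits per row

# Keep the rows with at least one hit
x[n_hits != 0, ]
```

A row's maximum is at least 0.8 exactly when at least one of its values is, so this selects the same rows as the pmax version.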

A data.table solution :

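x <- data.table::data.table(x)
# keep the rows whose largest value across A..D is at least 0.8
x <- x[pmax(A, B, C, D) >= 0.8]
# for each name, report the column holding the maximum and its value
x[, paste(colnames(x)[1 + which(c(A, B, C, D) == max(A, B, C, D))], "with value", max(A, B, C, D)), by = Names]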
