How can I remove rows only when both columns are NA?

I have a df in R as follows:

ID    Age   Score1     Score2      
2      22    12           NA
3      19    11           22
4      20    NA           NA
1      21    NA           20

Now I want to remove only the rows where both Score1 and Score2 are missing (i.e. the 3rd row).

You can filter it like this:

df <- read.table(header = TRUE, text = "ID    Age   Score1     Score2      
2      22    12           NA
3      19    11           22
4      20    NA           NA
1      21    NA           20")
df[!(is.na(df$Score1) & is.na(df$Score2)), ]
#   ID Age Score1 Score2
# 1  2  22     12     NA
# 2  3  19     11     22
# 4  1  21     NA     20

That is, keep every row for which it is not (!) the case that Score1 is missing and (&) Score2 is missing.
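The negated condition can also be written as "Score1 is present or Score2 is present". The following is a small sketch that checks the two forms agree on the example data; the names both_missing, keep_a, keep_b and result are only illustrative.

```r
# Example data from the question, rebuilt so the snippet runs on its own
df <- data.frame(
  ID     = c(2L, 3L, 4L, 1L),
  Age    = c(22L, 19L, 20L, 21L),
  Score1 = c(12L, 11L, NA, NA),
  Score2 = c(NA, 22L, NA, 20L)
)

# TRUE only for rows where both scores are missing
both_missing <- is.na(df$Score1) & is.na(df$Score2)

# Two equivalent ways to express "keep this row"
keep_a <- !both_missing
keep_b <- !is.na(df$Score1) | !is.na(df$Score2)

identical(keep_a, keep_b)  # TRUE

# Rows with at least one score present
result <- df[keep_a, ]
```

Either logical vector can be used as the row index; which one reads better is a matter of taste.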

Note that na.omit() is the convenient choice when you want to drop every row that contains any NA. complete.cases() allows partial selection by checking only certain columns of the data frame, but it still drops a row as soon as one of those columns is NA, so neither fits the case where a row should go only when both scores are missing.

Here are two versions with dplyr, both of which extend to any number of columns with the prefix "Score".

Using filter_at

library(dplyr)

df %>% filter_at(vars(starts_with("Score")), any_vars(!is.na(.)))

#  ID Age Score1 Score2
#1  2  22     12     NA
#2  3  19     11     22
#3  1  21     NA     20

and filter_if

df %>% filter_if(startsWith(names(.),"Score"), any_vars(!is.na(.)))
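The scoped verbs filter_at and filter_if are marked as superseded in dplyr 1.0.0 and later. A possible equivalent, assuming dplyr 1.0.4 or newer is installed (the release that added if_any), might look like this; the data frame is rebuilt here only so the snippet is self-contained.

```r
library(dplyr)

# Example data from the question
df <- data.frame(
  ID     = c(2L, 3L, 4L, 1L),
  Age    = c(22L, 19L, 20L, 21L),
  Score1 = c(12L, 11L, NA, NA),
  Score2 = c(NA, 22L, NA, 20L)
)

# Keep rows where at least one "Score" column is not NA
result <- df %>%
  filter(if_any(starts_with("Score"), ~ !is.na(.x)))
```

As with the versions above, the column selection is driven by the "Score" prefix, so further score columns would be picked up without changing the call.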

A base R version with apply

df[apply(!is.na(df[startsWith(names(df),"Score")]), 1, any), ]

One option is rowSums

df1[ rowSums(is.na(df1[grep("Score", names(df1))])) < 2,]

Or another option with base R

df1[!Reduce(`&`, lapply(df1[grep("Score", names(df1))], is.na)),]

Data:

df1 <- structure(list(ID = c(2L, 3L, 4L, 1L), Age = c(22L, 19L, 20L, 
 21L), Score1 = c(12L, 11L, NA, NA), Score2 = c(NA, 22L, NA, 20L
 )), class = "data.frame", row.names = c(NA, -4L))
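The rowSums option above compares against a hard-coded 2, which only works while there are exactly two score columns. A minimal sketch that compares against the number of matched columns instead could look like the following; score_cols, n_missing and result are illustrative names, and the "^Score" pattern assumes the score columns all start with that prefix.

```r
# Example data from the question
df1 <- data.frame(
  ID     = c(2L, 3L, 4L, 1L),
  Age    = c(22L, 19L, 20L, 21L),
  Score1 = c(12L, 11L, NA, NA),
  Score2 = c(NA, 22L, NA, 20L)
)

# Positions of the columns whose names start with "Score"
score_cols <- grep("^Score", names(df1))

# Number of missing scores in each row
n_missing <- rowSums(is.na(df1[score_cols]))

# Drop a row only when every score column is missing
result <- df1[n_missing < length(score_cols), ]
```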

Comments
  • df %>% filter(!is.na(Score1) | !is.na(Score2))