r - check if every column is na

I have a list of columns within a data frame, and I want to check whether all of those columns are NA and create a new column that tells me whether they are NA or not.

Here is an example of it working with one column, where Any_Flag is my new column:

ItemStats_2014$Any_Flag <- ifelse(is.na(ItemStats_2014$Item_Flag_A), "Y", "N")

When I try to run the check over multiple columns, I am not getting what I expect:

ItemStats_2014$Any_Flag <- ifelse(all(is.na(ItemStats_2014[ ,grep("Flag", names(ItemStats_2014), value = T)])), "Y", "N")

It returns "N" (false) for every row.

Data

set.seed(1)
data <- c(LETTERS, NA)
df <- data.frame(Flag_A = sample(data), Flag_B = sample(data), 
                 C = sample(data), D = sample(data), Flag_E = sample(data))

df <- rbind(NA, df)

Code

Identifying all NAs per row:

> df$All_NA <- apply(df[, grep("Flag", names(df))], 1, function(x) all(is.na(x)))
> head(df)
  Flag_A Flag_B    C    D Flag_E All_NA
1   <NA>   <NA> <NA> <NA>   <NA>   TRUE
2      H      K    B    T      Y  FALSE
3      J      W    C    K      P  FALSE
4      O      I    H    I   <NA>  FALSE
5      V      L    M    S      R  FALSE
6      E      N    P    E      I  FALSE

Identifying at least one NA per row:

> df$Any_NA <- apply(df[, grep("Flag", names(df))], 1, function(x) anyNA(x))
> head(df)
  Flag_A Flag_B    C    D Flag_E Any_NA
1   <NA>   <NA> <NA> <NA>   <NA>   TRUE
2      H      K    B    T      Y  FALSE
3      J      W    C    K      P  FALSE
4      O      I    H    I   <NA>   TRUE
5      V      L    M    S      R  FALSE
6      E      N    P    E      I  FALSE
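
The attempt in the question returns "N" for every row because `all()` reduces the whole selection to a single `TRUE`/`FALSE`, which `ifelse()` then turns into one value that is recycled down the new column. As a rough sketch of a vectorized alternative to `apply()`, `rowSums()` over `is.na()` gives a per-row count of missing values. The toy data frame and its column names below are made up for illustration.

```r
# Toy data (hypothetical): two "Flag" columns and one unrelated column
toy <- data.frame(Flag_A = c(NA, "H", NA),
                  Flag_B = c(NA, "K", "W"),
                  C      = c("x", NA, "z"),
                  stringsAsFactors = FALSE)

flag_cols <- grep("Flag", names(toy), value = TRUE)

# is.na() on a data frame returns a logical matrix, one cell per value
na_matrix <- is.na(toy[, flag_cols, drop = FALSE])

# all() collapses the whole matrix into ONE value, hence the uniform "N"
collapsed <- all(na_matrix)

# Row-wise versions: count the NAs in each row instead
toy$All_NA <- rowSums(na_matrix) == length(flag_cols)
toy$Any_NA <- rowSums(na_matrix) > 0
```

Wrapping either result in `ifelse(..., "Y", "N")` gives the Y/N labels used in the question.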

I think that you are trying to test if a row (not a column) contains at least one NA.

Here is a dataset:

x = c(1:10, NA)
df = data.frame(A = sample(x), B = sample(x), C = sample(x))

And here is a loop over the rows that tests for that with anyNA:

df$Any_na = apply(df[,2:3], 1, anyNA)
df

    A  B  C Any_na
1  NA  8  9  FALSE
2   5  9 NA   TRUE
3   9  3 10  FALSE
4   7  5  1  FALSE
5   4  2  3  FALSE
6  10  4  6  FALSE
7   3  1  2  FALSE
8   6  6  5  FALSE
9   1 10  7  FALSE
10  2 NA  8   TRUE
11  8  7  4  FALSE

I'm not sure what the grep part is supposed to do, but here's a simpler way to accomplish what you want:

 apply(ItemStats_2014[, 2:10], MARGIN = 1, FUN = function(x) all(is.na(x)))

Replace 2:10 with whatever columns you want to check.

Amendment: If you want to detect which columns contain the word "Flag" rather than hard coding their indices -- which would be better anyway! -- I like the package stringr for working with text. You could do this to select your columns:

 library(stringr)
 MyCols <- which(str_detect(names(ItemStats_2014), "Flag"))

Now, replace 2:10 with MyCols in the apply(... code above.
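
The two steps can also be combined in base R without stringr. This is a minimal sketch: `flag_all_na()` is a hypothetical helper name and the `items` data is invented. `drop = FALSE` keeps the selection two-dimensional even when only one column matches, which `apply()` requires.

```r
# Hypothetical helper: "Y" where every column matching `pattern` is NA in that row
flag_all_na <- function(df, pattern) {
  cols <- grepl(pattern, names(df))
  if (!any(cols)) stop("no column names match the pattern")
  # as.data.frame() guards against data.table-style subsetting;
  # drop = FALSE keeps a one-column selection as a data frame
  sel <- as.data.frame(df)[, cols, drop = FALSE]
  ifelse(apply(sel, 1, function(x) all(is.na(x))), "Y", "N")
}

# Invented example data
items <- data.frame(Item_Flag_A = c(NA, "R", NA),
                    Item_Flag_B = c(NA, NA, NA),
                    Price       = c(1.5, NA, 3),
                    stringsAsFactors = FALSE)

items$Any_Flag <- flag_all_na(items, "Flag")
```

Matching on column names rather than positions means the check keeps working if columns are added or reordered.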

And a data.table way without any apply is:

library(arsenal)  # provides allNA()
library(data.table)

# dummy data
set.seed(1)
data = c(LETTERS, NA)
dt = data.table(Flag_A=sample(data), Flag_B = sample(data), C=sample(data), D=sample(data), Flag_E=sample(data))
dt = rbind(NA, dt)

# All-NA/Any-NA check
columns_to_check = names(dt)[grep('Flag', names(dt))]
dt[, AllNA:=allNA(.SD), by=1:nrow(dt), .SDcols = columns_to_check]
dt[, AnyNA:=anyNA(.SD), by=1:nrow(dt), .SDcols = columns_to_check]
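
One caveat when the table is a data.table: `dt[, grep("Flag", names(dt))]` evaluates the `grep()` call in `j` and returns the integer positions rather than the columns, so passing that result to `apply()` fails with `dim(X) must have a positive length`. The sketch below selects the columns through `.SDcols` instead and avoids the per-row `by`. The small table `dt2` is made up for illustration.

```r
library(data.table)

# Invented example table
dt2 <- data.table(Flag_A = c(NA, "H", NA),
                  Flag_B = c(NA, "K", "W"),
                  C      = c("x", "y", "z"))

flag_cols <- grep("Flag", names(dt2), value = TRUE)

# j is evaluated as an expression: this is an integer vector, not a table
positions <- dt2[, grep("Flag", names(dt2))]

# .SD holds only the columns named in .SDcols; is.na(.SD) is a logical matrix
dt2[, AllNA := rowSums(is.na(.SD)) == length(flag_cols), .SDcols = flag_cols]
dt2[, AnyNA := rowSums(is.na(.SD)) > 0, .SDcols = flag_cols]
```

Because `rowSums()` works on the whole matrix at once, this should scale better than grouping by every row, although that has not been benchmarked here.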

This might help you get started. Note that it checks each column (not each row) for being entirely NA:

# Sample dataframe
dfx <- data.frame(
x = c(21L, 21L, 21L, 22L, 22L, NA),
y = c(1449, 1814, 582, 582, 947, 183),
s = c(26.4, 28.7, 32, 25.3, NA, 25.7),
z = c(NA,NA,NA,NA,NA,NA)
)

# Sapply works well here 
ifelse(sapply(dfx, function(x)all(is.na(x))) == TRUE, "Y","N")

output :

 x   y   s   z 
"N" "N" "N" "Y"
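
The same per-column check can be sketched without an anonymous function: `colSums(is.na(...))` counts the missing values in each column, and comparing that count with `nrow()` marks the columns that are entirely `NA`. The data frame repeats the sample above under a different (hypothetical) name, `dfy`, so the snippet runs on its own.

```r
dfy <- data.frame(
  x = c(21L, 21L, 21L, 22L, 22L, NA),
  y = c(1449, 1814, 582, 582, 947, 183),
  s = c(26.4, 28.7, 32, 25.3, NA, 25.7),
  z = c(NA, NA, NA, NA, NA, NA)
)

# Number of NAs per column, compared with the number of rows
all_na_cols <- colSums(is.na(dfy)) == nrow(dfy)

# Names of the columns that are entirely NA
names(dfy)[all_na_cols]

# Same "Y"/"N" labels as the sapply() version
ifelse(all_na_cols, "Y", "N")
```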

Comments
  • This is what I have been trying to do, but I keep getting this error: Error in apply(ItemStats_2014[, grep("Item_Flag", names(ItemStats_2014))], : dim(X) must have a positive length
  • Can you paste the output of str(ItemStats_2014, list.len = 10) ?
  • I assume you are looking for the structure of those fields. There are many columns. The ones I am trying to use in the function are characters. I am actually using an if statement to do the opposite of your Any_Flag output, just so your request makes a little more sense.
  • ` $ Item_Flag_A : chr NA NA NA NA ... $ Item_Flag_B : chr NA NA NA NA ... $ Item_Flag_C : chr NA NA NA NA ... $ Item_Flag_N : chr NA NA NA NA ... $ Item_Flag_O : chr NA NA NA NA ... $ Item_Flag_P : chr NA NA NA NA ... $ Item_Flag_R : chr "R" NA NA NA ... $ Item_Flag_V : chr NA NA NA NA ... $ Item_Flag_Z : chr NA NA NA NA ...`
  • str(ItemStats_2014) Classes ‘data.table’ and 'data.frame': 19435 obs. of 151 variables:
  • Right, a row, but not of the whole data set, just the part of the data set where 'Flag' is in the column name.
  • The grep is intended to pick out the columns where 'Flag' is in the name. Is there any way to adapt [, 2:10] to find the columns with 'Flag' in the name?
  • Here is the error I get with apply: Error in apply(ItemStats_2014[, grep("Flag", names(ItemStats_2014))], : dim(X) must have a positive length