Count NAs between first and last occurred numbers

Here is my toy dataset

df <- tribble(
  ~x, ~y, ~z,
  7,   NA, 4,
  8,   2,  NA,
  NA,  NA, NA,
  NA,  4,  6)

For each column, I want a data frame with two counts: the number of NAs between the first and the last non-NA value, and the number of NAs between the first non-NA value and the last row. For this example, the desired output is

desired_df <- tribble(~vars, ~na_count_between_1st_last_num, ~na_count_between_1st_num_last_row,
                       "x",     0,                              2,
                       "y",     1,                              1,
                       "z",     2,                              2)

How can I get the desired output?

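zoo's na.trim trims NAs off both ends, or off just the left or right end if we specify sides = "left" or sides = "right", so:

library(dplyr)
library(tidyr)
library(zoo)

df %>%
  pivot_longer(everything()) %>%
  group_by(name) %>%
  summarize(na1 = sum(is.na(na.trim(value))),
            na2 = sum(is.na(na.trim(value, "left")))) %>%
  ungroup()

giving: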

# A tibble: 3 x 3
  name    na1   na2
  <chr> <int> <int>
1 x         0     2
2 y         1     1
3 z         2     2
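
For readers without zoo installed, the trimming idea can be sketched in base R. The helper name trim_na and its handling of an all-NA vector (returning an empty vector) are assumptions made for illustration; it is not zoo's implementation.

```r
# Hypothetical helper, not part of zoo: drop NAs from one or both ends of a vector.
trim_na <- function(x, sides = c("both", "left", "right")) {
  sides <- match.arg(sides)
  keep <- which(!is.na(x))
  if (length(keep) == 0L) return(x[0])  # assumption: all-NA input trims to empty
  first <- if (sides == "right") 1L else keep[1]
  last  <- if (sides == "left") length(x) else keep[length(keep)]
  x[first:last]
}

y <- c(NA, 2, NA, 4, NA)
trim_na(y)                      # 2 NA 4
trim_na(y, "left")              # 2 NA 4 NA
sum(is.na(trim_na(y)))          # 1
sum(is.na(trim_na(y, "left")))  # 2
```

Counting the NAs that survive the trim is then the same sum(is.na(...)) step used in the pipeline above.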

Here is an idea via base R,

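# f1: indices from the first to the last non-NA value
f1 <- function(x) {i1 <- which(!is.na(x)); head(i1, 1):tail(i1, 1) }
# f2: indices from the first non-NA value to the last element
f2 <- function(x) {i1 <- which(!is.na(x)); head(i1, 1):length(x) }

merge(stack(sapply(df, function(i) sum(is.na(i[f1(i)])))), 
      stack(sapply(df, function(i) sum(is.na(i[f2(i)])))), by = 'ind')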

#  ind values.x values.y
#1   x        0        2
#2   y        1        1
#3   z        2        2
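
One edge case worth noting: for a column that contains no numbers at all, which(!is.na(x)) is empty and the `:` operator in f1 and f2 stops with an error. A minimal guarded sketch follows; the helper name count_na_span and the choice to return NA for such a column are assumptions, not part of the answer above.

```r
# Hypothetical helper: both NA counts for one column, guarded against all-NA input.
count_na_span <- function(x) {
  i1 <- which(!is.na(x))
  if (length(i1) == 0L) {
    # Assumption: with no numbers there is no span, so report NA instead of failing.
    return(c(between = NA_integer_, to_end = NA_integer_))
  }
  first <- i1[1]
  last  <- i1[length(i1)]
  c(between = sum(is.na(x[first:last])),       # first number .. last number
    to_end  = sum(is.na(x[first:length(x)])))  # first number .. last row
}

df <- data.frame(x = c(7, 8, NA, NA), y = c(NA, 2, NA, 4), z = c(4, NA, NA, 6))
res <- t(sapply(df, count_na_span))  # one row per column: x 0 2, y 1 1, z 2 2
count_na_span(c(NA, NA, NA))         # NA NA rather than an error
```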

Here is one possibility using two functions:

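fun1 <- function(x) { # count NAs between the first and last non-NA values
  idx1 <- cumsum(!is.na(x)) > 0           # TRUE from the first non-NA onward (masks leading NAs)
  idx2 <- rev(cumsum(rev(!is.na(x)))) > 0 # TRUE up to the last non-NA (masks trailing NAs)
  sum(is.na(x[idx1 & idx2]))
}

fun2 <- function(x) { # count NAs between the first non-NA value and the last element
  idx1 <- cumsum(!is.na(x)) > 0           # TRUE from the first non-NA onward
  sum(is.na(x[idx1]))
}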

Afterwards you just summarise your data.frame and reshape it:

df %>% summarise_all(list(m1 = ~fun1(.), m2 = ~fun2(.))) %>%
  pivot_longer(cols = everything(), names_pattern = "^(.)_(.*)$", names_to = c("vars", "a"),
               values_to = "x") %>%
  spread(a, x)

# A tibble: 3 x 3
  vars     m1    m2
  <chr> <int> <int>
1 x         0     2
2 y         1     1
3 z         2     2
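
To see how the cumsum masks used by fun1 and fun2 behave, it may help to trace them on a single vector. This is only an illustrative sketch; the vector and the variable names seen_before, seen_after and inside are made up for the example.

```r
x <- c(NA, 2, NA, 4, NA)

# Running count of non-NA values seen so far; > 0 means "at or after the first number".
seen_before <- cumsum(!is.na(x)) > 0            # FALSE TRUE TRUE TRUE TRUE

# Same idea from the right-hand end: "at or before the last number".
seen_after <- rev(cumsum(rev(!is.na(x)))) > 0   # TRUE TRUE TRUE TRUE FALSE

inside <- seen_before & seen_after              # FALSE TRUE TRUE TRUE FALSE

sum(is.na(x[inside]))       # 1: NAs between the first and last number
sum(is.na(x[seen_before]))  # 2: NAs from the first number to the last element
```

Because the masks are plain logical vectors, a vector with no numbers simply yields all-FALSE masks and a count of 0 instead of an error.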

Here is another option using data.table::nafill:

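library(data.table)

natrail <- colSums(is.na(as.data.table(nafill(df, "nocb"))))  # trailing NAs survive a backward fill
nastart <- colSums(is.na(as.data.table(nafill(df, "locf"))))  # leading NAs survive a forward fill
n1last <- nrow(df) - colSums(!is.na(df)) - nastart            # all NAs minus the leading ones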
n1num <- n1last - natrail

cbind(na_count_between_1st_last_num=n1num, na_count_between_1st_num_last_row=n1last)

Output, using the extended data given below:
  na_count_between_1st_last_num na_count_between_1st_num_last_row
x                             0                                 2
y                             1                                 1
z                             2                                 2
a                             1                                 2
b                             0                                 0
d                             0                                 1

Data (the two counts expected for each column are given in the comments):
df <- data.frame(x=c(7,8,NA,NA), #0 2
    y=c(NA, 2, NA, 4),           #1 1
    z=c(4, NA, NA, 6),           #2 2
    a=c(1, NA, 1, NA),           #1 2
    b=c(NA, NA, 1, 1),           #0 0
    d=c(NA, 1, 1, NA))           #0 1
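
The arithmetic behind this approach (total NAs minus leading NAs, then minus trailing NAs) can be checked in base R without data.table. The sketch below swaps nafill for rle to find the leading and trailing NA runs; the helper name edge_na is hypothetical, and the code assumes a non-empty vector with at least one non-NA value.

```r
# Hypothetical helper: lengths of the leading and trailing NA runs of a vector.
# Assumes x is non-empty and contains at least one non-NA value.
edge_na <- function(x) {
  r <- rle(is.na(x))
  n <- length(r$lengths)
  c(lead  = if (r$values[1]) r$lengths[1] else 0L,
    trail = if (r$values[n]) r$lengths[n] else 0L)
}

a <- c(1, NA, 1, NA)           # column a from the data above
e <- edge_na(a)                # lead = 0, trail = 1
total_na <- sum(is.na(a))      # 2
to_last_row <- total_na - e[["lead"]]     # 2: NAs from the first number to the last row
between <- to_last_row - e[["trail"]]     # 1: NAs between the first and last number
```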

  • On my real data, I get the output # A tibble: 1 x 2 vars <NA> <chr> <int> 1 NA 0, and I am not sure why. Is it because of names_pattern = "^(.)_(.*)$"?
  • Probably because of names_pattern: (.) matches exactly one character, so column names longer than one character do not match. Best check ?pivot_longer to see how to use names_pattern. Maybe you can simply use names_pattern = "^(.*)_(.*)$".
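
The difference between the two patterns can be checked with base R regex functions before touching pivot_longer. The column names below are hypothetical stand-ins for what summarise_all would produce on data with longer variable names.

```r
# Hypothetical summarised column names: <variable>_<m1 or m2>
cols <- c("x_m1", "height_m1", "body_mass_m2")

one_char <- "^(.)_(.*)$"   # first group is exactly one character
any_len  <- "^(.*)_(.*)$"  # first group is greedy, so the split is at the last underscore

grepl(one_char, cols)      # TRUE FALSE FALSE
grepl(any_len, cols)       # TRUE TRUE TRUE

sub(any_len, "\\1", cols)  # "x" "height" "body_mass"
sub(any_len, "\\2", cols)  # "m1" "m1" "m2"
```

Since the suffixes m1 and m2 contain no underscore, splitting at the last underscore keeps variable names such as body_mass intact.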