## Dummy variable "switch-point" in R

I have a dummy variable that serves as a flag for a number of conditions in my data set. **I can't figure out how to write a function that marks the spot at which the flag makes its "final switch" -- to a value that will not change for the rest of the data frame.** *In the example below, everything from the 7th observation onward is a "y".*

    dplyr::tibble(
      observation = c(seq(1, 10)),
      crop = c(runif(3, 1, 25), runif(1, 50, 100), runif(2, 1, 10), runif(4, 50, 100)),
      flag = c(rep("n", 3), rep("y", 1), rep("n", 2), rep("y", 4))
    )

Which yields:

       observation  crop flag
             <int> <dbl> <chr>
     1           1 13.3  n
     2           2  4.34 n
     3           3 17.1  n
     4           4 80.5  y
     5           5  9.62 n
     6           6  8.39 n
     7           7 92.6  y
     8           8 74.1  y
     9           9 95.3  y
    10          10 69.9  y

I've tried creating a second flag that marks every switch and returns the "final" switch/flag variable, but over my whole data frame that will likely be highly inefficient. Any suggestions are welcome and appreciated.

One way to do this may be to create a flag that cumulatively sums occurrences of flag switches.

    library(dplyr)  # provides %>%, mutate() and lag()

    # Cumulative sum that treats NA as 0 (lag() returns NA for the first row)
    cumsum_na <- function(x) {
      x[which(is.na(x))] <- 0
      return(cumsum(x))
    }

    df <- dplyr::tibble(
      observation = c(seq(1, 10)),
      crop = c(runif(3, 1, 25), runif(1, 50, 100), runif(2, 1, 10), runif(4, 50, 100)),
      flag = c(rep("n", 3), rep("y", 1), rep("n", 2), rep("y", 4))
    )

    # flag2 increases by 1 each time flag differs from the previous row
    df %>%
      mutate(flag2 = ifelse(flag != lag(flag), 1, 0) %>% cumsum_na)

    # A tibble: 10 x 4
       observation  crop flag  flag2
             <int> <dbl> <chr> <dbl>
     1           1 12.1  n         0
     2           2 11.2  n         0
     3           3  4.66 n         0
     4           4 61.6  y         1
     5           5  6.00 n         2
     6           6  9.54 n         2
     7           7 67.6  y         3
     8           8 86.7  y         3
     9           9 91.6  y         3
    10          10 84.5  y         3

You can then do whatever you need with the `flag2` column (e.g. filter for its max value and take the first row, which gives you the first observation of the final constant state).
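As a rough sketch of that last step, the filter-and-take-first-row idea could look like the following. The `lag(..., default = first(flag))` form is my substitution for the `cumsum_na` helper, `switch_point` is just an illustrative name, and the `crop` column is left out so the example is deterministic.

```r
library(dplyr)

# Deterministic stand-in for the question's data (crop column omitted)
df <- tibble(
  observation = 1:10,
  flag = c(rep("n", 3), "y", rep("n", 2), rep("y", 4))
)

switch_point <- df %>%
  # Count switches; the default makes the first comparison FALSE instead of NA
  mutate(flag2 = cumsum(flag != lag(flag, default = first(flag)))) %>%
  # Keep only the last run of identical flags ...
  filter(flag2 == max(flag2)) %>%
  # ... and take its first row, i.e. the final switch
  slice(1)

switch_point$observation
```

If the flag never switches, this returns the first row of the data frame, which may or may not be what you want.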

I count all the "n" values first; once the final "n" is reached, I take the index of the next observation.

    i <- 0  # number of "n" values seen so far
    j <- 1  # current row index
    while (i < table(df$flag)["n"]) {
      if (as.character(df[j, 3]) == "n") {
        i <- i + 1
        j <- j + 1
      } else {
        j <- j + 1
      }
    }

After the loop, `j` is the index you are looking for.
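The same index can probably be found without a loop. Here is a minimal sketch, assuming the "final" value is whatever the flag holds in the last row; `final_switch` is a hypothetical helper name, not something from the question.

```r
# Index of the first row of the final constant run:
# one past the last position whose value differs from the last value.
final_switch <- function(flag) {
  differs <- which(flag != flag[length(flag)])
  if (length(differs) == 0) {
    1L  # flag never changes, so the constant run starts at row 1
  } else {
    max(differs) + 1L
  }
}

flag <- c(rep("n", 3), "y", rep("n", 2), rep("y", 4))
j <- final_switch(flag)
j
```

Unlike the loop, this does not assume the non-final value is "n", so it also covers data that ends in a run of "n".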

We can make use of `rleid` from `data.table`:

    library(data.table)
    setDT(df)[, flag2 := rleid(flag)]
    df
    #     observation      crop flag flag2
    #  1:           1 21.472985    n     1
    #  2:           2 21.563190    n     1
    #  3:           3  1.393184    n     1
    #  4:           4 88.422562    y     2
    #  5:           5  6.383627    n     3
    #  6:           6  8.484030    n     3
    #  7:           7 86.998953    y     4
    #  8:           8 62.220592    y     4
    #  9:           9 93.141503    y     4
    # 10:          10 96.006885    y     4
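To go from the run ids to the switch point itself, one option is to pick the first observation of the highest run id. This is a sketch on a deterministic copy of the data; `dt` and `switch_obs` are illustrative names and the `crop` column is omitted.

```r
library(data.table)

# Deterministic stand-in for the question's data (crop column omitted)
dt <- data.table(
  observation = 1:10,
  flag = c(rep("n", 3), "y", rep("n", 2), rep("y", 4))
)

# rleid() assigns an increasing id to each run of identical values
dt[, flag2 := rleid(flag)]

# First observation belonging to the last run
switch_obs <- dt[flag2 == max(flag2), observation[1]]
switch_obs
```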

##### Comments

- By constant, I mean never to switch again for the rest of the data frame/observations. Will clarify that in original question.
- I edited my answer; let me know if this doesn't solve your problem!
- Appreciate your help on this.
- rleid seems useful. I'm trying to keep most of my operations in tibbles/dplyr, but I appreciate you offering this answer. Good to know about.
- @BSHuniversity No problem. You can just change the syntax to `df %>% mutate(flag2 = rleid(flag))`