Conditional rolling average in R

    dat <- structure(list(yearRef = c(1970, 1971, 1972, 1973, 1974, 1975, 
    1976, 1977, 1978, 1979, 1980, 1981, 1982, 1983, 1984, 1985, 1986, 
    1987, 1988, 1989, 1990, 1991, 1992, 1993, 1994, 1995, 1996, 1997, 
    1998, 1999, 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 
    2009, 2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017, 2018), 
    value = c(0.761253538863966, 0.778365700864592, 0.748473422160476, 
    0.790408287413012, 0.726707786670043, 0.80587461240495, 0.81582881742434, 
    0.914998995290579, 0.903241004636529, 0.883446087736501, 
    0.878399385374308, 0.790239960507709, 0.853841173129717, 
    0.972923769177295, 0.899133969911117, 0.865840008976815, 
    0.85942147306247, 0.9471790327507, 0.905362802563981, 0.91644169495142, 
    0.985789564141214, 0.978212191208007, 0.885157529562834, 
    1.01638026873823, 1.02702020472382, 0.944421276774342, 0.979439113456467, 
    0.951183598644539, 1.12054063623421, 1.00767230122493, 1.02132151007705, 
    0.95649988168142, 0.928385199359045, 1.05071183719421, 1.11654102944792, 
    0.910601547182633, 0.936460862711605, 1.2398210426787, 0.979036947391532, 
    1.09931214756341, 1.12206830109171, 0.997384903912461, 1.07413151131128, 
    0.967026290186151, 1.04921352764649, 1.08746580600605, 1.02444885186573, 
    1.14604631626466, 1.06449109417896)), class = c("tbl_df", 
    "tbl", "data.frame"), row.names = c(NA, -49L))

For each year, I want to calculate the mean of the top 5 values from the previous 7 values. For example, the first mean value will be for 1977 and will be the mean of the best 5 years from 1970 to 1976. Similarly, for 1978 the mean will be taken over the top 5 values from 1971 to 1977, and for 2018 over the top 5 values from 2011 to 2017.

I have the following code from SO which sort of does the job.

  library(data.table)
  library(zoo)

  setDT(dat)

  dat[, mean.val:= if (.N > 6) 
        rollapplyr(value, 7,function(x) mean(tail(sort(x), 5)), fill = NA)  
        else mean(value)] 

Although the first value in the new column mean.val is correct, it has been assigned to the 1976 row when it should be assigned to the 1977 row.
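To make the expected alignment concrete, here is a small hand check in base R. It is only a sketch: it copies the first eight values from the data above and computes what the 1977 row should contain, namely the mean of the five largest values from 1970 to 1976. The names prev7 and expected_1977 are made up for illustration.

```r
# First eight values of `value` from the question (1970 to 1977)
value <- c(0.761253538863966, 0.778365700864592, 0.748473422160476,
           0.790408287413012, 0.726707786670043, 0.80587461240495,
           0.81582881742434, 0.914998995290579)
yearRef <- 1970:1977

# The window for 1977 is the seven preceding years, 1970 to 1976
prev7 <- value[yearRef >= 1970 & yearRef <= 1976]

# Mean of the five largest values in that window
expected_1977 <- mean(tail(sort(prev7), 5))
round(expected_1977, 4)  # 0.7903
```

This is the value that the code above places on the 1976 row.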

You want to process the 7 points PRIOR to the current point rather than the 7 points that end at the current point. To do that, use a width of list(-(1:7)), which says to use offsets -1 through -7 when processing the data. See ?rollapply for more information on specifying the width argument.

This (1) specifies the intention more directly, which makes it easier to follow than approaches that ignore the required offsets and fix them up afterwards, (2) uses only the packages you are already using, (3) expresses the solution compactly, and (4) preserves your solution, changing only one argument.

  dat[, mean.val:= if (.N > 6) 
        rollapply(value, list(-(1:7)), function(x) mean(tail(sort(x), 5)), fill = NA)  
        else mean(value)] 
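The difference between the two window specifications is easier to see on a toy vector. The following is a minimal sketch, assuming zoo is installed; the vector x and the names ending_here and prior_only are illustrative and not part of the original answer.

```r
library(zoo)

x <- 1:10
top5_mean <- function(w) mean(tail(sort(w), 5))

# Right-aligned width 7: each window ENDS at the current element
ending_here <- rollapplyr(x, 7, top5_mean, fill = NA)

# Offsets -1 to -7: each window covers only the seven PRIOR elements
prior_only <- rollapply(x, list(-(1:7)), top5_mean, fill = NA)

# Side-by-side comparison of the two alignments
data.frame(x, ending_here, prior_only)
```

With the offset form, the first non-NA result moves from position 7 to position 8, which is the alignment the question asks for.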


If the only issue is that the values should be shifted down 1 row, you can use shift to fix this.

dat[, mean.val := shift(mean.val)]

FYI, if you're on data.table version 1.12.4 or later, you don't need zoo and can use data.table::frollapply instead.

dat[, mean.val2 := 
      shift(frollapply(value, 7, function(x) mean(tail(sort(x), 5))))]

dat[, all.equal(mean.val, mean.val2)] #TRUE


This simple for loop solves the problem:

dat$mean.val = NA

for(i in 8:nrow(dat))
{
  dat$mean.val[i] = mean(sort(dat$value[(i-7):(i-1)],decreasing = TRUE)[1:5])
}
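The same loop can be wrapped in a reusable function. This is a sketch only: prev_top_mean and its arguments width and k are hypothetical names introduced here, and the short vector x is made-up data rather than the question's.

```r
# Hypothetical helper: mean of the k largest values among the `width`
# observations that precede each position (NA where fewer are available).
prev_top_mean <- function(x, width = 7, k = 5) {
  vapply(seq_along(x), function(i) {
    if (i <= width) return(NA_real_)          # not enough history yet
    window <- x[(i - width):(i - 1)]          # the `width` prior values
    mean(sort(window, decreasing = TRUE)[seq_len(k)])
  }, numeric(1))
}

# Made-up example data, nine observations
x <- c(0.76, 0.78, 0.75, 0.79, 0.73, 0.81, 0.82, 0.91, 0.90)
res <- prev_top_mean(x)
res
```

Applied to the question's data, it would be called as dat$mean.val <- prev_top_mean(dat$value).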


I think you can use the tsibble package, which has a convenient rolling function, and then use the lag function to shift the results down one row.

library(tidyverse)

dat <- structure(list(yearRef = c(1970, 1971, 1972, 1973, 1974, 1975, 
                                  1976, 1977, 1978, 1979, 1980, 1981, 1982, 1983, 1984, 1985, 1986, 
                                  1987, 1988, 1989, 1990, 1991, 1992, 1993, 1994, 1995, 1996, 1997, 
                                  1998, 1999, 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 
                                  2009, 2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017, 2018), 
                      value = c(0.761253538863966, 0.778365700864592, 0.748473422160476, 
                                0.790408287413012, 0.726707786670043, 0.80587461240495, 0.81582881742434, 
                                0.914998995290579, 0.903241004636529, 0.883446087736501, 
                                0.878399385374308, 0.790239960507709, 0.853841173129717, 
                                0.972923769177295, 0.899133969911117, 0.865840008976815, 
                                0.85942147306247, 0.9471790327507, 0.905362802563981, 0.91644169495142, 
                                0.985789564141214, 0.978212191208007, 0.885157529562834, 
                                1.01638026873823, 1.02702020472382, 0.944421276774342, 0.979439113456467, 
                                0.951183598644539, 1.12054063623421, 1.00767230122493, 1.02132151007705, 
                                0.95649988168142, 0.928385199359045, 1.05071183719421, 1.11654102944792, 
                                0.910601547182633, 0.936460862711605, 1.2398210426787, 0.979036947391532, 
                                1.09931214756341, 1.12206830109171, 0.997384903912461, 1.07413151131128, 
                                0.967026290186151, 1.04921352764649, 1.08746580600605, 1.02444885186573, 
                                1.14604631626466, 1.06449109417896)), class = c("tbl_df", 
                                                                                "tbl", "data.frame"), row.names = c(NA, -49L))

complex_function <- . %>% 
  sort %>% 
  tail(.,5) %>% 
  mean

dat %>%
  mutate(roll_avg  = tsibble::slide_dbl(.x = value,.f = complex_function,.size = 7),
         roll_avg2 = lag(roll_avg))
#> # A tibble: 49 x 4
#>    yearRef value roll_avg roll_avg2
#>      <dbl> <dbl>    <dbl>     <dbl>
#>  1    1970 0.761   NA        NA    
#>  2    1971 0.778   NA        NA    
#>  3    1972 0.748   NA        NA    
#>  4    1973 0.790   NA        NA    
#>  5    1974 0.727   NA        NA    
#>  6    1975 0.806   NA        NA    
#>  7    1976 0.816    0.790    NA    
#>  8    1977 0.915    0.821     0.790
#>  9    1978 0.903    0.846     0.821
#> 10    1979 0.883    0.865     0.846
#> # … with 39 more rows

Created on 2020-01-14 by the reprex package (v0.3.0)


Comments
  • Is the issue only that the results should be shifted down 1 row? If so, you can fix that with the shift function, i.e. dat[, mean.val := shift(mean.val)]
  • Yeah I think the problem is that 1970 until 1976 consists of 7 values (rows). Therefore the value gets assigned at 1976. Shifting the data one row is the easiest solution.
  • In your complex_function, do you mean to say mean instead of sum?
  • Yes, well spotted. Fixing it.
  • @89_Simple Does it work for you now? I would throw away the year 1975 for correctness but still
  • Not really. Working by hand, the mean value for 1977 should be 0.7903 (the mean of 0.761, 0.778, 0.790, 0.806 and 0.816).
  • Oh I guess you need lag