## Rolling average using groupby and varying window length

I'm trying to create a rolling average of a column based on an ID column and a measurement time label in R, but I am having a lot of trouble with it.

Here is what my dataframe looks like:

    ID Measurement Value
    A            1    10
    A            2    12
    A            3    14
    B            1    10
    B            2    12
    B            3    14
    B            4    10

The problem is that the number of measurements varies from 9 to 76 per ID, and I haven't found a solution that creates a rolling-average column for each ID while handling the varying window length.

My goal is a dataframe like this:

    ID Measurement Value Average
    A            1    10      NA
    A            2    12      11
    A            3    14      12
    B            1    10      NA
    B            2    12      11
    B            3    14      12
    B            4    10    11.5

This uses no packages. It calculates the cumulative average by ID, except that for `Measurement` equal to 1 it forces the average to be `NA`.

    transform(DF, Avg = ave(Value, ID, FUN = cumsum) /
                        ifelse(Measurement == 1, NA, Measurement))

giving:

      ID Measurement Value  Avg
    1  A           1    10   NA
    2  A           2    12 11.0
    3  A           3    14 12.0
    4  B           1    10   NA
    5  B           2    12 11.0
    6  B           3    14 12.0
    7  B           4    10 11.5
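The one-liner above divides by `Measurement`, which works when the labels run 1, 2, 3, ... without gaps within each ID. If that cannot be guaranteed, a variant that divides by the row position inside each group may be safer. The following is only a sketch under that assumption; the helper name `pos` is introduced here for illustration, and the data frame is rebuilt inline so the snippet runs on its own.

```r
# Same sample data as in the question, rebuilt inline
DF <- data.frame(
  ID = c("A", "A", "A", "B", "B", "B", "B"),
  Measurement = c(1, 2, 3, 1, 2, 3, 4),
  Value = c(10, 12, 14, 10, 12, 14, 10)
)

# Position of each row within its ID group (1, 2, 3, ...),
# independent of what the Measurement labels actually are
pos <- ave(DF$Value, DF$ID, FUN = seq_along)

# Cumulative sum per ID divided by the row position;
# the first row of each group is forced to NA
DF$Avg <- ave(DF$Value, DF$ID, FUN = cumsum) / ifelse(pos == 1, NA, pos)

print(DF)
```

Because the window simply grows with each row of a group, the differing number of measurements per ID (9 to 76 in the question) needs no special handling.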

##### Note

The input `DF` in reproducible form is:

    Lines <- "ID Measurement Value
    A 1 10
    A 2 12
    A 3 14
    B 1 10
    B 2 12
    B 3 14
    B 4 10"
    DF <- read.table(text = Lines, header = TRUE, strip.white = TRUE, as.is = TRUE)


With your data:

    library(dplyr)
    dat %>%
      group_by(Id) %>%
      mutate(Avrg = cumsum(Value)/(1:n()))

    # A tibble: 7 x 4
    # Groups:   Id [2]
      Id    Measurement Value  Avrg
      <chr>       <int> <int> <dbl>
    1 A               1    10  10
    2 A               2    12  11
    3 A               3    14  12
    4 B               1    10  10
    5 B               2    12  11
    6 B               3    14  12
    7 B               4    10  11.5

Data:

    dat <- structure(list(
      Id = c("A", "A", "A", "B", "B", "B", "B"),
      Measurement = c(1L, 2L, 3L, 1L, 2L, 3L, 4L),
      Value = c(10L, 12L, 14L, 10L, 12L, 14L, 10L)
    ), class = "data.frame", row.names = c(NA, -7L))

P.S. I am pretty sure that the average of 10 is 10, not NA


    library(dplyr)
    data %>%
      group_by(ID) %>%
      mutate(rolling_mean = cummean(Value))

The first row of each group (ID) will be the mean of that single value, i.e. the value itself, not `NA`.
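If the `NA` in the first row of each group is really wanted, as in the desired output in the question, one possible way is to blank it out after computing the cumulative mean. This is a sketch, not part of the original answer: it assumes dplyr is installed, and the data frame `data` and the column name `rolling_mean` are simply reused from the snippet above.

```r
library(dplyr)

# Sample data from the question, rebuilt inline so the snippet is self-contained
data <- data.frame(
  ID = c("A", "A", "A", "B", "B", "B", "B"),
  Measurement = c(1, 2, 3, 1, 2, 3, 4),
  Value = c(10, 12, 14, 10, 12, 14, 10)
)

result <- data %>%
  group_by(ID) %>%
  # cummean() gives the expanding mean; row_number() == 1 marks the
  # first row of each group, which is replaced by NA
  mutate(rolling_mean = if_else(row_number() == 1, NA_real_, cummean(Value))) %>%
  ungroup()

print(result)
```

`NA_real_` is used rather than plain `NA` because `if_else()` is strict about both branches having compatible types.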


##### Comments

- Valuable answer, I'm facing the same problem.