## Moving variance in R

I know that the `filter()`

function in R calculate the moving average. I would like to know if exists a function that return me the moving variance or standard deviation, in order to show it in a plot side by side with the output of `filter()`

function.

Consider the **zoo** package. For example `filter()`

gives:

> filter(1:100, rep(1/3,3)) Time Series: Start = 1 End = 100 Frequency = 1 [1] NA 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 [26] 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 [51] 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 [76] 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 NA

whereas `rollmean()`

in **zoo** gives:

> rollmean(1:100, k = 3, na.pad = TRUE) [1] NA 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 [26] 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 [51] 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 [76] 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 NA

which is the same (for a 3 point moving average in this example).

Whilst **zoo** doesn't have a `rollsd()`

or `rollvar()`

it does have `rollapply()`

, which works like the `apply()`

functions to apply any R function to the specified window.

> rollapply(1:100, width = 3, FUN = sd, na.pad = TRUE) [1] NA 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 [26] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 [51] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 [76] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 NA Warning message: In rollapply.zoo(zoo(data), ...) : na.pad argument is deprecated

or on something more interesting:

> rollapply(vec, width = 3, FUN = sd, na.pad = TRUE) [1] NA 0.3655067 0.8472871 0.5660495 0.3491970 0.4732417 0.9236859 [8] 0.8075226 1.8725851 1.1930784 0.6329325 1.1412416 0.8430772 0.5808005 [15] 0.3838545 1.1738170 1.1655400 1.3241700 0.6876834 0.1534157 0.4858477 [22] 0.9843506 0.6002713 0.6897541 2.0619563 2.5675788 6.3522039 6.0066864 [29] 6.2618432 5.1704866 2.1360853 2.5602557 1.0408528 1.0316396 4.9441628 [36] 5.0319314 5.7589716 3.2425000 4.8788158 2.0847286 4.5199291 2.5323486 [43] 2.1987149 1.8393000 1.2278639 1.5998965 1.5341485 4.4287108 4.4159166 [50] 4.3224546 3.6959067 4.9826264 5.3134044 8.4084322 9.1249234 7.5506725 [57] 3.8499136 3.9680487 5.6362296 4.9124095 4.3452706 4.0227141 4.5867559 [64] 4.7350394 4.3203807 4.4506799 7.2759499 7.6536424 7.8487654 2.0905576 [71] 4.0056880 5.6209853 1.5551659 1.3615268 2.8469458 2.8323588 1.9848578 [78] 1.1201124 1.4248380 1.7802571 1.4281773 2.5481935 1.8554451 1.0925410 [85] 2.1823722 2.2788755 2.4205378 2.0733741 0.7462248 1.3873578 1.4265948 [92] 0.7212619 0.7425993 1.0696432 2.4520585 3.0555819 3.1000885 1.0945292 [99] 0.3726928 NA Warning message: In rollapply.zoo(zoo(data), ...) : na.pad argument is deprecated

You can get rid of the warning by using the `fill = NA`

argument, as in

> rollapply(vec, width = 3, FUN = sd, fill = NA)

**runsd function,** Moving (aka running, rolling) Window's Standard Deviation calculated over a vector. The exponential moving average is a weighted moving average that reduces influences by applying more weight to recent data points reduction factor 2/(n+1); or r for``running", this is an exponential moving average with a reduction factor of 1/n [same as the modified average?].

The TTR package has `runSD`

among others:

> library(TTR) > ls("package:TTR", pattern="run*") [1] "runCor" "runCov" "runMAD" "runMax" "runMean" [6] "runMedian" "runMin" "runSD" "runSum" "runVar"

`runSD`

will be much faster than `rollapply`

because it avoids making many R function calls. For example:

rzoo <- function(x,n) rollapplyr(x, n, sd, fill=NA) rttr <- function(x,n) runSD(x, n) library(rbenchmark) set.seed(21) x <- rnorm(1e4) all.equal(rzoo(x,250), rttr(x,250)) # [1] TRUE benchmark(rzoo(x,250), rttr(x,250))[,1:6] # test replications elapsed relative user.self sys.self # 2 rttr(x, 250) 100 0.58 1.000 0.58 0.00 # 1 rzoo(x, 250) 100 54.53 94.017 53.85 0.06

**[R] moving mean and moving variance functions,** [R] moving mean and moving variance functions. Gabor Grothendieck ggrothendieck at gmail.com. Mon Apr 4 15:17:16 CEST 2011. Previous message: [R]� The main incentive to write this set of functions was relative slowness of majority of moving window functions available in R and its packages. With the exception of runmed, a running window median function, all functions listed in "see also" section are slower than very inefficient “apply(embed(x,k),1,FUN)” approach.

`rollapply`

in the `zoo`

package takes an arbitrary function. It's different from `filter`

in that it excludes the `NA`

s by default.

That being said, though, there's not much sense in loading a package for a function that's so simple to roll yourself (pun intended).

Here's one that's right aligned:

my.rollapply <- function(vec, width, FUN) sapply(seq_along(vec), function(i) if (i < width) NA else FUN(vec[i:(i-width+1)])) set.seed(1) vec <- sample(1:50, 50) my.rollapply(vec, 3, sd) [1] NA NA 7.094599 12.124356 16.522712 18.502252 18.193405 7.234178 8.144528 [10] 14.468356 12.489996 3.055050 20.808652 19.467922 18.009257 18.248288 15.695010 7.505553 [19] 10.066446 11.846237 17.156146 6.557439 5.291503 23.629078 22.590558 21.197484 22.810816 [28] 24.433583 19.502137 16.165808 11.503623 12.288206 9.539392 13.051181 13.527749 19.974984 [37] 19.756855 17.616280 19.347696 18.248288 15.176737 6.082763 10.000000 10.016653 4.509250 [46] 2.645751 1.527525 5.291503 10.598742 6.557439 # rollapply output for comparison rollapply(vec, width=3, sd, fill=NA, align='right') [1] NA NA 7.094599 12.124356 16.522712 18.502252 18.193405 7.234178 8.144528 [10] 14.468356 12.489996 3.055050 20.808652 19.467922 18.009257 18.248288 15.695010 7.505553 [19] 10.066446 11.846237 17.156146 6.557439 5.291503 23.629078 22.590558 21.197484 22.810816 [28] 24.433583 19.502137 16.165808 11.503623 12.288206 9.539392 13.051181 13.527749 19.974984 [37] 19.756855 17.616280 19.347696 18.248288 15.176737 6.082763 10.000000 10.016653 4.509250 [46] 2.645751 1.527525 5.291503 10.598742 6.557439

**[PDF] Package 'roll',** "Updating Mean and Variance Estimates: An Improved Method. A list of objects with the rolling and expanding r-squareds for each y. the estimated population variance is 8.4 square inches, and the estimated population standard deviation is 2.92 inches (rounded off). Using R to compute standard deviation As is the case with variance, using R to compute the standard deviation is easy: You use the sd() function.

`runner`

function in runner package applies any R function on running windows. With `runner`

one can specify simple window by setting the length `k`

or `lag`

. Below moving `sd`

as suggested by OOP on 4-elements windows.

library(runner) set.seed(1) x <- rnorm(20, sd = 1) runner(x, sd, k = 4, na_pad = TRUE) #[1] NA NA NA 1.1021597 0.9967429 1.1556947 0.9884053 0.6902835 0.7180483 0.4647160 #[11] 0.7454670 0.7489618 0.9449882 1.5821988 1.4459037 1.3889432 1.3954101 0.6193867 0.5296744 0.4266423

To apply running functions on date-windows one should specify `idx`

. Length of `idx`

should be the same length as `x`

and should be of type Date or integer. Example below illustrates window of size `k = 4`

lagged by `lag = 1`

. In parentheses index ranges for each window.

idx <- c(4, 6, 7, 13, 17, 18, 18, 21, 27, 31, 37, 42, 44, 47, 48) runner::runner(x = 1:15, k = 5, lag = 1, idx = idx, f = function(x) mean(x)) # [1] NA 1.0 1.5 NA 4.0 4.5 4.5 6.0 NA 9.0 NA 11.0 12.0 12.5 13.5

More info in documentation and vignettes

**Convenience Functions, Moving Window Statistics, and Graphics,** Functions for calculating moving-window statistics. The base R function range returns the minimum and maximum of a vector, but the for σ2: the MLE (n in denominator) vs. the sample variance (n−1 in denominator). The international passenger data series (G) time series data requires more robust methods such as Moving Median, Kernal Smoothing, ARIMA, or UCM (see “Unobserved Component Models using R”). Nevertheless, R offers several useful function for exponential smoothing, including some not discussed here, for instance in the QCC-Package.

**Rolling variance in r,** This random variable (proportion of 6s) has mean p =1/6 and variance p*(1-p)/n. 67$. The rolling regression estimator of R is defined as: Thus the weights are� In this case, because we used a moving range of length 2, the average moving range gives us an estimate of the average distance between our consecutive individual data points. A moving range of length 2 is Minitab’s default, but that can be changed by clicking the I-MR Options button in the I-MR chart dialog, and then choosing the Estimate tab:

**Moving Window Statistics — GSL 2.6 documentation,** The moving window variance calculates the sample variance of the values of ei = gsl_ran_gaussian(r, 0.1); gsl_vector_set(x, i, xi + ei); } /* compute moving� In a moving average regression model, a variable of interest is assumed to be a weighted moving average of unobserved independent error terms; the weights in the moving average are parameters to be estimated.

**R help - moving mean and moving variance functions,** moving mean and moving variance functions. Hello Lets say as an example I have a dataframe with the following attributes: rownum(1:405),� Let \(w_t \overset{iid}{\sim} N(0, \sigma^2_w)\), meaning that the w t are identically, independently distributed, each with a normal distribution having mean 0 and the same variance. The 1 st order moving average model, denoted by MA(1) is: \(x_t = \mu + w_t +\theta_1w_{t-1}\) The 2 nd order moving average model, denoted by MA(2) is:

##### Comments

- Thank you Gavin, I guess this is an add-on question, but do you know of a base R solution that uses
`ts`

? - @Zhubarb nope, not a single function. You could code this out of the base R parts (as
`rollapply()`

has been), but there is nothing canned*that I am aware of* - Nice package, thank you! For other users, it looks like if you supply a vector of length
`<n`

this function yields an error, which means if you apply it to many groups in a dataset (e.g. different stocks, using`data.table`

), and any of them has too little data, the whole operation will yield an error of type`is outside valid range:...`

. Users could solve this by wrapping the function to give different behavior (like returning`NA`

s) when too little data is supplied. - @MichaelOhlrogge: thanks for the comment! I like the idea of returning all
`NA`

when too little data is supplied. I'll consider how to implement something like that directly in the TTR functions (see #68). - nice solution. It is a little faster than
`rollapplyr`

. By some reason, in`sd`

, in about 500 data, there is a little difference: 1e-14. Why there should be rounding(?) issues?