Var

Running variance using Welford's algorithm.

Parameters

  • ddof

    Default → 1

    Delta Degrees of Freedom. The divisor used in calculations is n - ddof, where n represents the number of seen elements.
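To make the role of ddof concrete, here is a minimal batch sketch that uses only the Python standard library, not river. The helper name variance_with_ddof is hypothetical, and returning 0.0 when n - ddof is not positive is an assumption chosen to mirror the first value printed in the example further down.

```python
import statistics


def variance_with_ddof(values, ddof=1):
    """Batch variance with divisor n - ddof (hypothetical helper, not part of river)."""
    n = len(values)
    if n <= ddof:
        # Assumption: report 0.0 while the divisor would not be positive.
        return 0.0
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / (n - ddof)


X = [3, 5, 4, 7, 10, 12]
sample_var = variance_with_ddof(X, ddof=1)      # divisor n - 1
population_var = variance_with_ddof(X, ddof=0)  # divisor n
```

With ddof=1 this matches statistics.variance (sample variance), and with ddof=0 it matches statistics.pvariance (population variance).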

Attributes

  • mean

    The running mean of the observed values. It is maintained because the variance cannot be computed without it.
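The dependence on the mean can be seen in a textbook version of Welford's algorithm. The sketch below is illustrative only and is not river's implementation: the class name WelfordVar and the attribute m2 are hypothetical, and returning 0.0 while n <= ddof is an assumption.

```python
class WelfordVar:
    """Minimal Welford-style running variance (illustrative, not river's code)."""

    def __init__(self, ddof=1):
        self.ddof = ddof
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, x):
        self.n += 1
        delta = x - self.mean               # deviation from the old mean
        self.mean += delta / self.n         # move the mean towards x
        self.m2 += delta * (x - self.mean)  # old deviation times new deviation
        return self

    def get(self):
        if self.n <= self.ddof:
            return 0.0  # assumption: no estimate yet
        return self.m2 / (self.n - self.ddof)


var = WelfordVar()
results = [var.update(x).get() for x in [3, 5, 4, 7, 10, 12]]
```

Each update touches only n, mean, and m2, so no past observations need to be stored.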

Examples

from river import stats

X = [3, 5, 4, 7, 10, 12]

var = stats.Var()
for x in X:
    print(var.update(x).get())
0.0
2.0
1.0
2.916666
7.7
12.56666

You can measure a rolling variance by using a utils.Rolling wrapper:

from river import utils

X = [1, 4, 2, -4, -8, 0]
rvar = utils.Rolling(stats.Var(ddof=1), window_size=3)
for x in X:
    print(rvar.update(x).get())
0.0
4.5
2.333333
17.333333
25.333333
16.0
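As a cross-check on the rolling output above, the following sketch recomputes the sample variance from scratch over the last three values, using only the standard library. The function name rolling_sample_variance is hypothetical, and returning 0.0 for a window holding a single value is an assumption made to match the first printed line.

```python
from collections import deque
import statistics


def rolling_sample_variance(values, window_size):
    """Recompute the sample variance over the last `window_size` values."""
    window = deque(maxlen=window_size)  # old values fall out automatically
    out = []
    for v in values:
        window.append(v)
        # statistics.variance needs at least two points; assume 0.0 before that.
        out.append(statistics.variance(window) if len(window) > 1 else 0.0)
    return out


rolling = rolling_sample_variance([1, 4, 2, -4, -8, 0], window_size=3)
```

This brute-force version does work proportional to the window size at every step, so it is only a reference to compare against.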

Methods

get

Return the current value of the statistic.

revert

update

Update and return the called instance.

Parameters

  • x — 'numbers.Number'
  • w — defaults to 1.0
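The page does not spell out how w enters the computation. The sketch below assumes w acts as a frequency weight, so that updating with weight w is equivalent to observing x exactly w times. The class WeightedVar is hypothetical, and the update rule is the standard weighted variant of Welford's recurrence, not code taken from river.

```python
import statistics


class WeightedVar:
    """Weighted running variance, assuming w is a frequency weight."""

    def __init__(self, ddof=1):
        self.ddof = ddof
        self.n = 0.0     # total weight seen so far
        self.mean = 0.0
        self.s = 0.0     # weighted sum of squared deviations

    def update(self, x, w=1.0):
        old_mean = self.mean
        self.n += w
        self.mean += (w / self.n) * (x - old_mean)
        self.s += w * (x - old_mean) * (x - self.mean)
        return self

    def get(self):
        if self.n <= self.ddof:
            return 0.0  # assumption: no estimate yet
        return self.s / (self.n - self.ddof)


weighted = WeightedVar().update(3, w=2).update(5, w=1).update(10, w=3).get()
repeated = statistics.variance([3, 3, 5, 10, 10, 10])
```

Under that assumption, weighted and repeated agree up to floating-point error.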

update_many

Notes

The results of the incremental and parallel updates are consistent with numpy's batch computation when ddof ≤ 1.
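For intuition about the parallel case, here is a sketch of the pairwise merge formula described by Chan, Golub and LeVeque, written with the standard library only. The helpers summarize and merge are hypothetical names, and the sketch does not claim to reproduce river's internals.

```python
import statistics


def summarize(values):
    """Return (n, mean, m2), where m2 is the sum of squared deviations."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values)
    return n, mean, m2


def merge(a, b):
    """Combine two partial summaries without revisiting the raw data."""
    n_a, mean_a, m2_a = a
    n_b, mean_b, m2_b = b
    n = n_a + n_b
    delta = mean_b - mean_a
    mean = mean_a + delta * n_b / n
    m2 = m2_a + m2_b + delta ** 2 * n_a * n_b / n
    return n, mean, m2


X = [3, 5, 4, 7, 10, 12]
n, mean, m2 = merge(summarize(X[:2]), summarize(X[2:]))
merged_var = m2 / (n - 1)  # ddof = 1
```

The merged value equals the sample variance of the full list up to floating-point error, which is the kind of consistency the note describes.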


  1. Wikipedia article on algorithms for calculating variance 

  2. Chan, T.F., Golub, G.H. and LeVeque, R.J., 1983. Algorithms for computing the sample variance: Analysis and recommendations. The American Statistician, 37(3), pp.242-247. 

  3. Schubert, E. and Gertz, M., 2018, July. Numerically stable parallel computation of (co-)variance. In Proceedings of the 30th International Conference on Scientific and Statistical Database Management (pp. 1-12).