EmpiricalCovariance

Empirical covariance matrix.

Parameters

  • ddof

    Default → 1

    Delta Degrees of Freedom. The divisor used in the calculation is n - ddof, where n is the number of samples seen.
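
To make the effect of ddof concrete, here is a minimal pure-Python sketch of a batch covariance. The helper name sample_cov is hypothetical and not part of river; it assumes the usual convention that the sum of cross-deviations is divided by n - ddof.

```python
def sample_cov(xs, ys, ddof=1):
    """Covariance of two equal-length sequences, dividing by n - ddof.

    Hypothetical helper for illustration only; not river's implementation.
    """
    n = len(xs)
    if n != len(ys):
        raise ValueError("xs and ys must have the same length")
    if n - ddof <= 0:
        raise ValueError("need more than ddof samples")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations from the two means.
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return cross / (n - ddof)


xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]
unbiased = sample_cov(xs, ys, ddof=1)  # divides by n - 1
biased = sample_cov(xs, ys, ddof=0)    # divides by n
```

With ddof=1 the estimate is the unbiased sample covariance; with ddof=0 it is the population formula, which is always smaller in magnitude for the same data.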

Attributes

  • matrix

Examples

import numpy as np
import pandas as pd
from river import covariance

np.random.seed(42)
X = pd.DataFrame(np.random.random((8, 3)), columns=["red", "green", "blue"])
X
        red     green      blue
0  0.374540  0.950714  0.731994
1  0.598658  0.156019  0.155995
2  0.058084  0.866176  0.601115
3  0.708073  0.020584  0.969910
4  0.832443  0.212339  0.181825
5  0.183405  0.304242  0.524756
6  0.431945  0.291229  0.611853
7  0.139494  0.292145  0.366362

cov = covariance.EmpiricalCovariance()
for x in X.to_dict(orient="records"):
    cov.update(x)
cov
        blue     green    red
 blue    0.076    0.020   -0.010
green    0.020    0.113   -0.053
  red   -0.010   -0.053    0.079

There is also an update_many method to process mini-batches. The results are identical.

cov = covariance.EmpiricalCovariance()
cov.update_many(X)
cov
        blue     green    red
 blue    0.076    0.020   -0.010
green    0.020    0.113   -0.053
  red   -0.010   -0.053    0.079

The covariances are stored in a dictionary keyed by pairs of feature names, so any individual entry can be looked up directly:

cov["blue", "green"]
Cov: 0.020292

Diagonal entries are variances:

cov["blue", "blue"]
Var: 0.076119

Methods

revert

Downdate with a single sample, that is, remove the contribution of a previously seen sample.

Parameters

  • x (dict)

update

Update with a single sample.

Parameters

  • x (dict)
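
The idea behind updating and downdating with a single sample can be sketched for one pair of variables as follows. This is an illustrative pure-Python sketch under stated assumptions, not river's actual code: the class name RunningCov and its attributes are hypothetical, and it uses a Welford-style recurrence on the running sum of cross-deviations.

```python
class RunningCov:
    """Hypothetical running covariance of two variables (illustration only)."""

    def __init__(self, ddof=1):
        self.ddof = ddof
        self.n = 0
        self.mean_x = 0.0
        self.mean_y = 0.0
        self.c = 0.0  # running sum of cross-deviations

    def update(self, x, y):
        """Add one sample."""
        self.n += 1
        dx = x - self.mean_x  # deviation from the old mean of x
        self.mean_x += dx / self.n
        self.mean_y += (y - self.mean_y) / self.n
        self.c += dx * (y - self.mean_y)  # uses the new mean of y

    def revert(self, x, y):
        """Remove one previously added sample (exact inverse of update)."""
        if self.n == 0:
            raise ValueError("nothing to revert")
        if self.n == 1:
            self.n, self.mean_x, self.mean_y, self.c = 0, 0.0, 0.0, 0.0
            return
        n_old = self.n - 1
        mean_x_old = (self.n * self.mean_x - x) / n_old
        # Undo the increment applied in update, which used the current mean of y.
        self.c -= (x - mean_x_old) * (y - self.mean_y)
        self.mean_y = (self.n * self.mean_y - y) / n_old
        self.mean_x = mean_x_old
        self.n = n_old

    def get(self):
        """Current covariance estimate, or 0.0 if too few samples."""
        if self.n <= self.ddof:
            return 0.0
        return self.c / (self.n - self.ddof)


pairs = [(1.0, 2.0), (2.0, 1.0), (4.0, 5.0), (3.0, 7.0)]
rc = RunningCov()
for x, y in pairs[:3]:
    rc.update(x, y)
before = rc.get()

rc.update(*pairs[3])
rc.revert(*pairs[3])  # undo the last update
after = rc.get()
```

Reverting a sample right after adding it restores the previous estimate up to floating-point error, which is what makes this kind of statistic usable over a sliding window.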

update_many

Update with a dataframe of samples.

Parameters

  • X (pd.DataFrame)
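
One way a mini-batch update can give the same result as one-by-one updates is by merging per-batch statistics. The sketch below shows this for one pair of variables using the standard pairwise merge formula for means and cross-deviation sums. The function names batch_stats and merge_stats are hypothetical, and this is an assumption about how such an update could work, not a description of river's internals.

```python
def batch_stats(pairs):
    """Return (n, mean_x, mean_y, c) for a non-empty list of (x, y) pairs.

    c is the sum of cross-deviations, i.e. the covariance before dividing
    by n - ddof.
    """
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    c = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    return n, mean_x, mean_y, c


def merge_stats(a, b):
    """Combine the statistics of two disjoint batches."""
    n_a, mx_a, my_a, c_a = a
    n_b, mx_b, my_b, c_b = b
    n = n_a + n_b
    dx = mx_b - mx_a
    dy = my_b - my_a
    mean_x = mx_a + dx * n_b / n
    mean_y = my_a + dy * n_b / n
    # Correction term accounts for the shift between the two batch means.
    c = c_a + c_b + dx * dy * n_a * n_b / n
    return n, mean_x, mean_y, c


pairs = [(1.0, 2.0), (2.0, 1.0), (4.0, 5.0), (3.0, 7.0), (0.5, 3.0)]
merged = merge_stats(batch_stats(pairs[:2]), batch_stats(pairs[2:]))
whole = batch_stats(pairs)
```

Here merged and whole agree up to floating-point error, so splitting the data into mini-batches does not change the final estimate.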