Skip to content

R2

R-Squared

R-Squared (RS) 1 is the complement of the ratio of sum of squared distances between objects in different clusters to the total sum of squares. It is an intuitive and simple formulation of measuring the differences between clusters.

The maximum value of R-Squared is 1, which means that the higher the index, the better the clustering results.

Attributes

  • bigger_is_better

    Indicates if a high value is better than a low one or not.

Examples

>>> from river import cluster
>>> from river import stream
>>> from river import metrics

>>> X = [
...     [1, 2],
...     [1, 4],
...     [1, 0],
...     [4, 2],
...     [4, 4],
...     [4, 0],
...     [-2, 2],
...     [-2, 4],
...     [-2, 0]
... ]

>>> k_means = cluster.KMeans(n_clusters=3, halflife=0.4, sigma=3, seed=0)
>>> metric = metrics.cluster.R2()

>>> for x, _ in stream.iter_array(X):
...     k_means = k_means.learn_one(x)
...     y_pred = k_means.predict_one(x)
...     metric = metric.update(x, y_pred, k_means.centers)

>>> metric
R2: 0.509203

Methods

get

Return the current value of the metric.

revert

Revert the metric.

Parameters

  • x
  • y_pred
  • centers
  • sample_weight – defaults to 1.0
update

Update the metric.

Parameters

  • x
  • y_pred
  • centers
  • sample_weight – defaults to 1.0
works_with

Indicates whether or not a metric can work with a given model.

Parameters

  • model (river.base.estimator.Estimator)

References


  1. Halkidi, M., Vazirgiannis, M., & Batistakis, Y. (2000). Quality Scheme Assessment in the Clustering Process. Principles Of Data Mining And Knowledge Discovery, 265-276. DOI: 10.1007/3-540-45372-5_26