WB¶
WB Index
The WB index is a simple sum-of-squares measure, computed by dividing the within-cluster sum of squares by the between-cluster sum of squares. Its effect is emphasized by multiplying this ratio by the number of clusters. The advantages of the method are that the number of clusters can be determined by minimizing the WB value, without relying on any knee-point detection, and that the metric is straightforward to implement.
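The description above can be written as a formula. The notation is chosen here for illustration and is not taken from this page: $k$ is the number of clusters, $c_j$ and $n_j$ are the center and size of cluster $j$, $c_{p_i}$ is the center of the cluster that point $x_i$ is assigned to, and $\bar{x}$ is the mean of all $N$ points.

$$
\mathrm{WB} = k \cdot \frac{\mathrm{SSW}}{\mathrm{SSB}},
\qquad
\mathrm{SSW} = \sum_{i=1}^{N} \lVert x_i - c_{p_i} \rVert^2,
\qquad
\mathrm{SSB} = \sum_{j=1}^{k} n_j \, \lVert c_j - \bar{x} \rVert^2
$$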
The lower the WB index, the higher the clustering quality is.
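As a rough illustration of the definition, the sketch below computes the index in batch mode over a small hand-made data set. The function name `wb_index` and its arguments are hypothetical and are not part of river; the library's metric is updated incrementally, so its values on a stream need not match a batch computation like this one.

```python
def wb_index(points, labels, centers):
    """Batch WB index: k * SSW / SSB (lower is better). Illustrative helper, not river's API."""
    n_points = len(points)
    n_dims = len(points[0])
    k = len(centers)

    # Global mean of all points, one coordinate at a time.
    global_mean = [sum(p[d] for p in points) / n_points for d in range(n_dims)]

    # SSW: squared distance from each point to the center of its assigned cluster.
    ssw = sum(
        sum((p[d] - centers[label][d]) ** 2 for d in range(n_dims))
        for p, label in zip(points, labels)
    )

    # SSB: squared distance from each center to the global mean, weighted by cluster size.
    sizes = [labels.count(j) for j in range(k)]
    ssb = sum(
        sizes[j] * sum((centers[j][d] - global_mean[d]) ** 2 for d in range(n_dims))
        for j in range(k)
    )

    return k * ssw / ssb


# Two tight, well-separated clusters (made-up data for illustration).
points = [[0, 0], [0, 2], [10, 0], [10, 2]]
labels = [0, 0, 1, 1]
centers = [[0, 1], [10, 1]]
wb = wb_index(points, labels, centers)
```

Here SSW is 4 (each point lies at distance 1 from its center) and SSB is 100 (each center lies at distance 5 from the global mean and holds two points), so the index is 2 × 4 / 100 = 0.08. Moving the centers closer together or spreading the points out raises the value.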
Attributes¶
- bigger_is_better
  Indicates whether a higher value is better than a lower one.
Examples¶
>>> from river import cluster
>>> from river import stream
>>> from river import metrics
>>> X = [
...     [1, 2],
...     [1, 4],
...     [1, 0],
...     [4, 2],
...     [4, 4],
...     [4, 0],
...     [-2, 2],
...     [-2, 4],
...     [-2, 0]
... ]
>>> k_means = cluster.KMeans(n_clusters=3, halflife=0.4, sigma=3, seed=0)
>>> metric = metrics.cluster.WB()
>>> for x, _ in stream.iter_array(X):
...     k_means = k_means.learn_one(x)
...     y_pred = k_means.predict_one(x)
...     metric = metric.update(x, y_pred, k_means.centers)
>>> metric
WB: 1.300077
Methods¶
get
Return the current value of the metric.
revert
Revert the metric.
Parameters
- x
- y_pred
- centers
- sample_weight – defaults to 1.0
update
Update the metric.
Parameters
- x
- y_pred
- centers
- sample_weight – defaults to 1.0
works_with
Indicates whether or not a metric can work with a given model.
Parameters
- model (river.base.estimator.Estimator)
References¶
- Q. Zhao, M. Xu, and P. Franti, "Sum-of-squares based cluster validity index and significance analysis," in Adaptive and Natural Computing Algorithms, M. Kolehmainen, P. Toivanen, and B. Beliczynski, Eds. Berlin, Germany: Springer, 2009, pp. 313–322.