Silhouette¶
Silhouette coefficient 1, roughly speaking, is the ratio between cohesion and the average distances from the points to their second-closest centroid. It rewards the clustering algorithm where points are very close to their assigned centroids and far from any other centroids, that is, clustering results with good cohesion and good separation.
It rewards clusterings where points are very close to their assigned centroids and far from any other centroids, that is clusterings with good cohesion and good separation. 2
The definition of Silhouette coefficient for online clustering evaluation is different from that of batch learning. It does not store information and calculate pairwise distances between all points at the same time, since the practice is too expensive for an incremental metric.
Attributes¶
-
bigger_is_better
Indicates if a high value is better than a low one or not.
Examples¶
>>> from river import cluster
>>> from river import stream
>>> from river import metrics
>>> X = [
... [1, 2],
... [1, 4],
... [1, 0],
... [4, 2],
... [4, 4],
... [4, 0],
... [-2, 2],
... [-2, 4],
... [-2, 0]
... ]
>>> k_means = cluster.KMeans(n_clusters=3, halflife=0.4, sigma=3, seed=0)
>>> metric = metrics.Silhouette()
>>> for x, _ in stream.iter_array(X):
... k_means = k_means.learn_one(x)
... y_pred = k_means.predict_one(x)
... metric = metric.update(x, y_pred, k_means.centers)
>>> metric
Silhouette: 0.32145
Methods¶
get
Return the current value of the metric.
revert
Revert the metric.
Parameters
- x
- y_pred
- centers
- sample_weight – defaults to
1.0
update
Update the metric.
Parameters
- x
- y_pred
- centers
- sample_weight – defaults to
1.0
works_with
Indicates whether or not a metric can work with a given model.
Parameters
- model (river.base.estimator.Estimator)