VBeta
VBeta (or V-Measure) [1] is an external, entropy-based cluster evaluation measure. It provides a solution to several problems that affect previously defined cluster evaluation measures, including:
- Dependence on the clustering algorithm or dataset,
- The "problem of matching", where the clustering of only a portion of the data points is evaluated, and
- Accurate evaluation and combination of two desirable aspects of clustering, homogeneity and completeness.
Based on the calculations of homogeneity and completeness, a clustering solution's V-measure is the weighted harmonic mean of homogeneity $h$ and completeness $c$:

$$V_{\beta} = \frac{(1 + \beta) \times h \times c}{\beta \times h + c}$$
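The calculation described above can be sketched in plain Python. This is a minimal batch illustration, not River's incremental implementation: it assumes the formula from the cited paper, natural logarithms, and the convention that a zero-entropy labelling scores 1.0. The function names `entropy`, `conditional_entropy`, and `v_beta` are hypothetical.

```python
import math
from collections import Counter


def entropy(counts, total):
    """Shannon entropy (natural log) of a collection of counts."""
    return -sum((n / total) * math.log(n / total) for n in counts if n > 0)


def conditional_entropy(joint, given_counts, total):
    """H(X | Y) from joint counts keyed by (x, y) and marginal counts of Y."""
    return -sum(
        (n / total) * math.log(n / given_counts[y])
        for (_, y), n in joint.items()
        if n > 0
    )


def v_beta(y_true, y_pred, beta=1.0):
    """Return (homogeneity, completeness, V_beta) for two label sequences."""
    total = len(y_true)
    classes = Counter(y_true)
    clusters = Counter(y_pred)
    by_cluster = Counter(zip(y_true, y_pred))  # keyed by (class, cluster)
    by_class = Counter(zip(y_pred, y_true))    # keyed by (cluster, class)

    h_classes = entropy(classes.values(), total)
    h_clusters = entropy(clusters.values(), total)

    # Assumed convention: a labelling with zero entropy scores 1.0.
    homogeneity = 1.0
    if h_classes != 0:
        homogeneity = 1.0 - conditional_entropy(by_cluster, clusters, total) / h_classes
    completeness = 1.0
    if h_clusters != 0:
        completeness = 1.0 - conditional_entropy(by_class, classes, total) / h_clusters

    # V_beta = (1 + beta) * h * c / (beta * h + c)
    denominator = beta * homogeneity + completeness
    if denominator == 0:
        return homogeneity, completeness, 0.0
    return homogeneity, completeness, (1 + beta) * homogeneity * completeness / denominator


h, c, v = v_beta([1, 1, 2, 2, 3, 3], [1, 1, 1, 2, 2, 2])
print(h, c, v)
```

On these six labelled points the sketch gives a completeness of about 0.667 and a V-measure of about 0.516.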
Parameters
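- beta (float) – defaults to 1.0
  Weighting factor of the harmonic mean. Following Rosenberg and Hirschberg, a beta greater than 1 weights completeness more strongly, and a beta less than 1 weights homogeneity more strongly.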
- cm – defaults to None
  This parameter allows sharing the same confusion matrix between multiple metrics. Sharing a confusion matrix reduces storage and computation time.
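The idea behind sharing a confusion matrix can be illustrated without the library. In the sketch below, `SharedCounts`, `homogeneity`, and `completeness` are hypothetical stand-ins, not River's classes: the counts are updated once per observation, and every metric reads from the same store instead of keeping its own copy.

```python
import math
from collections import Counter


class SharedCounts:
    """Hypothetical stand-in for a confusion matrix shared by several metrics."""

    def __init__(self):
        self.joint = Counter()  # (true label, predicted label) -> weight
        self.rows = Counter()   # true label -> weight
        self.cols = Counter()   # predicted label -> weight
        self.total = 0.0

    def update(self, y_true, y_pred, w=1.0):
        self.joint[(y_true, y_pred)] += w
        self.rows[y_true] += w
        self.cols[y_pred] += w
        self.total += w


def _entropy(marginal, total):
    return -sum((n / total) * math.log(n / total) for n in marginal.values() if n > 0)


def homogeneity(counts):
    """1 - H(class | cluster) / H(class), read from the shared counts."""
    h_class = _entropy(counts.rows, counts.total)
    if h_class == 0:
        return 1.0
    cond = -sum(
        (n / counts.total) * math.log(n / counts.cols[pred])
        for (_, pred), n in counts.joint.items()
        if n > 0
    )
    return 1.0 - cond / h_class


def completeness(counts):
    """1 - H(cluster | class) / H(cluster), read from the shared counts."""
    h_cluster = _entropy(counts.cols, counts.total)
    if h_cluster == 0:
        return 1.0
    cond = -sum(
        (n / counts.total) * math.log(n / counts.rows[true])
        for (true, _), n in counts.joint.items()
        if n > 0
    )
    return 1.0 - cond / h_cluster


counts = SharedCounts()
for yt, yp in zip([1, 1, 2, 2, 3, 3], [1, 1, 1, 2, 2, 2]):
    counts.update(yt, yp)  # a single update serves every metric below

h, c = homogeneity(counts), completeness(counts)
v = 2 * h * c / (h + c)  # V-measure with beta = 1
print(h, c, v)
```

Because both scores are derived from one set of counts, adding a further metric costs no extra storage per observation, which is the saving the parameter description refers to.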
Attributes
- bigger_is_better
  Indicates whether a high value is better than a low one.
- requires_labels
  Indicates whether labels are required, rather than probabilities.
- works_with_weights
  Indicates whether the metric takes sample weights into account.
Examples
>>> from river import metrics
>>> y_true = [1, 1, 2, 2, 3, 3]
>>> y_pred = [1, 1, 1, 2, 2, 2]
>>> metric = metrics.VBeta(beta=1.0)
>>> for yt, yp in zip(y_true, y_pred):
...     print(metric.update(yt, yp).get())
1.0
1.0
0.0
0.3437110184854507
0.4580652856440158
0.5158037429793888
>>> metric
VBeta: 51.58%
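To see how beta shifts the final score of the example, the weighted harmonic mean can be evaluated directly. This is a sketch under assumptions: the formula is the one from the cited paper, `v_beta_from_parts` is a hypothetical helper, and the homogeneity and completeness values are rounded figures worked out by hand for the six points above.

```python
def v_beta_from_parts(h, c, beta=1.0):
    """Weighted harmonic mean of homogeneity h and completeness c."""
    denominator = beta * h + c
    if denominator == 0:
        return 0.0
    return (1 + beta) * h * c / denominator


# Assumed, rounded values for the example data: homogeneity and completeness
# computed by hand from the entropy definitions.
h, c = 0.4206, 0.6667

for beta in (0.5, 1.0, 2.0):
    print(f"beta={beta}: {v_beta_from_parts(h, c, beta):.4f}")
```

With beta = 1.0 the result is roughly 0.5158, in line with the last value printed above. Under this formula, a larger beta pulls the score toward completeness and a smaller beta pulls it toward homogeneity.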
Methods
get
Return the current value of the metric.
is_better_than
revert
Revert the metric.
Parameters
- y_true
- y_pred
- sample_weight – defaults to 1.0
update
Update the metric.
Parameters
- y_true
- y_pred
- sample_weight – defaults to 1.0
works_with
Indicates whether or not a metric can work with a given model.
Parameters
- model (river.base.estimator.Estimator)
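The relationship between update and revert can be illustrated with a small stand-alone sketch. It assumes that revert undoes the effect of an earlier update with the same arguments, which is what makes sliding-window evaluation possible. `PairCounts` is a hypothetical class, not part of River.

```python
from collections import Counter, deque


class PairCounts:
    """Hypothetical incremental store of (y_true, y_pred) weights."""

    def __init__(self):
        self.joint = Counter()

    def update(self, y_true, y_pred, sample_weight=1.0):
        # Add the observation to the running counts.
        self.joint[(y_true, y_pred)] += sample_weight
        return self

    def revert(self, y_true, y_pred, sample_weight=1.0):
        # Remove a previously added observation from the running counts.
        self.joint[(y_true, y_pred)] -= sample_weight
        if self.joint[(y_true, y_pred)] <= 0:
            del self.joint[(y_true, y_pred)]
        return self


y_true = [1, 1, 2, 2, 3, 3]
y_pred = [1, 1, 1, 2, 2, 2]

counts = PairCounts()
window = deque()
for yt, yp in zip(y_true, y_pred):
    counts.update(yt, yp)
    window.append((yt, yp))
    if len(window) > 3:
        # Forget the oldest observation so only the last three remain.
        counts.revert(*window.popleft())

print(dict(counts.joint))
```

After the loop, the counts describe only the three most recent pairs, so any score computed from them reflects the window rather than the full stream.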
References
1. Andrew Rosenberg and Julia Hirschberg (2007). V-Measure: A conditional entropy-based external cluster evaluation measure. Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, pp. 410–420, Prague, June 2007.