BaggingClassifier

Online bootstrap aggregation for classification.

For each incoming observation, each model's learn_one method is called k times, where k is sampled from a Poisson distribution with parameter 1. k thus has roughly a 37% chance of being equal to 0, a 37% chance of being equal to 1, an 18% chance of being equal to 2, a 6% chance of being equal to 3, a 1.5% chance of being equal to 4, and so on. You can use scipy.stats.poisson(1).pmf(k) to obtain more precise values.
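
The probabilities quoted above can be reproduced without any third-party dependency. The following is a small sketch that evaluates the Poisson probability mass function directly from its formula; `poisson_pmf` is a helper name introduced here for illustration and is not part of river.

```python
import math


def poisson_pmf(k: int, lam: float = 1.0) -> float:
    """Probability that a Poisson(lam) random variable equals k."""
    return math.exp(-lam) * lam**k / math.factorial(k)


# Probabilities for k = 0..4 with parameter 1, as used by online bagging.
pmf = {k: poisson_pmf(k) for k in range(5)}

for k, p in pmf.items():
    print(f"P(k={k}) = {p:.4f}")
# P(k=0) = 0.3679
# P(k=1) = 0.3679
# P(k=2) = 0.1839
# P(k=3) = 0.0613
# P(k=4) = 0.0153
```

Since P(k=0) is about 0.37, each model skips roughly 37% of the observations. This matches the share of samples left out of a classical bootstrap replicate on a large dataset.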

Parameters

  • model

    Type: base.Classifier

    The classifier to bag.

  • n_models

    Default: 10

    The number of models in the ensemble.

  • seed

    Type: int | None

    Default: None

    Random number generator seed for reproducibility.

Attributes

  • models

Examples

In the following example three logistic regressions are bagged together. The performance is slightly better than when using a single logistic regression.

from river import datasets
from river import ensemble
from river import evaluate
from river import linear_model
from river import metrics
from river import optim
from river import preprocessing

dataset = datasets.Phishing()

model = ensemble.BaggingClassifier(
    model=(
        preprocessing.StandardScaler() |
        linear_model.LogisticRegression()
    ),
    n_models=3,
    seed=42
)

metric = metrics.F1()

evaluate.progressive_val_score(dataset, model, metric)
# F1: 87.65%

print(model)
# BaggingClassifier(StandardScaler | LogisticRegression)

Methods

learn_one
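
This page documents the update step only through the class summary: for every observation, each model's learn_one is called k times, with k drawn from a Poisson distribution of parameter 1. A minimal standard-library sketch of that loop might look as follows. The names `sample_poisson`, `CountingModel`, and `bagging_learn_one` are hypothetical and introduced purely for illustration; this is not river's actual implementation.

```python
import math
import random


def sample_poisson(rng: random.Random, lam: float = 1.0) -> int:
    """Draw one value from Poisson(lam) using Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, product = 0, rng.random()
    while product > threshold:
        k += 1
        product *= rng.random()
    return k


class CountingModel:
    """Hypothetical stand-in for a classifier: it only counts its updates."""

    def __init__(self) -> None:
        self.n_updates = 0

    def learn_one(self, x: dict, y) -> None:
        self.n_updates += 1


def bagging_learn_one(models, x: dict, y, rng: random.Random) -> None:
    """Update each model k times, with k sampled independently per model."""
    for model in models:
        for _ in range(sample_poisson(rng)):
            model.learn_one(x, y)


rng = random.Random(42)  # seeded for reproducibility
models = [CountingModel() for _ in range(3)]

# Feed 1000 identical toy observations through the ensemble.
for _ in range(1000):
    bagging_learn_one(models, {"feature": 1.0}, True, rng)

print([m.n_updates for m in models])
```

Because the Poisson distribution has mean 1, each model ends up with about as many updates as there were observations, but weighted differently across observations. That difference is what keeps the ensemble members diverse.
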
predict_one

Predict the label of a set of features x.

Parameters

  • x (dict)
  • kwargs

Returns

base.typing.ClfTarget | None: The predicted label.

predict_proba_one

Averages the predictions of each classifier.

Parameters

  • x
  • kwargs
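
To make the averaging described for predict_proba_one concrete, here is a small sketch that averages per-class probabilities across ensemble members. It assumes each member returns a dict mapping labels to probabilities. The function name `average_probas` and the sample values are made up for illustration, and river's own implementation may normalise the result differently.

```python
from collections import defaultdict


def average_probas(predictions: list) -> dict:
    """Average a list of {label: probability} dicts, one per ensemble member."""
    totals = defaultdict(float)
    for proba in predictions:
        for label, p in proba.items():
            totals[label] += p
    n = len(predictions)
    return {label: total / n for label, total in totals.items()}


# Illustrative outputs from three hypothetical ensemble members.
member_outputs = [
    {True: 0.9, False: 0.1},
    {True: 0.6, False: 0.4},
    {True: 0.75, False: 0.25},
]

averaged = average_probas(member_outputs)

# The predicted label is the one with the highest averaged probability.
predicted_label = max(averaged, key=averaged.get)
print(averaged, predicted_label)
```

Averaging probabilities rather than taking a majority vote lets a confident member outweigh several uncertain ones.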