SRPRegressor¶
Streaming Random Patches ensemble regressor.
The Streaming Random Patches [1] ensemble method for regression trains each base learner on a random patch: a subset of both the features and the instances of the original data. This strategy for enforcing diversity among base models is similar to the one used in random forests, but it is not restricted to decision trees as base learners.
This method is an adaptation of [2] for regression.
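The idea of a random patch can be sketched in plain Python. This is a minimal illustration, not river's implementation: the helper names (`poisson`, `make_patch_view`, `patch_instance`) are hypothetical, and drawing the instance weight from a Poisson distribution is an assumption borrowed from common online bagging practice.

```python
import math
import random


def poisson(lam, rng):
    """Draw from Poisson(lam) using Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def make_patch_view(feature_names, subspace_size, rng):
    """Pick the fixed feature subset that one ensemble member will see."""
    return set(rng.sample(sorted(feature_names), subspace_size))


def patch_instance(x, features, lam, rng):
    """Project x onto the member's features and draw its training weight."""
    x_subset = {name: value for name, value in x.items() if name in features}
    # Assumption: a weight of 0 means this member skips the instance.
    weight = poisson(lam, rng)
    return x_subset, weight


rng = random.Random(42)
x = {"a": 0.1, "b": 0.5, "c": 0.9, "d": 0.3, "e": 0.7}
features = make_patch_view(x.keys(), subspace_size=3, rng=rng)
x_subset, weight = patch_instance(x, features, lam=6, rng=rng)
```

Each member keeps its own feature subset and draws its own weight per instance, so the members see different views of the same stream.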
Parameters¶
-
model
Type → base.Regressor | None
Default → None
The base estimator.
-
n_models
Type → int
Default → 10
Number of members in the ensemble.
-
subspace_size
Type → int | float | str
Default → 0.6
Number of features per subset for each base learner, where M is the total number of features.
A negative value means M - subspace_size.
Only applies when using random subspaces or random patches.
* If int, indicates the number of features to use. Valid range [2, M].
* If float, indicates the proportion of features to use. Valid range (0., 1.].
* 'sqrt' - sqrt(M)+1
* 'rmsqrt' - Residual from M-(sqrt(M)+1)
-
training_method
Type → str
Default → patches
The training method to use.
* 'subspaces' - Random subspaces.
* 'resampling' - Resampling.
* 'patches' - Random patches.
-
lam
Type → int
Default → 6
Lambda value for bagging.
-
drift_detector
Type → base.DriftDetector | None
Default → None
Drift detector.
-
warning_detector
Type → base.DriftDetector | None
Default → None
Warning detector.
-
disable_detector
Type → str
Default → off
Option to disable drift detectors:
* If 'off', detectors are enabled.
* If 'drift', disables concept drift detection and the background learner.
* If 'warning', disables the background learner and ensemble members are reset if drift is detected.
-
disable_weighted_vote
Type → bool
Default → True
If True, disables weighted voting.
-
drift_detection_criteria
Type → str
Default → error
The criteria used to track drifts.
* 'error' - absolute error.
* 'prediction' - predicted target values.
-
aggregation_method
Type → str
Default → mean
The method to use to aggregate predictions in the ensemble.
* 'mean'
* 'median'
-
seed
Default → None
Random number generator seed for reproducibility.
-
metric
Type → RegressionMetric | None
Default → None
The metric used to track members' performance within the ensemble.
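The `subspace_size` options above can be read as a mapping from a setting to a feature count. The sketch below is one hedged interpretation rather than river's code: the function name `resolve_subspace_size` is hypothetical, and the rounding rule, the reading of a negative integer as "all but that many features", and the clamping to [2, M] are assumptions. It also assumes M is at least 2.

```python
import math


def resolve_subspace_size(subspace_size, n_features):
    """Map a subspace_size setting to a feature count (illustrative only)."""
    if subspace_size == "sqrt":
        k = round(math.sqrt(n_features)) + 1
    elif subspace_size == "rmsqrt":
        # Residual: whatever is left after taking sqrt(M) + 1 features.
        k = n_features - (round(math.sqrt(n_features)) + 1)
    elif isinstance(subspace_size, float):
        if not 0.0 < subspace_size <= 1.0:
            raise ValueError("float subspace_size must be in (0, 1]")
        # Assumption: round to the nearest whole number of features.
        k = round(n_features * subspace_size)
    elif isinstance(subspace_size, int):
        # Assumption: a negative int -k is read as "all but k features".
        k = n_features + subspace_size if subspace_size < 0 else subspace_size
    else:
        raise ValueError(f"unsupported subspace_size: {subspace_size!r}")
    # Assumption: clamp to the documented integer range [2, M].
    return max(2, min(k, n_features))


print(resolve_subspace_size(0.6, 10))       # 6
print(resolve_subspace_size("sqrt", 16))    # 5
print(resolve_subspace_size("rmsqrt", 16))  # 11
print(resolve_subspace_size(-2, 10))        # 8
```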
Attributes¶
- models
Examples¶
from river import ensemble
from river import evaluate
from river import metrics
from river.datasets import synth
from river import tree
dataset = synth.FriedmanDrift(
    drift_type='gsg',
    position=(350, 750),
    transition_window=200,
    seed=42
).take(1000)
base_model = tree.HoeffdingTreeRegressor(grace_period=50)
model = ensemble.SRPRegressor(
    model=base_model,
    training_method="patches",
    n_models=3,
    seed=42
)
metric = metrics.R2()
evaluate.progressive_val_score(dataset, model, metric)
R2: 0.571117
Methods¶
learn_one
predict_one
Predict the output of features x.
Parameters
- x
- kwargs
Returns
The prediction.
reset
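To illustrate how the `aggregation_method` parameter could affect the value returned by `predict_one`, here is a minimal sketch that combines member predictions by mean or median. The `aggregate` helper is hypothetical and ignores weighted voting; it is not taken from river's source.

```python
import statistics


def aggregate(predictions, method="mean"):
    """Combine the members' predictions into a single ensemble output."""
    if not predictions:
        raise ValueError("at least one prediction is required")
    if method == "mean":
        return statistics.fmean(predictions)
    if method == "median":
        return statistics.median(predictions)
    raise ValueError(f"unknown aggregation method: {method!r}")


# Three members roughly agree; one is far off.
member_predictions = [9.5, 10.0, 10.5, 30.0]
print(aggregate(member_predictions, "mean"))    # 15.0
print(aggregate(member_predictions, "median"))  # 10.25
```

In this toy case the median is less affected by the single outlying member than the mean is.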
Notes¶
This implementation defaults to n_models=10 to limit processing time.
The optimal number of models depends on the data and the available
resources.
[1] Heitor Gomes, Jacob Montiel, Saulo Martiello Mastelini, Bernhard Pfahringer, and Albert Bifet. On Ensemble Techniques for Data Stream Regression. International Joint Conference on Neural Networks (IJCNN), 2020.
[2] Heitor Murilo Gomes, Jesse Read, and Albert Bifet. Streaming Random Patches for Evolving Data Stream Classification. IEEE International Conference on Data Mining (ICDM), 2019.