SRPRegressor
Streaming Random Patches ensemble regressor.
The Streaming Random Patches [1] ensemble method for regression trains each base learner on a random patch, that is, a subset of the features and instances of the original data. This strategy for enforcing diversity among base models is similar to the one used in random forests, but it is not restricted to decision trees as base learners.
This method is an adaptation of [2] for regression.
Parameters
-
model
Type → base.Regressor | None
Default →
None
The base estimator.
-
n_models
Type → int
Default →
10
Number of members in the ensemble.
-
subspace_size
Type → int | float | str
Default →
0.6
Number of features per subset for each base learner, where M is the total number of features.
A negative value means M - subspace_size.
Only applies when using random subspaces or random patches.
* If int, indicates the number of features to use. Valid range [2, M].
* If float, indicates the fraction of features to use. Valid range (0., 1.].
* 'sqrt' - sqrt(M)+1
* 'rmsqrt' - Residual from M-(sqrt(M)+1)
-
training_method
Type → str
Default →
patches
The training method to use.
* 'subspaces' - Random subspaces.
* 'resampling' - Resampling.
* 'patches' - Random patches.
-
lam
Type → int
Default →
6
Lambda value of the Poisson distribution used for online bagging (instance resampling).
-
drift_detector
Type → base.DriftDetector | None
Default →
None
Drift detector.
-
warning_detector
Type → base.DriftDetector | None
Default →
None
Warning detector.
-
disable_detector
Type → str
Default →
off
Option to disable drift detectors:
* If 'off', detectors are enabled.
* If 'drift', disables concept drift detection and the background learner.
* If 'warning', disables the background learner; ensemble members are reset if drift is detected.
-
disable_weighted_vote
Type → bool
Default →
True
If True, disables weighted voting.
-
drift_detection_criteria
Type → str
Default →
error
The criterion used to track drifts.
* 'error' - absolute error.
* 'prediction' - predicted target values.
-
aggregation_method
Type → str
Default →
mean
The method to use to aggregate predictions in the ensemble.
* 'mean'
* 'median'
-
seed
Default →
None
Random number generator seed for reproducibility.
-
metric
Type → RegressionMetric | None
Default →
None
The metric used to track members' performance within the ensemble.
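The way subspace_size, training_method and lam combine into a random patch can be illustrated with a small standalone sketch. It is not the library's code: the helper names (resolve_subspace_size, pick_features, poisson) are hypothetical, the rounding is an assumption, and a negative value is interpreted here as "M minus its absolute value". With 'patches', a learner sees both a feature subset and Poisson-weighted instances; 'subspaces' would use only the feature subset, and 'resampling' only the weights.

```python
import math
import random


def resolve_subspace_size(subspace_size, n_features):
    """Hypothetical helper: map a subspace_size setting to a feature count."""
    m = n_features
    if subspace_size == "sqrt":
        k = round(math.sqrt(m)) + 1
    elif subspace_size == "rmsqrt":
        k = m - (round(math.sqrt(m)) + 1)
    elif isinstance(subspace_size, float):
        k = round(m * subspace_size)  # fraction of the M features
    else:
        k = int(subspace_size)  # absolute number of features
    if k < 0:
        k = m + k  # assumption: a negative value counts back from M
    return min(k, m)


def pick_features(names, k, rng):
    """Hypothetical helper: draw the feature subset one learner will use."""
    return sorted(rng.sample(names, k))


def poisson(lam, rng):
    """Draw a Poisson(lam) integer with Knuth's multiplication method."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


rng = random.Random(42)
x = {f"x{i}": float(i) for i in range(10)}  # one instance with M = 10 features

k = resolve_subspace_size(0.6, n_features=len(x))
features = pick_features(list(x), k, rng)    # drawn once per base learner
weight = poisson(6, rng)                     # drawn again for every instance
patch = {name: x[name] for name in features}
print(k, features, weight)
```

A weight of 0 means the learner skips the instance, while larger weights make it count several times. That is how online bagging approximates sampling with replacement on a stream.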
Attributes
- models
Examples
from river import ensemble
from river import evaluate
from river import metrics
from river.datasets import synth
from river import tree
dataset = synth.FriedmanDrift(
    drift_type='gsg',
    position=(350, 750),
    transition_window=200,
    seed=42
).take(1000)
base_model = tree.HoeffdingTreeRegressor(grace_period=50)
model = ensemble.SRPRegressor(
    model=base_model,
    training_method="patches",
    n_models=3,
    seed=42
)
metric = metrics.R2()
evaluate.progressive_val_score(dataset, model, metric)
R2: 0.571117
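The example above relies on progressive validation, where each instance is first used for prediction and only then for learning. The loop below is a minimal library-free sketch of that idea. RunningMean and prequential_mae are hypothetical names, and mean absolute error stands in for R2 to keep the arithmetic short.

```python
class RunningMean:
    """Toy stand-in for a regressor exposing learn_one / predict_one."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0

    def predict_one(self, x):
        # Ignores the features and predicts the mean of the targets seen so far.
        return self.mean

    def learn_one(self, x, y):
        self.n += 1
        self.mean += (y - self.mean) / self.n


def prequential_mae(stream, model):
    """Test-then-train loop returning the mean absolute error."""
    total, n = 0.0, 0
    for x, y in stream:
        y_pred = model.predict_one(x)  # predict before seeing the target
        total += abs(y - y_pred)
        n += 1
        model.learn_one(x, y)          # then update the model
    return total / n


stream = [({"a": i}, 2.0) for i in range(5)]
mae = prequential_mae(stream, RunningMean())
print(mae)
```

Because every prediction is made before the model has seen the corresponding target, the score reflects performance on unseen data without a separate hold-out set.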
Methods
learn_one
predict_one
Predict the output of features x.
Parameters
- x
- kwargs
Returns
The prediction.
reset
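As a rough illustration of how predict_one could combine member outputs under the two aggregation_method options, the sketch below aggregates a list of per-member predictions without weights. The aggregate function and the sample values are hypothetical and do not reproduce the library's internals.

```python
from statistics import mean, median


def aggregate(predictions, aggregation_method="mean"):
    """Hypothetical helper: combine the predictions of the ensemble members."""
    if aggregation_method == "mean":
        return mean(predictions)
    if aggregation_method == "median":
        return median(predictions)
    raise ValueError(f"Unknown aggregation method: {aggregation_method}")


member_predictions = [1.0, 1.2, 9.0]  # the third member is far off
mean_pred = aggregate(member_predictions, "mean")
median_pred = aggregate(member_predictions, "median")
print(mean_pred, median_pred)
```

The median is less sensitive to a single member that is far off, for instance one that has just been reset after a drift, whereas the mean uses every member's output.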
Notes
This implementation uses n_models=10 as default given the impact on processing time. The optimal number of models depends on the data and resources available.
[1] Heitor Gomes, Jacob Montiel, Saulo Martiello Mastelini, Bernhard Pfahringer, and Albert Bifet. On Ensemble Techniques for Data Stream Regression. International Joint Conference on Neural Networks (IJCNN), 2020.
[2] Heitor Murilo Gomes, Jesse Read, and Albert Bifet. Streaming Random Patches for Evolving Data Stream Classification. IEEE International Conference on Data Mining (ICDM), 2019.