SRPRegressor¶
Streaming Random Patches ensemble regressor.
The Streaming Random Patches [1] ensemble method for regression trains each base learner on a subset of features and instances from the original data, namely a random patch. This strategy to enforce diverse base models is similar to the one in random forests, yet it is not restricted to using decision trees as the base learner.
This method is an adaptation of [2] for regression.
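A random patch pairs a random subset of the features with a resampled subset of the instances. The feature-subset half can be sketched with plain Python over river-style `dict` instances (the helper names below are illustrative, not part of river):

```python
import random


def random_subspace(features, k, rng):
    # Pick k feature names uniformly at random, without replacement.
    return set(rng.sample(sorted(features), k))


def project(x, subspace):
    # Keep only the features belonging to this learner's subspace.
    return {name: val for name, val in x.items() if name in subspace}


rng = random.Random(42)
x = {'a': 1.0, 'b': 2.0, 'c': 3.0, 'd': 4.0}
sub = random_subspace(x.keys(), 2, rng)
print(project(x, sub))
```

Each ensemble member keeps its own subspace and only ever sees the projected view of incoming instances.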
Parameters¶
- `model` (base.Regressor) – defaults to `None`
  The base estimator.
- `n_models` (int) – defaults to `10`
  Number of members in the ensemble.
- `subspace_size` (Union[int, float, str]) – defaults to `0.6`
  Number of features per subset for each base model, where `M` is the total number of features. A negative value means `M - subspace_size`. Only applies when using random subspaces or random patches.
  * If `int`, indicates the number of features to use. Valid range: [2, M].
  * If `float`, indicates the percentage of features to use. Valid range: (0., 1.].
  * `'sqrt'` - `sqrt(M) + 1`
  * `'rmsqrt'` - Residual from `M - (sqrt(M) + 1)`
- `training_method` (str) – defaults to `'patches'`
  The training method to use.
  * `'subspaces'` - Random subspaces.
  * `'resampling'` - Resampling.
  * `'patches'` - Random patches.
- `lam` (float) – defaults to `6.0`
  Lambda value for bagging.
- `drift_detector` (base.DriftDetector) – defaults to `None`
  Drift detector.
- `warning_detector` (base.DriftDetector) – defaults to `None`
  Warning detector.
- `disable_detector` (str) – defaults to `'off'`
  Option to disable drift detectors:
  * If `'off'`, detectors are enabled.
  * If `'drift'`, disables concept drift detection and the background learner.
  * If `'warning'`, disables the background learner; ensemble members are reset if drift is detected.
- `disable_weighted_vote` (bool) – defaults to `True`
  If True, disables weighted voting.
- `drift_detection_criteria` (str) – defaults to `'error'`
  The criteria used to track drifts.
  * `'error'` - Absolute error.
  * `'prediction'` - Predicted target values.
- `aggregation_method` (str) – defaults to `'mean'`
  The method to use to aggregate predictions in the ensemble.
  * `'mean'`
  * `'median'`
- `seed` – defaults to `None`
  Random number generator seed for reproducibility.
- `metric` (river.metrics.base.RegressionMetric) – defaults to `None`
  Metric to track members' performance within the ensemble.
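The `subspace_size` options above can be illustrated with a small stand-alone helper. This is a sketch of the documented mapping, not river's internal API; in particular, the handling of negative values assumes "use all but `|subspace_size|` features":

```python
import math


def resolve_subspace_size(subspace_size, n_features):
    """Illustrative: map a `subspace_size` setting to a feature count k."""
    M = n_features
    if subspace_size == 'sqrt':
        return round(math.sqrt(M)) + 1
    if subspace_size == 'rmsqrt':
        # Residual: the features that sqrt(M) + 1 leaves out.
        return M - (round(math.sqrt(M)) + 1)
    if isinstance(subspace_size, float):
        # Percentage of features, valid range (0., 1.].
        return round(subspace_size * M)
    if subspace_size < 0:
        # Assumed reading of "M - subspace_size" for negative ints.
        return M + subspace_size
    return subspace_size


for setting in (0.6, 'sqrt', 'rmsqrt', -2, 4):
    print(setting, '->', resolve_subspace_size(setting, 10))
```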
Examples¶
>>> from river import synth
>>> from river import ensemble
>>> from river import tree
>>> from river import evaluate
>>> from river import metrics
>>> from river import neighbors
>>> dataset = synth.FriedmanDrift(
... drift_type='gsg',
... position=(350, 750),
... transition_window=200,
... seed=42
... ).take(1000)
>>> base_model = neighbors.KNNRegressor()
>>> model = ensemble.SRPRegressor(
... model=base_model,
... n_models=3,
... seed=42
... )
>>> metric = metrics.R2()
>>> evaluate.progressive_val_score(dataset, model, metric)
R2: 0.525003
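Under the `'resampling'` and `'patches'` training methods, each incoming instance is shown to each base learner k times, with k drawn from a Poisson distribution parameterised by `lam` (online bagging). A minimal sketch of that weighting using Knuth's Poisson sampler (the helper is illustrative, not river's API):

```python
import math
import random


def poisson(lam, rng):
    # Knuth's algorithm: multiply uniform draws until the running
    # product drops below exp(-lam); the number of draws before
    # that happens is the sampled k.
    L = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= L:
            return k
        k += 1


rng = random.Random(42)
# Each member would call learn_one(x, y) k times for this instance;
# k == 0 means the member skips the instance entirely.
weights = [poisson(6.0, rng) for _ in range(10)]
print(weights)
```

With `lam=6.0` (the default), each member sees each instance six times on average, which emphasises recent data more aggressively than classic online bagging's Poisson(1).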
Methods¶
clone
Return a fresh estimator with the same parameters.
The clone has the same parameters but has not been updated with any data. This works by looking at the parameters from the class signature. Each parameter is either recursively cloned if it is a River class, or deep-copied via copy.deepcopy otherwise. If the calling object is stochastic (i.e. it accepts a seed parameter) and has not been seeded, then the clone will not be idempotent. Indeed, this method's purpose is simply to return a new instance with the same input parameters.
learn_one
Fits to a set of features x and a real-valued target y.
Parameters
- x (dict)
- y (numbers.Number)
- kwargs
Returns
self
predict_one
Predicts the target value of a set of features x.
Parameters
- x
Returns
The prediction.
reset
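At prediction time the ensemble combines its members' outputs according to `aggregation_method`. A stand-alone illustration of the two documented options (the `aggregate` helper is hypothetical, not part of river):

```python
import statistics


def aggregate(predictions, method='mean'):
    # 'mean' averages member predictions; 'median' is more robust
    # to a single badly drifted member.
    if method == 'mean':
        return statistics.fmean(predictions)
    return statistics.median(predictions)


member_preds = [2.0, 2.5, 9.0]  # e.g. one member has drifted
print(aggregate(member_preds, 'mean'))
print(aggregate(member_preds, 'median'))
```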
Notes¶
This implementation uses n_models=10 as default given the impact on
processing time. The optimal number of models depends on the data and
resources available.
References¶
- Heitor Gomes, Jacob Montiel, Saulo Martiello Mastelini, Bernhard Pfahringer, and Albert Bifet. On Ensemble Techniques for Data Stream Regression. IJCNN'20. International Joint Conference on Neural Networks, 2020. ↩
- Heitor Murilo Gomes, Jesse Read, and Albert Bifet. Streaming Random Patches for Evolving Data Stream Classification. IEEE International Conference on Data Mining (ICDM), 2019. ↩