SRPRegressor¶
Streaming Random Patches ensemble regressor.
The Streaming Random Patches [1] ensemble method for regression trains each base learner on a subset of features and instances from the original data, namely a random patch. This strategy to enforce diversity among the base models is similar to the one used in random forests, yet it is not restricted to decision trees as the base learner.
This method is an adaptation of [2] for regression.
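As a minimal sketch of that flexibility, the ensemble could be built around a linear model instead of a tree; the base learner and values below are illustrative assumptions, not defaults:

>>> from river import ensemble
>>> from river import linear_model
>>> # Illustrative only: any river regressor can serve as the base learner,
>>> # here a plain linear regression instead of a Hoeffding tree.
>>> model = ensemble.SRPRegressor(
...     model=linear_model.LinearRegression(),
...     n_models=5,
...     seed=1
... )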
Parameters¶
- model (base.Regressor) – defaults to `None`

  The base estimator.

- n_models (int) – defaults to `10`

  Number of members in the ensemble.

- subspace_size (Union[int, float, str]) – defaults to `0.6`

  Number of features per subset for each base learner, where M is the total number of features. A negative value means M - subspace_size. Only applies when using random subspaces or random patches.

  * If `int`, indicates the number of features to use. Valid range [2, M].
  * If `float`, indicates the percentage of features to use. Valid range (0., 1.].
  * 'sqrt' - `sqrt(M) + 1`
  * 'rmsqrt' - Residual from `M - (sqrt(M) + 1)`

- training_method (str) – defaults to `patches`

  The training method to use.

  * 'subspaces' - Random subspaces.
  * 'resampling' - Resampling.
  * 'patches' - Random patches.

- lam (int) – defaults to `6`

  Lambda value for bagging.

- drift_detector (base.DriftDetector) – defaults to `None`

  Drift detector.

- warning_detector (base.DriftDetector) – defaults to `None`

  Warning detector.

- disable_detector (str) – defaults to `off`

  Option to disable drift detectors:

  * If `'off'`, detectors are enabled.
  * If `'drift'`, disables concept drift detection and the background learner.
  * If `'warning'`, disables the background learner and ensemble members are reset if drift is detected.

- disable_weighted_vote (bool) – defaults to `True`

  If True, disables weighted voting.

- drift_detection_criteria (str) – defaults to `error`

  The criteria used to track drifts.

  * 'error' - absolute error.
  * 'prediction' - predicted target values.

- aggregation_method (str) – defaults to `mean`

  The method used to aggregate predictions in the ensemble.

  * 'mean'
  * 'median'

- seed – defaults to `None`

  Random number generator seed for reproducibility.

- metric (Optional[river.metrics.base.RegressionMetric]) – defaults to `None`

  The metric used to track members' performance within the ensemble.
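A minimal configuration sketch combining several of the parameters above follows; the detector choices and threshold values are illustrative assumptions, not library defaults:

>>> from river import drift
>>> from river import ensemble
>>> from river import tree
>>> # Illustrative settings only: explicit detectors, a 60% feature patch,
>>> # and median aggregation.
>>> model = ensemble.SRPRegressor(
...     model=tree.HoeffdingTreeRegressor(grace_period=50),
...     n_models=10,
...     subspace_size=0.6,
...     training_method="patches",
...     drift_detector=drift.ADWIN(delta=1e-5),
...     warning_detector=drift.ADWIN(delta=1e-4),
...     drift_detection_criteria="error",
...     aggregation_method="median",
...     seed=42
... )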
Attributes¶
- models
Examples¶
>>> from river import ensemble
>>> from river import evaluate
>>> from river import metrics
>>> from river.datasets import synth
>>> from river import tree
>>> dataset = synth.FriedmanDrift(
... drift_type='gsg',
... position=(350, 750),
... transition_window=200,
... seed=42
... ).take(1000)
>>> base_model = tree.HoeffdingTreeRegressor(grace_period=50)
>>> model = ensemble.SRPRegressor(
... model=base_model,
... training_method="patches",
... n_models=3,
... seed=42
... )
>>> metric = metrics.R2()
>>> evaluate.progressive_val_score(dataset, model, metric)
R2: 0.571117
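The `progressive_val_score` call above interleaves predictions and updates. A hand-written equivalent of that test-then-train loop might look as follows, assuming a fresh `dataset` stream and an untrained `model` built as above:

>>> metric = metrics.R2()
>>> for x, y in dataset:
...     y_pred = model.predict_one(x)  # predict before the label is revealed
...     metric.update(y, y_pred)
...     model.learn_one(x, y)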
Methods¶
append
S.append(value) -- append value to the end of the sequence
Parameters
- item
clear
S.clear() -> None -- remove all items from S
copy
count
S.count(value) -> integer -- return number of occurrences of value
Parameters
- item
extend
S.extend(iterable) -- extend sequence by appending elements from the iterable
Parameters
- other
index
S.index(value, [start, [stop]]) -> integer -- return first index of value. Raises ValueError if the value is not present.
Supporting start and stop arguments is optional, but recommended.
Parameters
- item
- args
insert
S.insert(index, value) -- insert value before index
Parameters
- i
- item
learn_one
Learn from a set of features x and a real-valued target y.
Parameters
- x
- y
- kwargs
pop
S.pop([index]) -> item -- remove and return item at index (default last). Raise IndexError if list is empty or index is out of range.
Parameters
- i – defaults to -1
predict_one
Predict the output of features x.
Parameters
- x
- kwargs
Returns
The prediction.
remove
S.remove(value) -- remove first occurrence of value. Raise ValueError if the value is not present.
Parameters
- item
reset
reverse
S.reverse() -- reverse IN PLACE
sort
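Most of the methods listed above come from Python's mutable sequence interface, since the ensemble is itself a sequence of its member models. A small illustrative sketch, assuming `model` is the trained SRPRegressor from the example above:

>>> # Sequence-style access to the ensemble members (illustrative only).
>>> n_members = len(model)   # number of ensemble members
>>> first_member = model[0]  # the first (wrapped) base learner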
Notes¶
This implementation uses `n_models=10` as the default given the impact on processing time. The optimal number of models depends on the data and resources available.
References¶
1. Heitor Gomes, Jacob Montiel, Saulo Martiello Mastelini, Bernhard Pfahringer, and Albert Bifet. On Ensemble Techniques for Data Stream Regression. IJCNN'20. International Joint Conference on Neural Networks. 2020.
2. Heitor Murilo Gomes, Jesse Read, Albert Bifet. Streaming Random Patches for Evolving Data Stream Classification. IEEE International Conference on Data Mining (ICDM), 2019.