EpsilonGreedyRegressor¶
Model selection based on the \(\epsilon\)-greedy bandit strategy.
Performs model selection with an \(\epsilon\)-greedy bandit strategy. A model is selected at each learning step: the current best model is picked with probability \(1 - \epsilon\), and a model is picked uniformly at random the rest of the time.
Selection bias is a common problem when using bandits for online model selection. This bias can be mitigated with a burn-in phase: every model is updated during the first burn_in steps.
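The selection rule can be sketched in plain Python. This is a minimal illustration of the \(\epsilon\)-greedy idea, not the library's internals; scores are assumed to be "lower is better", as with MAE:

```python
import random

def select_index(scores, epsilon, rng):
    """Epsilon-greedy choice over model performance scores (lower is better)."""
    if rng.random() < epsilon:
        return rng.randrange(len(scores))  # explore: pick any model
    return min(range(len(scores)), key=scores.__getitem__)  # exploit: pick the best

rng = random.Random(1)
scores = [1.4, 13.1, 15.9, 16.5]  # e.g. a running MAE per model
picks = [select_index(scores, epsilon=0.1, rng=rng) for _ in range(1000)]
```

With epsilon=0.1, the best model (index 0) is chosen directly 90% of the time, plus a quarter of the remaining 10% by chance, so it accumulates the overwhelming majority of pulls.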
Parameters¶
- models
  The models to choose from.
- metric – defaults to None
  The metric used to compare models with each other. Defaults to metrics.MAE.
- epsilon – defaults to 0.1
  The fraction of time exploration is performed rather than exploitation.
- decay – defaults to 0.0
  Exponential factor at which epsilon decays.
- burn_in – defaults to 100
  The number of initial steps during which each model is updated.
- seed (int) – defaults to None
  Random number generator seed for reproducibility.
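Assuming the decay is applied as \(\epsilon \, e^{-n \cdot \text{decay}}\) after \(n\) learning steps, which is consistent with the "exponential factor" wording but is an assumption rather than a documented internal, the effective exploration rate over time can be tracked like so:

```python
import math

def effective_epsilon(epsilon: float, decay: float, n: int) -> float:
    # Hypothetical schedule: the exploration probability shrinks
    # exponentially with the number of learning steps n seen so far.
    return epsilon * math.exp(-n * decay)

# With epsilon=0.1 and decay=0.001, exploration drops from 10% at step 0
# to roughly 3.7% after 1000 steps.
rates = [effective_epsilon(0.1, 0.001, n) for n in (0, 1000, 5000)]
```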
Attributes¶
- best_model
  The current best model.
- burn_in
- decay
- epsilon
- models
- seed
Examples¶
>>> from river import datasets
>>> from river import evaluate
>>> from river import linear_model
>>> from river import metrics
>>> from river import model_selection
>>> from river import optim
>>> from river import preprocessing
>>> models = [
... linear_model.LinearRegression(optimizer=optim.SGD(lr=lr))
... for lr in [0.0001, 0.001, 1e-05, 0.01]
... ]
>>> dataset = datasets.TrumpApproval()
>>> model = (
... preprocessing.StandardScaler() |
... model_selection.EpsilonGreedyRegressor(
... models,
... epsilon=0.1,
... decay=0.001,
... burn_in=100,
... seed=1
... )
... )
>>> metric = metrics.MAE()
>>> evaluate.progressive_val_score(dataset, model, metric)
MAE: 1.363516
>>> model['EpsilonGreedyRegressor'].bandit
Ranking MAE Pulls Share
#2 15.850129 111 8.53%
#1 13.060601 117 8.99%
#3 16.519079 109 8.38%
#0 1.387839 964 74.10%
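The Share column is each model's fraction of the total number of pulls. It can be reproduced from the Pulls column of the table above:

```python
# Pulls per model, taken from the ranking table above.
pulls = {"#2": 111, "#1": 117, "#3": 109, "#0": 964}
total = sum(pulls.values())  # 1301 pulls in total
# Share = pulls / total pulls, expressed as a percentage.
shares = {name: round(100 * n / total, 2) for name, n in pulls.items()}
# shares == {'#2': 8.53, '#1': 8.99, '#3': 8.38, '#0': 74.1}
```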
>>> model['EpsilonGreedyRegressor'].best_model
LinearRegression (
optimizer=SGD (
lr=Constant (
learning_rate=0.01
)
)
loss=Squared ()
l2=0.
l1=0.
intercept_init=0.
intercept_lr=Constant (
learning_rate=0.01
)
clip_gradient=1e+12
initializer=Zeros ()
)
Methods¶
append
S.append(value) -- append value to the end of the sequence
Parameters
- item
clear
S.clear() -> None -- remove all items from S
copy
count
S.count(value) -> integer -- return number of occurrences of value
Parameters
- item
extend
S.extend(iterable) -- extend sequence by appending elements from the iterable
Parameters
- other
index
S.index(value, [start, [stop]]) -> integer -- return first index of value. Raises ValueError if the value is not present.
Supporting start and stop arguments is optional, but recommended.
Parameters
- item
- args
insert
S.insert(index, value) -- insert value before index
Parameters
- i
- item
learn_one
Fits to a set of features x and a real-valued target y.
Parameters
- x (dict)
- y (numbers.Number)
Returns
Regressor: self
pop
S.pop([index]) -> item -- remove and return item at index (default last). Raise IndexError if list is empty or index is out of range.
Parameters
- i – defaults to
-1
predict_one
Predict the output of features x.
Parameters
- x
Returns
The prediction.
remove
S.remove(value) -- remove first occurrence of value. Raise ValueError if the value is not present.
Parameters
- item
reverse
S.reverse() -- reverse IN PLACE