
EpsilonGreedyRegressor

Model selection based on the \(\epsilon\)-greedy bandit strategy.

Performs model selection with an \(\epsilon\)-greedy bandit strategy. One model is selected at each learning step: the best model so far with probability \(1 - \epsilon\), and a randomly chosen model otherwise.

Selection bias is a common problem when using bandits for online model selection. This bias can be mitigated by using a burn-in phase. Each model is given the chance to learn during the first burn_in steps.
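The selection rule described above can be sketched in a few lines of plain Python. This is a minimal illustration, not river's implementation: the helper select_arms and all of its arguments are hypothetical names, and it assumes that a lower running error is better, as with MAE.

```python
import random


def select_arms(step, n_arms, avg_errors, epsilon, burn_in, rng):
    """Return the indices of the models to update at this step.

    Hypothetical helper for illustration; not part of river's API.
    """
    if step < burn_in:
        # Burn-in: every model learns, so none is starved early on.
        return list(range(n_arms))
    if rng.random() < epsilon:
        # Explore: pick one model uniformly at random.
        return [rng.randrange(n_arms)]
    # Exploit: pick the model with the lowest running error so far.
    best = min(range(n_arms), key=lambda i: avg_errors[i])
    return [best]


rng = random.Random(1)
errors = [1.0, 0.5, 2.0]
print(select_arms(0, 3, errors, 0.1, 100, rng))    # burn-in: all models
print(select_arms(100, 3, errors, 0.0, 100, rng))  # epsilon=0: always the best
```

With epsilon set to 0 the rule is purely greedy after the burn-in, which is the setting where selection bias is most likely to lock in an early leader.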

Parameters

  • models

    The models to choose from.

  • metric – defaults to None

    The metric used to compare models with each other. If None, metrics.MAE is used.

  • epsilon – defaults to 0.1

    The fraction of time exploration is performed rather than exploitation.

  • decay – defaults to 0.0

    Exponential factor at which epsilon decays.

  • burn_in – defaults to 100

    The number of initial steps during which each model is updated.

  • seed (int) – defaults to None

    Random number generator seed for reproducibility.
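As a rough illustration of how epsilon and decay might interact, the sketch below assumes that the exploration rate after n steps is epsilon * exp(-decay * n). That formula is an assumption read from the phrase "exponential factor"; the function name effective_epsilon is hypothetical, and the library's exact schedule may differ.

```python
import math


def effective_epsilon(epsilon, decay, n_steps):
    """Exploration rate after n_steps under an assumed exponential decay."""
    return epsilon * math.exp(-decay * n_steps)


# With decay=0.0 the exploration rate stays constant.
print(effective_epsilon(0.1, 0.0, 1000))  # 0.1

# With decay=0.001 exploration shrinks as more steps are seen.
for n in (0, 100, 1000):
    print(n, round(effective_epsilon(0.1, 0.001, n), 4))
# 0 0.1
# 100 0.0905
# 1000 0.0368
```

Under this assumption, a larger decay shifts the strategy toward exploitation sooner, while the default of 0.0 keeps the exploration rate fixed at epsilon.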

Attributes

  • best_model

    The current best model.

  • burn_in

  • decay

  • epsilon

  • models

  • seed

Examples

>>> from river import datasets
>>> from river import evaluate
>>> from river import linear_model
>>> from river import metrics
>>> from river import model_selection
>>> from river import optim
>>> from river import preprocessing

>>> models = [
...     linear_model.LinearRegression(optimizer=optim.SGD(lr=lr))
...     for lr in [0.0001, 0.001, 1e-05, 0.01]
... ]

>>> dataset = datasets.TrumpApproval()
>>> model = (
...     preprocessing.StandardScaler() |
...     model_selection.EpsilonGreedyRegressor(
...         models,
...         epsilon=0.1,
...         decay=0.001,
...         burn_in=100,
...         seed=1
...     )
... )
>>> metric = metrics.MAE()

>>> evaluate.progressive_val_score(dataset, model, metric)
MAE: 1.363516

>>> model['EpsilonGreedyRegressor'].bandit
Ranking   MAE         Pulls   Share
     #2   15.850129     111    8.53%
     #1   13.060601     117    8.99%
     #3   16.519079     109    8.38%
     #0    1.387839     964   74.10%

>>> model['EpsilonGreedyRegressor'].best_model
LinearRegression (
  optimizer=SGD (
    lr=Constant (
      learning_rate=0.01
    )
  )
  loss=Squared ()
  l2=0.
  l1=0.
  intercept_init=0.
  intercept_lr=Constant (
    learning_rate=0.01
  )
  clip_gradient=1e+12
  initializer=Zeros ()
)
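The Share column in the bandit table above is consistent with each model's pull count divided by the total number of pulls. The following check reproduces it from the Pulls column; the variable names are illustrative only.

```python
# Pull counts copied from the bandit table, in row order.
pulls = [111, 117, 109, 964]

total = sum(pulls)
shares = [100 * n / total for n in pulls]

print(total)  # 1301
for n, share in zip(pulls, shares):
    print(f"{n:>5} {share:6.2f}%")
#   111   8.53%
#   117   8.99%
#   109   8.38%
#   964  74.10%
```

The model with the lowest MAE receives roughly three quarters of the pulls, which is what a mostly exploiting strategy with a small epsilon would be expected to produce.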

Methods

append

S.append(value) -- append value to the end of the sequence

Parameters

  • item

clear

S.clear() -> None -- remove all items from S

copy

count

S.count(value) -> integer -- return number of occurrences of value

Parameters

  • item

extend

S.extend(iterable) -- extend sequence by appending elements from the iterable

Parameters

  • other

index

S.index(value, [start, [stop]]) -> integer -- return first index of value. Raises ValueError if the value is not present.

Supporting start and stop arguments is optional, but recommended.

Parameters

  • item
  • args

insert

S.insert(index, value) -- insert value before index

Parameters

  • i
  • item

learn_one

Fits to a set of features x and a real-valued target y.

Parameters

  • x (dict)
  • y (numbers.Number)

Returns

Regressor: self

pop

S.pop([index]) -> item -- remove and return item at index (default last). Raise IndexError if list is empty or index is out of range.

Parameters

  • i – defaults to -1

predict_one

Predict the output of features x.

Parameters

  • x

Returns

The prediction.

remove

S.remove(value) -- remove first occurrence of value. Raise ValueError if the value is not present.

Parameters

  • item

reverse

S.reverse() -- reverse IN PLACE

sort

References