BayesianLinearRegression
Bayesian linear regression.
An advantage of Bayesian linear regression over standard linear regression is that features do not have to be scaled beforehand. Another attractive property is that this flavor of linear regression is somewhat insensitive to its hyperparameters. Finally, this model can output a predictive distribution rather than just a point estimate.
The downside is that the learning step runs in O(n^2) time, where n is the number of features, whereas the learning step of standard linear regression takes O(n) time.
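The quadratic cost typically comes from maintaining an n-by-n matrix over the features. As a rough illustration, here is the textbook rank-one (Sherman-Morrison) covariance update for Bayesian linear regression, written in plain Python. The function name `rank_one_update` is hypothetical, and this sketch is not necessarily how river implements the learning step; it only shows why a single update touches every entry of an n-by-n matrix.

```python
def rank_one_update(cov, x, beta=1.0):
    """Return the covariance after observing the feature vector x.

    Textbook Sherman-Morrison identity (an assumption, not river's code):
    (A + beta * x x^T)^-1 = A^-1 - beta * (A^-1 x)(A^-1 x)^T / (1 + beta * x^T A^-1 x)
    where cov = A^-1 is symmetric.
    """
    n = len(x)
    # Matrix-vector product cov @ x: n rows of n multiplications, so O(n^2).
    cx = [sum(cov[i][j] * x[j] for j in range(n)) for i in range(n)]
    # Scalar denominator 1 + beta * x^T cov x.
    denom = 1.0 + beta * sum(x[i] * cx[i] for i in range(n))
    # Every one of the n^2 entries is corrected once, again O(n^2).
    return [
        [cov[i][j] - beta * cx[i] * cx[j] / denom for j in range(n)]
        for i in range(n)
    ]


# Start from an identity prior covariance and observe x = [1, 0].
prior = [[1.0, 0.0], [0.0, 1.0]]
posterior = rank_one_update(prior, [1.0, 0.0])
print(posterior)  # [[0.5, 0.0], [0.0, 1.0]]
```

Only the variance along the observed direction shrinks, which matches inverting `[[2, 0], [0, 1]]` by hand.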
Parameters
- alpha
  Default → 1
  Prior parameter.
- beta
  Default → 1
  Noise parameter.
- smoothing
  Type → float
  Default → None
  Smoothing allows the model to gradually "forget" the past, and focus on the more recent data. It thus enables the model to deal with concept drift. Due to the current implementation, activating smoothing may slow down the model.
Examples
from river import datasets
from river import evaluate
from river import linear_model
from river import metrics
dataset = datasets.TrumpApproval()
model = linear_model.BayesianLinearRegression()
metric = metrics.MAE()
evaluate.progressive_val_score(dataset, model, metric)
MAE: 0.586...
x, _ = next(iter(dataset))
model.predict_one(x)
43.852...
model.predict_one(x, with_dist=True)
𝒩(μ=43.85..., σ=1.00...)
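A predictive distribution makes it possible to report an uncertainty interval alongside the point estimate. The sketch below uses only the standard library and takes the rounded values printed above (μ ≈ 43.85, σ ≈ 1.00) as illustrative inputs. It does not call river, and the variable names are assumptions made for this example.

```python
from statistics import NormalDist

# Illustrative values, rounded from the predictive distribution printed above.
mu, sigma = 43.85, 1.0

dist = NormalDist(mu=mu, sigma=sigma)

# Central 95% interval of a Gaussian: the 2.5% and 97.5% quantiles.
lower = dist.inv_cdf(0.025)
upper = dist.inv_cdf(0.975)

print(f"95% interval: [{lower:.2f}, {upper:.2f}]")
```

A wider σ would widen the interval proportionally, which is the information a point estimate alone cannot convey.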
The smoothing parameter can be set to make the model robust to drift. The parameter is expected to be between 0 and 1. To illustrate, let's generate some simulated data with an abrupt concept drift right in the middle.
import itertools
import random
def random_data(coefs, n, seed=42):
    rng = random.Random(seed)
    for _ in range(n):
        x = {i: rng.random() for i, c in enumerate(coefs)}
        y = sum(c * xi for c, xi in zip(coefs, x.values()))
        yield x, y
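As a quick sanity check on the generator, the sketch below repeats its definition so that it runs on its own, then compares one observation from each side of the drift. Because both halves of the stream use the default seed, they yield the same feature values, and only the coefficients (and therefore the targets) change at the drift point.

```python
import itertools
import random


def random_data(coefs, n, seed=42):
    rng = random.Random(seed)
    for _ in range(n):
        x = {i: rng.random() for i, c in enumerate(coefs)}
        y = sum(c * xi for c, xi in zip(coefs, x.values()))
        yield x, y


stream = list(itertools.chain(
    random_data([0.1, 3], 100),
    random_data([10, -2], 100),
))

# First observation before the drift and first observation after it.
x_before, y_before = stream[0]
x_after, y_after = stream[100]

# Same seed, so the features are identical; only the target changes.
print(x_before == x_after)  # True
print(y_before, y_after)
```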
Here's how the model performs without any smoothing:
model = linear_model.BayesianLinearRegression()
dataset = itertools.chain(
random_data([0.1, 3], 100),
random_data([10, -2], 100)
)
metric = metrics.MAE()
evaluate.progressive_val_score(dataset, model, metric)
MAE: 1.284...
And here's how it performs with some smoothing:
model = linear_model.BayesianLinearRegression(smoothing=0.8)
dataset = itertools.chain(
random_data([0.1, 3], 100),
random_data([10, -2], 100)
)
metric = metrics.MAE()
evaluate.progressive_val_score(dataset, model, metric)
MAE: 0.159...
Smoothing allows the model to gradually "forget" the past, and focus on the more recent data.
Note how this works better than standard linear regression, even when using an aggressive learning rate.
from river import optim
model = linear_model.LinearRegression(optimizer=optim.SGD(0.5))
dataset = itertools.chain(
random_data([0.1, 3], 100),
random_data([10, -2], 100)
)
metric = metrics.MAE()
evaluate.progressive_val_score(dataset, model, metric)
MAE: 0.242...
Methods
learn_one
Fits to a set of features x and a real-valued target y.
Parameters
- x — 'dict'
- y — 'base.typing.RegTarget'
predict_many
predict_one
Predict the output of features x.
Parameters
- x — 'dict'
- with_dist — defaults to False
Returns
base.typing.RegTarget: The prediction.
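For context, `learn_one` and `predict_one` are usually called in a predict-then-learn loop, which is the idea behind progressive validation. The sketch below is a simplified illustration, not river's implementation of `evaluate.progressive_val_score`. `RunningMean` and `progressive_mae` are hypothetical names; the stand-in model exposes the same two methods so that the example runs without river installed.

```python
class RunningMean:
    """Hypothetical stand-in model: predicts the mean of the targets seen so far."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0

    def learn_one(self, x, y):
        # Incremental mean update; the features are ignored in this toy model.
        self.n += 1
        self.mean += (y - self.mean) / self.n

    def predict_one(self, x):
        return self.mean


def progressive_mae(stream, model):
    """Predict each target before learning from it, then average the absolute errors."""
    total, count = 0.0, 0
    for x, y in stream:
        y_pred = model.predict_one(x)  # the model has not seen y yet
        total += abs(y - y_pred)
        count += 1
        model.learn_one(x, y)  # the label is revealed only after predicting
    return total / count if count else 0.0


stream = [({"a": 1.0}, 2.0), ({"a": 2.0}, 4.0), ({"a": 3.0}, 6.0)]
score = progressive_mae(stream, RunningMean())
print(score)
```

Any model exposing `learn_one(x, y)` and `predict_one(x)` could be passed in place of the stand-in.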