Newton¶
Online Newton Step (ONS) optimizer.
This optimizer uses second-order information (i.e. the Hessian of the cost function) in addition to first-order information (i.e. the gradient of the cost function). It maintains the matrix
where \(g_s\) is the gradient at step \(s\), and applies the update
The inverse \(A_t^{-1}\) is never computed from scratch. Instead it is updated in-place at each step with the Sherman-Morrison formula (see utils.math.sherman_morrison), which turns the rank-1 update \(A_t = A_{t-1} + g_t g_t^\intercal\) into a single rank-1 update of the inverse. This costs \(O(d^2)\) time and memory, where \(d\) is the number of features seen so far.
Parameters¶
-
lr (mutable)
Default →
0.1Learning rate. This corresponds to \(1 / \gamma\) in the paper.
-
eps
Default →
1e-05Regularization term used to initialize the Hessian to \(\epsilon I\). Its inverse \(1 / \epsilon\) is used as the starting value of \(A_t^{-1}\), so a smaller
epsallows larger steps early on.
Attributes¶
- learning_rate
Examples¶
from river import datasets
from river import evaluate
from river import linear_model
from river import metrics
from river import optim
from river import preprocessing
dataset = datasets.Phishing()
optimizer = optim.Newton()
model = (
preprocessing.StandardScaler() |
linear_model.LogisticRegression(optimizer)
)
metric = metrics.F1()
evaluate.progressive_val_score(dataset, model, metric)
F1: 73.62%
Methods¶
look_ahead
Updates a weight vector before a prediction is made.
Parameters: w (dict): A dictionary of weight parameters. The weights are modified in-place. Returns: The updated weights.
Parameters
- w —
dict
step
Updates a weight vector given a gradient.
Parameters
- w —
dict | VectorLike - g —
dict | VectorLike
Returns
dict | VectorLike: The updated weights.