Squared loss, also known as the L2 loss.
Mathematically, it is defined as

\[ L = (p_i - y_i)^2 \]

Its gradient w.r.t. \(p_i\) is

\[ \frac{\partial L}{\partial p_i} = 2 (p_i - y_i) \]
One thing to note is that this convention is consistent with Vowpal Wabbit and PyTorch, but not with scikit-learn. Indeed, scikit-learn divides the loss by 2, making the 2 disappear in the gradient.
```python
from river import optim

loss = optim.losses.Squared()
loss(-4, 5)
```
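To illustrate the convention mentioned above, here is a small sketch relying on the `gradient` method documented below. With \(y_i = -4\) and \(p_i = 5\), the loss is \((5 - (-4))^2 = 81\) and the gradient keeps the factor of 2, giving \(2 \times (5 - (-4)) = 18\).

```python
from river import optim

loss = optim.losses.Squared()

# Loss: (p_i - y_i) ** 2 = (5 - (-4)) ** 2 = 81
loss(-4, 5)

# Gradient w.r.t. p_i keeps the factor of 2: 2 * (5 - (-4)) = 18
# (scikit-learn's convention would give 9 instead)
loss.gradient(-4, 5)
```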
Returns the loss.
Returns the gradient with respect to y_pred.
The mean function is the inverse of the link function. Typically, a loss function takes the raw output of a model as input; in the case of classification, the raw output would be logits. The mean function converts the raw output into a value that makes sense to the user, such as a probability.
Returns the adjusted prediction(s).
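As a rough illustration, the sketch below assumes the mean function is exposed as a `mean_func` method. For the squared loss the link is the identity, so the prediction is expected to come back unchanged, whereas a classification loss such as `optim.losses.Log` is expected to map a logit to a probability.

```python
from river import optim

# For the squared loss the link function is the identity,
# so the mean function should return the raw prediction unchanged.
optim.losses.Squared().mean_func(3.5)  # expected: 3.5

# For a classification loss such as the log loss, the mean function
# is expected to map a raw logit to a probability (sigmoid(0) = 0.5).
optim.losses.Log().mean_func(0.0)  # expected: 0.5
```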