Squared loss, also known as the L2 loss.
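Mathematically, it is defined as

\[L = (p_i - y_i)^2\]

where \(p_i\) is the prediction and \(y_i\) is the true value. Its gradient with respect to \(p_i\) is

\[\frac{\partial L}{\partial p_i} = 2 (p_i - y_i)\]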
Note that this convention is consistent with Vowpal Wabbit and PyTorch, but not with scikit-learn. Indeed, scikit-learn divides the loss by 2, so the factor of 2 disappears from the gradient.
>>> from river import optim
>>> loss = optim.losses.Squared()
>>> loss(-4, 5)
81
>>> loss.gradient(-4, 5)
18
>>> loss.gradient(5, -4)
-18
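The numbers in the example above can be checked by hand. The following is a minimal plain-Python sketch, not River's implementation: the function names are hypothetical, and the argument order `(y_true, y_pred)` is assumed from the example. It also shows the halved convention, where the factor of 2 cancels, and compares the analytic gradient against a finite-difference estimate.

```python
def squared_loss(y_true, y_pred):
    """Squared loss (p - y)^2, without the 1/2 factor."""
    return (y_pred - y_true) ** 2


def squared_loss_gradient(y_true, y_pred):
    """Derivative of (p - y)^2 with respect to the prediction p."""
    return 2 * (y_pred - y_true)


def halved_squared_loss_gradient(y_true, y_pred):
    """Derivative of 0.5 * (p - y)^2: the factor of 2 cancels."""
    return y_pred - y_true


y_true, y_pred = -4, 5

print(squared_loss(y_true, y_pred))                  # 81
print(squared_loss_gradient(y_true, y_pred))         # 18
print(halved_squared_loss_gradient(y_true, y_pred))  # 9

# Central finite difference as an independent check of the gradient.
eps = 1e-6
numeric_gradient = (
    squared_loss(y_true, y_pred + eps) - squared_loss(y_true, y_pred - eps)
) / (2 * eps)
print(round(numeric_gradient, 4))                    # 18.0
```

Swapping the two arguments flips the sign of the gradient, which matches the `-18` in the example.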
Returns the loss.
Return a fresh estimator with the same parameters.
The clone has the same parameters but has not been updated with any data. This works by looking at the parameters from the class signature. Each parameter is either

- recursively cloned if it's a River class.
- deep-copied via copy.deepcopy if not.

If the calling object is stochastic (i.e. it accepts a seed parameter) and has not been seeded, then the clone will not be idempotent. Indeed, this method's purpose is simply to return a new instance with the same input parameters.
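The cloning rule described above can be sketched in a few lines. This is a simplified illustration rather than River's actual code: `Base`, `Scaler`, and `Model` are hypothetical classes, and the sketch assumes every constructor parameter is stored as an attribute of the same name.

```python
import copy
import inspect


class Base:
    """Hypothetical stand-in for a River base class."""

    def clone(self):
        params = {}
        # Read the parameter names from the class signature.
        for name in inspect.signature(self.__class__).parameters:
            value = getattr(self, name)
            if isinstance(value, Base):
                params[name] = value.clone()         # recursive clone
            else:
                params[name] = copy.deepcopy(value)  # plain deep copy
        return self.__class__(**params)


class Scaler(Base):
    def __init__(self, factor=1.0):
        self.factor = factor


class Model(Base):
    def __init__(self, scaler, tags=None):
        self.scaler = scaler
        self.tags = tags if tags is not None else []
        self.n_seen = 0  # learned state, not a constructor parameter

    def learn(self, x):
        self.n_seen += 1
        return self


model = Model(scaler=Scaler(factor=2.0), tags=["demo"])
model.learn(1.0).learn(2.0)

fresh = model.clone()
print(fresh.n_seen)                # 0
print(fresh.scaler.factor)         # 2.0
print(fresh.tags is model.tags)    # False
```

Only what the constructor receives is carried over, so the learned state (`n_seen` here) starts from scratch in the clone.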
Return the gradient with respect to y_pred.
This is the inverse of the link function. Typically, a loss function takes as input the raw output of a model. In the case of classification, the raw output would be logits. The mean function can be used to convert the raw output into a value that makes sense to the user, such as a probability.
The adjusted prediction(s).
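To make the idea of a mean function concrete, here is a small plain-Python sketch. The function names are hypothetical and do not come from River. It assumes the identity link usually paired with a squared loss, and contrasts it with the sigmoid, which is the inverse of the logit link used for binary classification.

```python
import math


def identity_mean(y_pred):
    """Identity link: the raw output is already on the target scale."""
    return y_pred


def sigmoid_mean(y_pred):
    """Inverse of the logit link: maps a raw logit to a probability."""
    # Branch on the sign so math.exp never receives a large positive value.
    if y_pred >= 0:
        return 1.0 / (1.0 + math.exp(-y_pred))
    z = math.exp(y_pred)
    return z / (1.0 + z)


print(identity_mean(2.5))  # 2.5
print(sigmoid_mean(0.0))   # 0.5
```

For a regression loss the raw output can be reported directly, whereas a classifier's logit has to pass through the sigmoid before it can be read as a probability.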