# Squared

Squared loss, also known as the L2 loss.

Mathematically, it is defined as

$L = (p_i - y_i) ^ 2$

Its gradient w.r.t. $p_i$ is

$\frac{\partial L}{\partial p_i} = 2 (p_i - y_i)$
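As a quick sanity check, the two formulas above can be written in plain Python and the analytic gradient compared against a central finite difference. This is only a minimal sketch: the function names `squared_loss`, `squared_loss_gradient` and `finite_difference_gradient` are hypothetical helpers, not part of the library, and the `(y_true, y_pred)` argument order is an assumption inferred from the example values on this page.

```python
def squared_loss(y_true, y_pred):
    """L = (p - y)^2 for a single observation (hypothetical helper)."""
    return (y_pred - y_true) ** 2


def squared_loss_gradient(y_true, y_pred):
    """dL/dp = 2 * (p - y), the analytic gradient w.r.t. the prediction."""
    return 2 * (y_pred - y_true)


def finite_difference_gradient(y_true, y_pred, eps=1e-6):
    """Central finite-difference approximation of dL/dp."""
    upper = squared_loss(y_true, y_pred + eps)
    lower = squared_loss(y_true, y_pred - eps)
    return (upper - lower) / (2 * eps)


analytic = squared_loss_gradient(-4, 5)
numeric = finite_difference_gradient(-4, 5)

# The two values should agree up to floating-point error.
print(analytic, numeric)
```

The finite-difference value only approximates the derivative, so the comparison should use a tolerance rather than exact equality.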

Note that this convention is consistent with Vowpal Wabbit and PyTorch, but not with scikit-learn, which divides the loss by 2 so that the factor of 2 disappears from the gradient.
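The difference between the two conventions can be illustrated with a short plain-Python sketch. The helpers below are hypothetical and written only for this comparison; they are not the code of any of the libraries mentioned, and the `(y_true, y_pred)` argument order is an assumption.

```python
def squared_gradient(y_true, y_pred):
    """Gradient of (p - y)^2, the convention described on this page."""
    return 2 * (y_pred - y_true)


def half_squared_loss(y_true, y_pred):
    """Loss divided by 2, i.e. 0.5 * (p - y)^2."""
    return 0.5 * (y_pred - y_true) ** 2


def half_squared_gradient(y_true, y_pred):
    """Gradient of 0.5 * (p - y)^2: the factor of 2 cancels out."""
    return y_pred - y_true


y_true, y_pred = -4, 5
full = squared_gradient(y_true, y_pred)
half = half_squared_gradient(y_true, y_pred)

# The gradients differ by exactly a factor of 2.
print(full, half)
```

In practice this means that, under these assumptions, the same learning rate produces steps twice as large with the unhalved loss, which matters when comparing hyperparameters across libraries.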

## Examples

from river import optim

loss = optim.losses.Squared()
loss(-4, 5)

81

loss.gradient(-4, 5)

18

loss.gradient(5, -4)

-18


