Log loss is the average negative log of the probability assigned to the correct class label.

Log loss is used for classification, and it operates directly on probability scores rather than hard class predictions.

X     Y     Ŷ (Pi)
X1    1     0.9
X2    1     0.6
X3    0     0.1
X4    0     0.4

For a test set of n points, log loss is defined as

log-loss = -(1/n) * Σ [ Yi*log(Pi) + (1-Yi)*log(1-Pi) ]

where the sum runs over i = 1 … n, Yi is the true label of Xi, and Pi is the predicted probability that Xi belongs to the positive class. (The worked examples below use log base 10.)

Let’s understand this,

When Yi=1, Pi=0.9,

substituting these values into the formula above (for a single point), we get

log-loss = -(1*log(0.9) + (1-1)*log(1-0.9))

log-loss = -log(0.9)

log-loss = 0.0458

Now when Yi=1, Pi=0.6

log-loss = -(1*log(0.6) + (1-1)*log(1-0.6))

log-loss = -log(0.6)

log-loss = 0.22

Now let’s do the same for the negative class. When Yi=0, Pi=0.1,

log-loss = -(0*log(0.1) + (1-0)*log(1-0.1))

log-loss = -log(0.9)

log-loss = 0.0458

Similarly, for Yi=0, Pi=0.4,

log-loss = -log(0.6)

log-loss = 0.22
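The four per-point computations above can be sketched in Python. Note that this tutorial's numbers use log base 10, so the sketch does too; standard implementations of log loss typically use the natural log instead.

```python
import math

def point_log_loss(y, p):
    """Per-point log loss, using log base 10 to match the numbers above."""
    return -(y * math.log10(p) + (1 - y) * math.log10(1 - p))

# the four (Yi, Pi) pairs from the table
points = [(1, 0.9), (1, 0.6), (0, 0.1), (0, 0.4)]

for y, p in points:
    # prints 0.0458, 0.2218, 0.0458, 0.2218 in turn
    print(f"Yi={y}, Pi={p}: log-loss = {point_log_loss(y, p):.4f}")
```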

Now we can see that for the positive class, a high Pi gives a small log loss, and the loss grows as Pi drops: log loss penalizes even small deviations in the probability score.

Similarly, for the negative class, small predicted probabilities give a lower log loss.

We want our log loss to be as small as possible.

Log loss ranges from 0 to infinity, where 0 is the best possible score.
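A quick sketch of that range, again with base-10 logs to match the examples above: as the predicted probability of the true class approaches 1 the loss approaches 0, and as it approaches 0 the loss grows without bound.

```python
import math

# loss for a positive example (Yi = 1) at various confidence levels, log base 10
for p in [0.999, 0.9, 0.5, 0.1, 0.001]:
    print(f"Pi={p}: log-loss = {-math.log10(p):.4f}")
```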

For c classes, the multi-class log loss is defined as

log-loss = -(1/n) * Σi Σj Yij*log(Pij)

where the outer sum runs over the n points, the inner sum runs over the c classes,

Yij = 1 if Xi belongs to class j,

Yij = 0 otherwise,

and Pij is the predicted probability that Xi belongs to class j.
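The multi-class definition can be sketched under the same conventions (one-hot labels Yij, per-class probabilities Pij, log base 10 to stay consistent with the binary examples above). The function name and the two sample points below are illustrative, not from this tutorial.

```python
import math

def multiclass_log_loss(Y, P):
    """Average multi-class log loss over n points.
    Y: one-hot labels, Y[i][j] = 1 if point i belongs to class j.
    P: predicted probabilities, P[i][j] = probability that point i is in class j.
    Uses log base 10 to match the binary examples above."""
    n = len(Y)
    total = 0.0
    for yi, pi in zip(Y, P):
        # only the true class term survives, since Yij = 0 elsewhere
        total += -sum(y * math.log10(p) for y, p in zip(yi, pi) if y == 1)
    return total / n

# hypothetical 3-class example with two points
Y = [[1, 0, 0], [0, 1, 0]]
P = [[0.9, 0.05, 0.05], [0.2, 0.7, 0.1]]
print(multiclass_log_loss(Y, P))
```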