Metrics question

A binary image classifier is trained with binary cross-entropy (BCE) loss. On validation, accuracy rises but BCE loss also rises. Can this happen, and what are plausible causes?

Short answer

Yes. Accuracy only checks which side of the threshold each prediction falls on, while BCE also penalizes confidence. A few confidently wrong examples can raise the mean loss even as more examples cross the threshold to the correct side.

Full explanation

This can happen because accuracy and BCE measure different things. Accuracy is threshold-based: after choosing a threshold such as 0.5, it only counts whether each prediction is on the correct side. BCE uses the predicted probability, so confidence matters.

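Suppose several validation examples move from a predicted probability of 0.49 to 0.51 for the correct class. Accuracy improves, while the BCE on those examples barely changes. At the same time, a small number of mislabeled, shifted or hard examples can receive very confident wrong probabilities, such as 0.999 for the wrong class. With the natural logarithm, one such example costs about 6.9, roughly ten times the loss of a prediction at 0.5, so a handful of them can dominate the average loss.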
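
The scenario above can be checked with a few lines of arithmetic. The following is a minimal sketch, not a training setup: the five labels and probabilities are made-up illustrative values, and `bce` and `accuracy` are hypothetical helper names written here for the example.

```python
import math

def bce(labels, probs):
    """Mean binary cross-entropy; probs are predicted P(y = 1)."""
    total = 0.0
    for y, p in zip(labels, probs):
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(labels)

def accuracy(labels, probs, threshold=0.5):
    """Fraction of examples whose thresholded prediction matches the label."""
    correct = sum((p >= threshold) == bool(y) for y, p in zip(labels, probs))
    return correct / len(labels)

# Illustrative data (assumed): four positives and one negative.
labels = [1, 1, 1, 1, 0]

# Earlier checkpoint: three positives sit just below the threshold,
# and the negative is mildly wrong.
probs_before = [0.49, 0.49, 0.49, 0.90, 0.60]

# Later checkpoint: the three positives cross to 0.51,
# but the negative is now confidently wrong.
probs_after = [0.51, 0.51, 0.51, 0.90, 0.999]

for name, probs in [("before", probs_before), ("after", probs_after)]:
    print(f"{name}: accuracy={accuracy(labels, probs):.2f}, "
          f"bce={bce(labels, probs):.3f}")
# Accuracy goes from 0.20 to 0.80 while mean BCE goes from
# roughly 0.63 to roughly 1.81.
```

Almost all of the increase in loss comes from the single example at 0.999, which is the kind of tiny counterexample that works well when spoken aloud in an interview.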

Plausible causes include label noise, domain shift between the training and validation data, miscalibration in the form of overconfidence, or a distribution slice where the model becomes confidently wrong. A strong answer proposes inspecting the confusion matrix, per-slice loss, calibration curves and mislabeled examples.
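
Two of these checks, ranking examples by their individual loss and comparing mean loss per slice, can be sketched as follows. This is an illustration under stated assumptions: the records, the slice names `"daylight"` and `"night"`, and the helper `example_loss` are all hypothetical, and a real pipeline would read labels, probabilities and slice tags from its own validation set.

```python
import math
from collections import defaultdict

def example_loss(y, p, eps=1e-12):
    """Per-example binary cross-entropy, with p clipped away from 0 and 1."""
    p = min(max(p, eps), 1 - eps)
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

# Hypothetical validation records: (label, predicted P(y = 1), slice name).
records = [
    (1, 0.51, "daylight"),
    (1, 0.51, "daylight"),
    (0, 0.20, "daylight"),
    (1, 0.90, "night"),
    (0, 0.999, "night"),
]

losses = [example_loss(y, p) for y, p, _ in records]
total_loss = sum(losses)

# Rank examples by loss: the top entries are candidates for
# mislabeled or shifted inputs worth inspecting by hand.
ranked = sorted(enumerate(losses), key=lambda item: item[1], reverse=True)
top_index, top_loss = ranked[0]
print(f"worst example: index {top_index}, "
      f"share of total loss {top_loss / total_loss:.0%}")

# Mean loss per slice: a slice with much higher loss than the rest
# points at a subpopulation where the model is confidently wrong.
by_slice = defaultdict(list)
for (_, _, slice_name), loss in zip(records, losses):
    by_slice[slice_name].append(loss)
slice_loss = {name: sum(vals) / len(vals) for name, vals in by_slice.items()}
for name, value in slice_loss.items():
    print(f"{name}: mean loss {value:.3f}")
```

In this toy data a single example in the `"night"` slice accounts for most of the total loss, which the aggregate metric alone would hide.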

Theory

Classification metrics can move in different directions when one metric thresholds scores and another uses probability magnitude.
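
Written out, the two metrics make the difference explicit. The notation is an assumption introduced here: $N$ validation examples, labels $y_i \in \{0, 1\}$, predicted probabilities $p_i$ for class 1, and a decision threshold $t$ (commonly 0.5).

```latex
\mathrm{BCE} = -\frac{1}{N} \sum_{i=1}^{N}
  \bigl[\, y_i \log p_i + (1 - y_i) \log (1 - p_i) \,\bigr]

\mathrm{Acc} = \frac{1}{N} \sum_{i=1}^{N}
  \mathbf{1}\bigl[\, \mathbf{1}[p_i \ge t] = y_i \,\bigr]
```

Each example contributes at most $1/N$ to accuracy, whereas its BCE term grows without bound as $p_i$ approaches the wrong extreme, so the two averages are free to move in opposite directions.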

Common mistakes

  • Assuming that validation loss and accuracy must always move in the same direction.
  • Ignoring confidence and calibration.
  • Looking only at the aggregate metric instead of per-slice failures.

How to answer in an interview

  • Build a tiny counterexample with one confidently wrong sample.
  • Name both label noise and domain shift as practical explanations.