ROC-AUC: construction and interpretation
Answer it yourself
First formulate your answer as you would in an interview, then open the walkthrough and grade yourself.
Short answer
Sweep the classification threshold, plot TPR against FPR, and take the area under that curve. ROC-AUC is the probability that a random positive gets a higher score than a random negative.
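Full walkthrough
For every threshold on the model score, compute the true positive rate, TPR = TP / (TP + FN), and the false positive rate, FPR = FP / (FP + TN). Plot FPR on the x-axis and TPR on the y-axis. The curve runs from (0, 0), where nothing is predicted positive, to (1, 1), where everything is. The area under this curve is ROC-AUC.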
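The threshold sweep can be sketched in a few lines of plain Python. This is a minimal illustration, not a reference implementation: the function names `roc_points` and `trapezoid_auc` and the six-example toy dataset are made up for this sketch, and it assumes binary labels coded as 0/1 with at least one example of each class.

```python
def roc_points(labels, scores):
    """Return (FPR, TPR) points from sweeping the threshold over each distinct score."""
    pos = sum(labels)              # number of actual positives
    neg = len(labels) - pos        # number of actual negatives
    # Lowering the threshold admits examples in order of descending score.
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]          # threshold above every score: nothing predicted positive
    tp = fp = 0
    i = 0
    while i < len(ranked):
        current = ranked[i][0]
        # Examples tied at the same score cross the threshold together.
        while i < len(ranked) and ranked[i][0] == current:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / neg, tp / pos))
    return points


def trapezoid_auc(points):
    """Area under a piecewise-linear curve given as (x, y) points sorted by x."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Hypothetical toy data: 3 positives, 3 negatives.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

points = roc_points(labels, scores)
auc = trapezoid_auc(points)
print(points)
print(auc)  # 8/9, about 0.889
```

On this toy data only one positive (score 0.6) sits below a negative (score 0.7), which is why the area falls one ninth short of 1.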
The ranking interpretation is usually the cleanest: ROC-AUC equals the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative example. Tied pairs are conventionally counted as one half, which matches the area computed with the trapezoidal rule; check how your implementation handles ties.
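The pairwise interpretation can be checked directly by counting over all positive-negative pairs. The sketch below is illustrative only: `pairwise_auc` is a hypothetical helper name, the data is a made-up toy set, and the O(P·N) double loop is chosen for clarity rather than speed. It assumes the half-credit convention for ties.

```python
def pairwise_auc(labels, scores):
    """Fraction of (positive, negative) pairs where the positive scores higher.

    Ties contribute 0.5 (an assumed convention, matching the trapezoidal area).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 3 positives, 3 negatives, so 9 pairs in total.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

auc = pairwise_auc(labels, scores)
print(auc)  # 8 of 9 pairs are ordered correctly: 8/9

# A model that gives every example the same score ranks at chance level.
tied = pairwise_auc([1, 0], [0.5, 0.5])
print(tied)  # 0.5
```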
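ROC-AUC is threshold-independent and useful for comparing ranking quality, but it can be misleading under heavy class imbalance or when the business cares about a narrow high-precision operating region. FPR is normalized by the number of negatives, so when negatives vastly outnumber positives, a large absolute number of false positives still yields a small FPR and a good-looking curve. In those cases also inspect PR-AUC, precision/recall at a target threshold, and calibration.
Theory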
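A short arithmetic sketch shows why a single operating point can look strong on the ROC plane and weak on precision. The confusion-matrix counts below are invented for illustration (100 positives among 100,100 examples), and `rates` is a hypothetical helper name.

```python
def rates(tp, fp, fn, tn):
    """Return (TPR, FPR, precision) for one confusion matrix."""
    tpr = tp / (tp + fn)          # recall: share of actual positives found
    fpr = fp / (fp + tn)          # share of actual negatives flagged
    precision = tp / (tp + fp)    # share of flagged examples that are positive
    return tpr, fpr, precision


# Made-up rare-event scenario: 100 positives, 100,000 negatives.
tpr, fpr, precision = rates(tp=80, fp=1000, fn=20, tn=99000)

print(f"TPR={tpr:.2f} FPR={fpr:.2f} precision={precision:.3f}")
```

With these counts the point (FPR 0.01, TPR 0.80) sits close to the top-left corner of the ROC plot, yet fewer than one in ten flagged examples is a true positive.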
ROC-AUC measures ranking quality over all thresholds, not calibrated probability quality at one threshold.
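One way to see the difference between ranking and calibration: any strictly increasing transform of the scores leaves ROC-AUC unchanged, while a calibration-sensitive metric such as the Brier score moves. The sketch below is an illustration under assumptions: the helper names, the toy data, and the choice of `p ** 4` as the transform are all made up for this example.

```python
def pairwise_auc(labels, scores):
    """ROC-AUC as the share of correctly ordered (positive, negative) pairs; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def brier_score(labels, probs):
    """Mean squared error between predicted probabilities and 0/1 outcomes."""
    return sum((p - y) ** 2 for y, p in zip(labels, probs)) / len(labels)


# Hypothetical toy data.
labels = [1, 1, 0, 1, 0, 0]
probs = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

# p ** 4 is strictly increasing on [0, 1]: the order of examples is preserved,
# but the probability values themselves are pushed toward zero.
squashed = [p ** 4 for p in probs]

auc_before = pairwise_auc(labels, probs)
auc_after = pairwise_auc(labels, squashed)
brier_before = brier_score(labels, probs)
brier_after = brier_score(labels, squashed)

print(auc_before, auc_after)      # identical: ranking did not change
print(brier_before, brier_after)  # different: probability values changed
```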
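Common mistakes
- Confusing ROC-AUC with accuracy.
- Forgetting that FPR is FP / (FP + TN), that is, false positives divided by all actual negatives, not by all predicted positives.
- Relying on ROC-AUC alone for rare-event decisions.
How to answer in an interview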
- State the pairwise probability interpretation.
- Mention PR-AUC for imbalanced problems.