ROC-AUC, ranking interpretation, and binarized predictions
Short answer
ROC-AUC measures ranking quality; equivalently, it is the area under the curve of TPR against FPR as the decision threshold varies. PR-AUC is often more informative when positives are rare. With binary predictions, the ROC curve has only one interior operating point.
Full breakdown
ROC-AUC can be explained in two ways. As a curve: sweep the decision threshold over the model scores, plot TPR against FPR at each threshold, and take the area under the resulting curve. As a ranking metric: ROC-AUC is the probability that a randomly chosen positive receives a higher score than a randomly chosen negative, with ties counted as one half.
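The two interpretations can be checked against each other on a toy example. The sketch below is illustrative only: the function names and the six-point dataset are made up for this example, and it uses plain Python rather than any particular library.

```python
from itertools import product


def pairwise_auc(y_true, scores):
    """Ranking view: P(score of positive > score of negative), ties count 1/2."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p, n in product(pos, neg))
    return wins / (len(pos) * len(neg))


def roc_points(y_true, scores):
    """Curve view: (FPR, TPR) points as the threshold moves from high to low."""
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    ranked = sorted(zip(scores, y_true), reverse=True)
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        # Examples with the same score cross the threshold together.
        j = i
        while j < len(ranked) and ranked[j][0] == ranked[i][0]:
            if ranked[j][1] == 1:
                tp += 1
            else:
                fp += 1
            j += 1
        points.append((fp / n_neg, tp / n_pos))
        i = j
    return points


def trapezoid_area(points):
    """Area under a piecewise-linear curve given as (x, y) points."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Hypothetical toy data with one tie between a positive and a negative (0.8).
y_true = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.8, 0.6, 0.4, 0.1]

auc_rank = pairwise_auc(y_true, scores)
auc_curve = trapezoid_area(roc_points(y_true, scores))
print(auc_rank, auc_curve)  # both are 5/6, about 0.833
```

The pairwise version is quadratic in the number of examples, so it is useful for explanation rather than for large datasets.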
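Under severe class imbalance, PR-AUC is often more useful when the rare positive class is the one the product cares about, because precision directly measures the share of false positives among predicted positives. ROC-AUC can look deceptively good in this setting: FPR is divided by the number of negatives, so when negatives are abundant even a large absolute number of false positives barely moves it.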
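A constructed example makes the gap concrete. The sketch below assumes a hypothetical ranking of 1000 examples with 10 positives placed at ranks 5, 10, ..., 50, and uses average precision as one common step-wise summary of the PR curve; both the data and the helper names are invented for illustration.

```python
def pairwise_auc(y_true, scores):
    """ROC-AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(y_true, scores):
    """Mean of precision@rank over the ranks where a positive appears.

    Assumes all scores are distinct, which holds for the data below.
    """
    ranked = sorted(zip(scores, y_true), key=lambda t: -t[0])
    tp = 0
    total = 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            tp += 1
            total += tp / rank
    return total / tp


n = 1000
positive_ranks = {5 * k for k in range(1, 11)}  # ranks 5, 10, ..., 50
y_true = [1 if rank in positive_ranks else 0 for rank in range(1, n + 1)]
scores = [(n - rank) / n for rank in range(1, n + 1)]  # strictly decreasing

roc_auc = pairwise_auc(y_true, scores)
ap = average_precision(y_true, scores)
print(round(roc_auc, 3), round(ap, 3))  # about 0.978 and 0.2
```

Every positive sits in the top 5% of the ranking, so ROC-AUC is close to 1, yet four out of five items retrieved at each positive's rank are false positives, which is what average precision reports.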
If you pass already-binarized predictions instead of scores, the ROC curve can still be computed, but it has only one meaningful interior operating point plus the endpoints (0, 0) and (1, 1). The ranking information across thresholds is lost, so the result is usually less informative than one computed from raw scores or probabilities.
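The effect is easy to see numerically. In the sketch below (toy data and function names are assumptions made for illustration), the area under the three-point curve through (0, 0), (FPR, TPR), and (1, 1) works out to (TPR + 1 - FPR) / 2, which is the balanced accuracy at the chosen threshold.

```python
def pairwise_auc(y_true, scores):
    """ROC-AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def binary_roc_auc(y_true, y_pred):
    """Area under the ROC 'curve' of hard 0/1 predictions."""
    tp = sum(1 for y, p in zip(y_true, y_pred) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(y_true, y_pred) if y == 0 and p == 1)
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    tpr, fpr = tp / n_pos, fp / n_neg
    # Trapezoid area under (0, 0) -> (fpr, tpr) -> (1, 1).
    return (tpr + (1 - fpr)) / 2


# Hypothetical toy data with distinct scores.
y_true = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.1]
labels = [1 if s >= 0.5 else 0 for s in scores]  # binarized at 0.5

auc_scores = pairwise_auc(y_true, scores)         # 8/9, about 0.889
auc_labels = binary_roc_auc(y_true, labels)       # 5/6, about 0.833
auc_labels_pairwise = pairwise_auc(y_true, labels)
print(auc_scores, auc_labels, auc_labels_pairwise)
```

The pairwise formula applied to the hard labels gives the same 5/6, because every pair that lands on the same side of the threshold becomes a tie worth one half.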
Theory
ROC-AUC is threshold-free only when the input preserves score ordering.
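Because only the ordering matters, any strictly increasing transform of the scores leaves ROC-AUC unchanged, even when it ruins the probabilities themselves. The sketch below illustrates this with made-up data; the Brier score is used here only as an assumed example of a calibration-sensitive metric for contrast.

```python
def pairwise_auc(y_true, scores):
    """ROC-AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def brier_score(y_true, probs):
    """Mean squared error between predicted probabilities and 0/1 labels."""
    return sum((p - y) ** 2 for y, p in zip(y_true, probs)) / len(y_true)


# Hypothetical toy data.
y_true = [1, 1, 0, 1, 0, 0]
probs = [0.9, 0.8, 0.7, 0.6, 0.4, 0.1]
squashed = [p / 10 for p in probs]  # same ordering, very different values

auc_raw = pairwise_auc(y_true, probs)
auc_squashed = pairwise_auc(y_true, squashed)
brier_raw = brier_score(y_true, probs)
brier_squashed = brier_score(y_true, squashed)

print(auc_raw == auc_squashed)     # True: ordering is preserved
print(brier_raw < brier_squashed)  # True: the squashed values fit the labels worse
```

This is also why a high ROC-AUC says nothing about whether the predicted probabilities are calibrated.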
Common mistakes
- Passing class labels when scores are available.
- Treating ROC-AUC as a measure of probability calibration.
- Ignoring PR-AUC on rare-positive tasks.
How to answer in an interview
- Give both the curve and the pairwise-ranking interpretations.
- Say what information binarization destroys.