Question · Medium · classification-metrics · Metrics question from a technical interview · Diagnocat

ROC-AUC, the ranking interpretation, and binarized predictions

Short answer

ROC-AUC measures ranking quality; equivalently, it is the area under the curve of TPR against FPR as the decision threshold varies. PR-AUC is often more informative when positives are rare. With already-binarized predictions, the ROC curve has only one interior operating point.

Full explanation

ROC-AUC can be explained in two ways. As a curve: sweep the decision threshold over the model scores, plot TPR against FPR, and take the area underneath. As a ranking metric: ROC-AUC is the probability that a randomly chosen positive receives a higher score than a randomly chosen negative, with each tie counted as half.

Under severe class imbalance, PR-AUC is often more useful when the rare positive class is the one the product cares about, because precision directly reflects the share of false positives among predicted positives. ROC-AUC can look deceptively good in that setting: with many true negatives, even a large absolute number of false positives yields a small FPR.

If you pass already-binarized predictions instead of scores, the ROC curve can still be computed, but it has only one meaningful interior operating point plus the endpoints (0, 0) and (1, 1). It loses the ranking information across thresholds and is usually less informative than a curve built from raw scores or probabilities.
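The two interpretations above can be checked against each other on a toy example. The following is a minimal sketch in plain Python, not a reference implementation: the function names `pairwise_auc` and `curve_auc` and the data are made up for illustration, and ties are assumed to count as half a win.

```python
def pairwise_auc(y_true, scores):
    """Share of (positive, negative) pairs where the positive scores higher.

    A tie contributes 0.5 to the count.
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def curve_auc(y_true, scores):
    """Area under the ROC curve: threshold sweep plus the trapezoidal rule."""
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    points = [(0.0, 0.0)]  # (FPR, TPR) with a threshold above every score
    # One threshold per distinct score, from the highest to the lowest.
    for t in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(y_true, scores) if s >= t and y == 1)
        fp = sum(1 for y, s in zip(y_true, scores) if s >= t and y == 0)
        points.append((fp / n_neg, tp / n_pos))
    # The lowest threshold accepts everything, so the last point is (1, 1).
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Illustrative data: 4 positives, 4 negatives, one tie at 0.4.
y_true = [0, 0, 1, 0, 1, 1, 0, 1]
scores = [0.1, 0.3, 0.35, 0.4, 0.4, 0.7, 0.2, 0.9]

auc_rank = pairwise_auc(y_true, scores)   # 14.5 winning pairs out of 16
auc_curve = curve_auc(y_true, scores)
print(auc_rank, auc_curve)                # 0.90625 0.90625
```

Both functions return the same number because the trapezoidal rule assigns exactly half credit to tied positive-negative pairs.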

Theory

ROC-AUC is threshold-free only when the input preserves the score ordering; hard labels already encode one fixed threshold.
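To illustrate what is lost when the ordering is collapsed, the sketch below computes the pairwise ROC-AUC once on scores and once on labels obtained with a threshold of 0.5. The helper `pairwise_auc`, the data, and the threshold are illustrative assumptions. With hard labels the single-point ROC area reduces to (TPR + 1 - FPR) / 2, which is balanced accuracy at that threshold.

```python
def pairwise_auc(y_true, scores):
    """Share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


y_true = [0, 0, 1, 0, 1, 1, 0, 1]
scores = [0.1, 0.3, 0.35, 0.4, 0.4, 0.7, 0.2, 0.9]

# Binarize at an arbitrary threshold (0.5 is an assumption for this example).
labels = [1 if s >= 0.5 else 0 for s in scores]

auc_scores = pairwise_auc(y_true, scores)  # uses the full ranking
auc_labels = pairwise_auc(y_true, labels)  # only two distinct "scores" remain

# The single operating point that the hard labels define.
n_pos = sum(y_true)
n_neg = len(y_true) - n_pos
tpr = sum(1 for y, l in zip(y_true, labels) if y == 1 and l == 1) / n_pos
fpr = sum(1 for y, l in zip(y_true, labels) if y == 0 and l == 1) / n_neg
balanced_accuracy = (tpr + (1 - fpr)) / 2

print(auc_scores)                     # 0.90625
print(auc_labels, balanced_accuracy)  # 0.75 0.75
```

The score-based value credits the model for ranking positives above negatives below the threshold; the label-based value cannot see that and treats every such pair as a tie.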

Common mistakes

Passing class labels to ROC-AUC when scores are available.
Treating ROC-AUC as a measure of probability calibration.
Ignoring PR-AUC on tasks with rare positives.
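Two of the mistakes above can be made concrete with a small sketch. The first part shows that a strictly increasing transform of the scores (here cubing, chosen arbitrarily) leaves ROC-AUC unchanged while a calibration-sensitive metric such as the Brier score moves. The second part uses hypothetical confusion counts to show a low FPR coexisting with low precision. All names and numbers are illustrative assumptions.

```python
def pairwise_auc(y_true, scores):
    """Share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def brier_score(y_true, probs):
    """Mean squared error between predicted probabilities and 0/1 outcomes."""
    return sum((p - y) ** 2 for y, p in zip(y_true, probs)) / len(y_true)


# Part 1: ROC-AUC depends only on the ordering, not on the probability values.
y_true = [0, 0, 1, 0, 1, 1, 0, 1]
probs = [0.1, 0.3, 0.35, 0.4, 0.4, 0.7, 0.2, 0.9]
cubed = [p ** 3 for p in probs]  # strictly increasing on [0, 1]: same order

auc_before = pairwise_auc(y_true, probs)
auc_after = pairwise_auc(y_true, cubed)
brier_before = brier_score(y_true, probs)
brier_after = brier_score(y_true, cubed)
print(auc_before == auc_after)      # True: the ranking is identical
print(brier_after > brier_before)   # True: the probabilities got worse

# Part 2: hypothetical counts at one threshold on an imbalanced dataset.
n_pos, n_neg = 10, 10_000
tp, fp = 8, 100
tpr = tp / n_pos            # 0.8
fpr = fp / n_neg            # 0.01, looks excellent on a ROC plot
precision = tp / (tp + fp)  # 8 / 108, most alerts are false
print(tpr, fpr, round(precision, 3))
```

In the second part the ROC operating point (FPR 0.01, TPR 0.8) sits near the top-left corner, yet fewer than one in ten predicted positives is correct, which is the kind of gap a precision-recall view exposes.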

How to answer in an interview

Give both curve and pairwise-ranking interpretations.
Say what information binarization destroys.