Question · Medium · classification-metrics · Metrics question from a post-interview debrief · Wheely

Fraud classifier metrics under asymmetric error costs

Short answer

Use ranking metrics such as PR-AUC/ROC-AUC for model comparison, but choose the operating threshold by business cost, capacity and required precision/recall. With rare fraud, PR-AUC and precision at review capacity are often more informative than accuracy.
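To make the contrast with accuracy concrete, here is a minimal sketch in plain Python on made-up data. The labels, scores, and the helper names `accuracy` and `average_precision` are illustrative assumptions, not part of the original answer. Average precision is used as a common step-wise estimate of PR-AUC.

```python
def accuracy(y_true, y_pred):
    """Share of predictions that match the labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def average_precision(y_true, scores):
    """Mean of precision measured at the rank of each positive.

    A simple estimate of PR-AUC; ties in scores are broken by input order.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits, total = 0, 0.0
    for rank, i in enumerate(order, start=1):
        if y_true[i] == 1:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0


# Hypothetical data: 100 transactions, 2 of them fraud (2% positives).
y_true = [1, 0, 0, 1] + [0] * 96
scores = [0.9, 0.8, 0.7, 0.6] + [0.1] * 96

# A "model" that never flags fraud still gets 98% accuracy.
all_negative = [0] * len(y_true)
print(accuracy(y_true, all_negative))      # 0.98

# The scorer ranks the two frauds at positions 1 and 4: AP = (1/1 + 2/4) / 2.
print(average_precision(y_true, scores))   # 0.75
```

The always-negative baseline catches no fraud at all, yet its accuracy looks excellent, while the ranking metric reflects how well the positives are actually separated.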

Full explanation

Fraud data is usually highly imbalanced, so accuracy is a weak metric: a model can look accurate by predicting "not fraud" almost everywhere. For model comparison, use ROC-AUC if class balance is moderate, and PR-AUC, precision@k, recall@k or lift when positives are rare.

The production threshold should be chosen from the business trade-off. A false positive may block a good user or send them to manual review; a false negative may let fraud through. If the approximate costs are known, choose the threshold that minimizes expected cost: expected cost = FP * cost_fp + FN * cost_fn. If there is a manual review team, use capacity constraints such as top-k alerts per day and track precision at that capacity. If regulation or user experience imposes limits, add guardrails such as a maximum false-positive rate for trusted users.

Always validate on a split that matches deployment time and population. Fraud patterns shift, so threshold and calibration should be monitored after launch.
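The expected-cost rule above can be sketched as a small threshold search over a validation set. This is a minimal illustration in plain Python. The data, the cost values, and the function names `expected_cost` and `best_threshold` are assumptions made for the example, and a score at or above the threshold is assumed to mean "flag as fraud".

```python
def expected_cost(y_true, scores, threshold, cost_fp, cost_fn):
    """Total cost FP * cost_fp + FN * cost_fn at a given threshold."""
    fp = sum(1 for t, s in zip(y_true, scores) if s >= threshold and t == 0)
    fn = sum(1 for t, s in zip(y_true, scores) if s < threshold and t == 1)
    return fp * cost_fp + fn * cost_fn


def best_threshold(y_true, scores, cost_fp, cost_fn):
    """Try every observed score (plus 'flag nothing') and keep the cheapest."""
    candidates = sorted(set(scores)) + [float("inf")]
    return min(
        candidates,
        key=lambda th: expected_cost(y_true, scores, th, cost_fp, cost_fn),
    )


# Hypothetical validation set: 3 frauds among 7 transactions.
y_true = [0, 0, 0, 1, 0, 1, 1]
scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.7, 0.9]

# Missing fraud is 10x worse than a false alarm: the threshold drops to 0.4,
# accepting one false positive to catch every fraud.
print(best_threshold(y_true, scores, cost_fp=1, cost_fn=10))   # 0.4

# A false alarm is 10x worse than missed fraud: the threshold rises to 0.7,
# accepting one missed fraud to avoid all false positives.
print(best_threshold(y_true, scores, cost_fp=10, cost_fn=1))   # 0.7
```

The same model yields different operating points depending only on the cost ratio, which is why the threshold is a business decision rather than a property of the classifier.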

Theory

The metric should match the decision: ranking quality, fixed-capacity review, or cost-sensitive automatic blocking are different objectives.
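For the fixed-capacity review objective, the relevant numbers are precision and recall among the top-k scored cases. The sketch below is one plain-Python way to compute them. The function name `precision_recall_at_k`, the sample data, and the capacity of 3 reviews are hypothetical.

```python
def precision_recall_at_k(y_true, scores, k):
    """Precision and recall among the k highest-scored cases."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    top = order[:k]
    hits = sum(y_true[i] for i in top)
    total_pos = sum(y_true)
    precision = hits / len(top) if top else 0.0
    recall = hits / total_pos if total_pos else 0.0
    return precision, recall


# Hypothetical day of traffic: 8 transactions, 3 frauds,
# and a review team that can handle 3 alerts.
y_true = [0, 1, 0, 1, 0, 0, 1, 0]
scores = [0.2, 0.9, 0.8, 0.7, 0.1, 0.3, 0.4, 0.05]

precision, recall = precision_recall_at_k(y_true, scores, k=3)
print(precision, recall)  # 2 of the 3 reviewed cases are fraud; 2 of 3 frauds are reviewed
```

Here k comes from review capacity rather than from a score cutoff, so the metric stays meaningful even if the score distribution drifts.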

Common mistakes

Optimizing accuracy on an imbalanced fraud dataset.
Reporting only ROC-AUC and never discussing the operating threshold.
Ignoring the cost difference between blocking good users and missing fraud.

How to answer in an interview

Ask what action follows the score: manual review, block, or soft friction.
Mention PR-AUC or precision@k when fraud is rare.