Fraud classifier metrics under asymmetric error costs

Short answer

Use ranking metrics such as PR-AUC or ROC-AUC to compare models, but choose the operating threshold from business cost, review capacity and the required precision/recall. With rare fraud, PR-AUC and precision at review capacity are often more informative than accuracy.

Full breakdown

Fraud is usually heavily imbalanced, so accuracy is a weak metric: at a 1% fraud rate, a model that predicts "not fraud" for every transaction is 99% accurate and catches nothing. For model comparison, use ROC-AUC if class balance is moderate, and PR-AUC, precision@k, recall@k or lift when positives are rare.
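To make the contrast concrete, here is a minimal sketch in plain Python. The toy labels and scores are invented for illustration, and the helper names (accuracy, precision_recall_at_k, average_precision) are hypothetical, not taken from any library. The average-precision helper is one common rank-based estimate of PR-AUC; ties are broken by input order, so library implementations may give slightly different values.

```python
# Toy data (hypothetical): 1000 transactions, 10 of them fraud (1% base rate).
y_true = [1] * 10 + [0] * 990
fraud_scores = [0.95, 0.9, 0.85, 0.8, 0.6, 0.55, 0.3, 0.2, 0.1, 0.05]
legit_scores = [(i % 100) / 200 for i in range(990)]  # all below 0.5
scores = fraud_scores + legit_scores


def ranked_indices(scores):
    """Indices sorted from highest to lowest score."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


def accuracy(y_true, y_pred):
    """Share of predictions that match the label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def precision_recall_at_k(y_true, scores, k):
    """Precision and recall among the k highest-scored items."""
    hits = sum(y_true[i] for i in ranked_indices(scores)[:k])
    return hits / k, hits / sum(y_true)


def average_precision(y_true, scores):
    """Mean of precision measured at the rank of each positive."""
    hits, total = 0, 0.0
    for rank, i in enumerate(ranked_indices(scores), start=1):
        if y_true[i] == 1:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0


always_legit = [0] * len(y_true)
print(accuracy(y_true, always_legit))             # 0.99, yet no fraud is caught
print(precision_recall_at_k(y_true, scores, 10))  # (0.6, 0.6)
print(average_precision(y_true, scores))
```

The "always legit" baseline scores 0.99 on accuracy while the ranking metrics expose how many of the top alerts are actually fraud.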

The production threshold should be chosen from the business trade-off. A false positive may block or review a good user; a false negative may let fraud through. If the approximate costs are known, choose the threshold that minimizes expected cost:

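expected cost = FP * cost_fp + FN * cost_fn,

where FP and FN are the false-positive and false-negative counts at that threshold, and cost_fp and cost_fn are the costs of a single error of each type.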
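The cost rule above can be sketched as a scan over candidate thresholds on a labelled validation set. The data, the cost values and the function names (expected_cost, best_threshold) are assumptions for illustration; a transaction is flagged when its score is at or above the threshold.

```python
# Hypothetical validation set: labels (1 = fraud) and model scores.
y_true = [0, 0, 0, 0, 1, 0, 1, 1]
scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.9]


def expected_cost(y_true, scores, threshold, cost_fp, cost_fn):
    """FP * cost_fp + FN * cost_fn when flagging score >= threshold."""
    fp = sum(1 for t, s in zip(y_true, scores) if s >= threshold and t == 0)
    fn = sum(1 for t, s in zip(y_true, scores) if s < threshold and t == 1)
    return fp * cost_fp + fn * cost_fn


def best_threshold(y_true, scores, cost_fp, cost_fn):
    """Try every observed score, plus 'flag nothing', and keep the cheapest."""
    candidates = sorted(set(scores)) + [float("inf")]
    return min(
        candidates,
        key=lambda th: expected_cost(y_true, scores, th, cost_fp, cost_fn),
    )


# Missing fraud is 10x worse than a false alarm: the threshold moves down.
print(best_threshold(y_true, scores, cost_fp=1, cost_fn=10))  # 0.5
# A false alarm is 10x worse than missed fraud: the threshold moves up.
print(best_threshold(y_true, scores, cost_fp=10, cost_fn=1))  # 0.7
```

The same scores produce different operating points once the cost ratio changes, which is why the threshold is a business decision rather than a property of the model.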

If there is a manual review team, use capacity constraints such as top-k alerts per day and track precision at that capacity. If regulation or user experience imposes limits, add guardrails such as maximum false-positive rate for trusted users.
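A capacity-driven threshold with a guardrail might look like the sketch below. Everything here is assumed for illustration: the toy data, the trusted-user flag, the 30% guardrail limit and the helper names. The threshold is simply the score of the k-th highest alert, so tied scores at the cutoff could flag more than k items.

```python
# Hypothetical daily batch: labels, scores and a "trusted user" segment flag.
y_true = [0, 0, 0, 0, 1, 0, 1, 1]
scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.9]
trusted = [True, True, False, True, False, True, False, False]


def capacity_threshold(scores, k):
    """Score of the k-th highest item; flag everything at or above it."""
    return sorted(scores, reverse=True)[k - 1]


def precision_at_threshold(y_true, scores, threshold):
    """Share of flagged items that are fraud."""
    flagged = [t for t, s in zip(y_true, scores) if s >= threshold]
    return sum(flagged) / len(flagged) if flagged else 0.0


def fpr_in_segment(y_true, scores, segment, threshold):
    """False-positive rate among legitimate users inside the segment."""
    negatives = [
        s for t, s, in_seg in zip(y_true, scores, segment) if in_seg and t == 0
    ]
    if not negatives:
        return 0.0
    return sum(1 for s in negatives if s >= threshold) / len(negatives)


REVIEW_CAPACITY = 3     # assumed: analysts can review 3 alerts per day
MAX_TRUSTED_FPR = 0.30  # assumed guardrail, not a recommended value

threshold = capacity_threshold(scores, REVIEW_CAPACITY)
precision = precision_at_threshold(y_true, scores, threshold)
trusted_fpr = fpr_in_segment(y_true, scores, trusted, threshold)
guardrail_ok = trusted_fpr <= MAX_TRUSTED_FPR

print(threshold, precision, trusted_fpr, guardrail_ok)
```

Here the review queue holds three alerts, two of which are fraud, and one in four legitimate trusted users is flagged, which stays inside the assumed limit.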

Always validate on a split that matches deployment time and population. Fraud patterns shift, so threshold and calibration should be monitored after launch.
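One simple way to approximate "matches deployment time" is an out-of-time split: fit and tune on older transactions, then evaluate on the most recent ones. The sketch below assumes a hypothetical list of records with a ts field and an arbitrary cutoff date; the field names and the time_split helper are illustrative only.

```python
from datetime import date

# Hypothetical transaction log; field names are illustrative.
transactions = [
    {"ts": date(2024, 1, 5), "score": 0.20, "is_fraud": 0},
    {"ts": date(2024, 1, 20), "score": 0.85, "is_fraud": 1},
    {"ts": date(2024, 2, 11), "score": 0.10, "is_fraud": 0},
    {"ts": date(2024, 3, 2), "score": 0.70, "is_fraud": 1},
    {"ts": date(2024, 3, 15), "score": 0.30, "is_fraud": 0},
]


def time_split(rows, cutoff):
    """Older rows go to training, rows on or after the cutoff to the holdout."""
    train = [r for r in rows if r["ts"] < cutoff]
    holdout = [r for r in rows if r["ts"] >= cutoff]
    return train, holdout


def fraud_rate(rows):
    """Base rate of fraud in a set of rows; worth tracking for drift."""
    return sum(r["is_fraud"] for r in rows) / len(rows) if rows else 0.0


train, holdout = time_split(transactions, cutoff=date(2024, 3, 1))
print(len(train), len(holdout))
print(fraud_rate(train), fraud_rate(holdout))
```

Comparing the base rate and the chosen metric between the two periods gives an early hint of drift; the same comparison can be repeated on fresh data after launch.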

Theory

The metric should match the decision: ranking quality, fixed-capacity review, or cost-sensitive automatic blocking are different objectives.

Common mistakes

  • Optimizing accuracy on an imbalanced fraud dataset.
  • Reporting only ROC-AUC and never discussing the operating threshold.
  • Ignoring the cost difference between blocking good users and missing fraud.

How to answer in an interview

  • Ask what action follows the score: manual review, block, or soft friction.
  • Mention PR-AUC or precision@k when fraud is rare.