Question · Medium · model-debugging · Production ML question from a post-interview debrief · Adapty

Finding slices where the LTV model fails

Short answer

Compute residuals, slice them by app/category/channel/tenure/cohort, compare high-error and low-error populations, then inspect data quality, feature distributions and calibration.

Full breakdown

Start with residuals. Define the signed error as prediction minus realized LTV and the absolute error as its magnitude: the signed error exposes systematic over- or under-prediction, and the absolute error measures the size of the miss. Aggregate both by app, app category, acquisition channel, country, cohort date, subscription product, user tenure, and data volume. Look at both the mean error and the tail errors, because a few bad apps can hide inside a good global average.

Then compare high-error slices with low-error slices. Check whether feature distributions differ, labels are delayed or censored, events are missing, prices changed, campaigns shifted, or the app has unusual retention dynamics.

For linear models, inspect coefficients and feature contributions. For tree models, use SHAP or similar local explanations, but still validate the findings against raw data plots.

Finally, decide on the fix. It might be data repair, a new feature, a segment-specific model, calibrated lower bounds for uncertain slices, reweighting, or simply marking predictions as low-confidence until enough data accumulates.
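
The first step, residuals aggregated per slice, can be sketched with pandas. This is a minimal illustration on made-up data: the column names (app, channel, pred_ltv, real_ltv) and the helper slice_report are assumptions for the example, not part of any real schema.

```python
import pandas as pd

# Hypothetical toy data: one row per user with predicted and realized LTV.
df = pd.DataFrame({
    "app": ["a", "a", "a", "b", "b", "b", "c", "c"],
    "channel": ["ads", "organic", "ads", "ads", "organic", "ads", "organic", "ads"],
    "pred_ltv": [10.0, 12.0, 9.0, 30.0, 28.0, 35.0, 5.0, 6.0],
    "real_ltv": [11.0, 10.0, 9.5, 10.0, 12.0, 8.0, 5.5, 5.0],
})

# Signed error shows the direction of the miss, absolute error its size.
df["signed_err"] = df["pred_ltv"] - df["real_ltv"]
df["abs_err"] = df["signed_err"].abs()


def slice_report(frame, by):
    """Per-slice row count, mean signed/absolute error and a tail quantile."""
    out = frame.groupby(by).agg(
        n=("abs_err", "size"),
        mean_signed=("signed_err", "mean"),
        mean_abs=("abs_err", "mean"),
        p90_abs=("abs_err", lambda s: s.quantile(0.9)),
    )
    # Worst slices first.
    return out.sort_values("mean_abs", ascending=False)


by_app = slice_report(df, "app")
by_channel = slice_report(df, "channel")
print(by_app)
print(by_channel)
```

Keeping the row count n next to the error columns helps separate slices that are truly bad from slices that only look bad because they are small.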

Theory

Error analysis is a supervised data audit: residuals define the failure set, and slice comparisons turn it into hypotheses.
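
One way to turn a failure set into hypotheses is to rank features by how much their distribution differs between high-error and low-error rows. The sketch below uses a standardized mean difference on synthetic data; the feature names, the 90th-percentile cutoff, and the choice of this statistic are all assumptions for illustration, and other distance measures would work as well.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n = 2000

# Hypothetical features; the names are illustrative only.
df = pd.DataFrame({
    "trial_share": rng.uniform(0, 1, n),
    "price_usd": rng.normal(10, 2, n),
    "sessions_d7": rng.poisson(5, n).astype(float),
})
# Synthetic absolute error that grows with trial_share, so the driver is known.
df["abs_err"] = 1.0 + 5.0 * df["trial_share"] + rng.normal(0, 0.5, n)

# Failure set: worst 10% of rows by absolute error. The rest is the reference set.
threshold = df["abs_err"].quantile(0.9)
hi = df[df["abs_err"] >= threshold]
lo = df[df["abs_err"] < threshold]

features = ["trial_share", "price_usd", "sessions_d7"]

# Standardized mean difference: gap in means scaled by a pooled standard deviation.
pooled_std = np.sqrt((hi[features].var() + lo[features].var()) / 2)
smd = (hi[features].mean() - lo[features].mean()) / pooled_std

# Order features by the magnitude of the shift, largest first.
smd = smd.reindex(smd.abs().sort_values(ascending=False).index)
print(smd)
```

A large shift is only a hypothesis: it says where to look in the raw data, not what the cause is.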

Common mistakes

Looking only at global MSE.
Clustering blindly before checking simple business slices.
Using explainability plots without validating raw data quality.
Fixing the model when the real issue is delayed or missing labels.
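
The last mistake can be checked cheaply before touching the model. A minimal sketch, assuming a hypothetical 365-day LTV horizon and made-up cohort data: cohorts younger than the horizon have incomplete labels, so their apparent over-prediction may come from the labels rather than from the model.

```python
import pandas as pd

HORIZON_DAYS = 365  # assumed LTV horizon, for illustration only
as_of = pd.Timestamp("2024-07-01")  # assumed evaluation date

# Hypothetical cohorts; real_ltv is the revenue observed so far.
df = pd.DataFrame({
    "cohort_date": pd.to_datetime(
        ["2023-01-15", "2023-06-01", "2024-01-10", "2024-05-20"]
    ),
    "pred_ltv": [20.0, 22.0, 21.0, 19.0],
    "real_ltv": [19.0, 23.0, 9.0, 3.0],
})

# A cohort is mature once the full horizon has elapsed since acquisition.
df["age_days"] = (as_of - df["cohort_date"]).dt.days
df["mature"] = df["age_days"] >= HORIZON_DAYS
df["signed_err"] = df["pred_ltv"] - df["real_ltv"]

# Compare bias on mature cohorts with bias on still-censored cohorts.
mature_bias = df.loc[df["mature"], "signed_err"].mean()
immature_bias = df.loc[~df["mature"], "signed_err"].mean()
print(f"mature bias: {mature_bias:.2f}, immature bias: {immature_bias:.2f}")
```

If the bias is concentrated in immature cohorts, evaluating only on matured labels is a safer first move than retraining.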

How to answer in an interview

Discuss signed residuals and absolute residuals separately.
Name several concrete slices and a concrete fix path.