How to define an LTV target and a first cohort baseline
Answer it yourself
First formulate your answer as you would in an interview, then read the breakdown and grade yourself.
Short answer
First define the horizon, the cohort timestamp, and the revenue unit. A safe baseline uses only information available at prediction time to predict the revenue actually realized over the chosen horizon.
Full breakdown
Start by fixing the business question. For example, "for users acquired in week W, estimate gross revenue over 365 days from acquisition." Decide whether LTV is gross revenue, net revenue after platform fees, or contribution after acquisition cost. Do not mix those definitions.
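To make the difference between the three definitions concrete, here is a minimal sketch. The class `UserRevenue`, its field names, and the numbers are illustrative assumptions, and treating contribution as net revenue minus acquisition cost is one possible convention, not the only one.

```python
from dataclasses import dataclass


@dataclass
class UserRevenue:
    gross_revenue: float      # total paid by the user within the horizon
    platform_fee_rate: float  # share kept by the platform (illustrative value)
    acquisition_cost: float   # CAC attributed to this user (illustrative value)


def gross_ltv(u: UserRevenue) -> float:
    return u.gross_revenue


def net_ltv(u: UserRevenue) -> float:
    # Net revenue: gross revenue minus the platform's share.
    return u.gross_revenue * (1 - u.platform_fee_rate)


def contribution_ltv(u: UserRevenue) -> float:
    # Assumption: contribution is net revenue minus acquisition cost.
    return net_ltv(u) - u.acquisition_cost


user = UserRevenue(gross_revenue=100.0, platform_fee_rate=0.25, acquisition_cost=20.0)
targets = {
    "gross": gross_ltv(user),
    "net": net_ltv(user),
    "contribution": contribution_ltv(user),
}
print(targets)
```

The same user yields three different target values, which is why the chosen definition should be stated once and used consistently for labels, baselines, and reporting.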
Then build training examples as historical snapshots. Each example has a prediction timestamp, features available at that timestamp, and a future target measured over the remaining or full horizon. If the model is meant to predict for new users after one week, it cannot use transaction events from week two onward.
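A snapshot of this kind could be sketched as follows. The function `build_snapshot`, the feature names, the 7-day prediction delay, and the sample transactions are hypothetical; the point is only that features stop at the prediction timestamp while the target covers the horizon.

```python
from datetime import datetime, timedelta

PREDICTION_DELAY = timedelta(days=7)  # assumption: predict one week after acquisition
HORIZON = timedelta(days=365)         # assumption: 365-day horizon from acquisition


def build_snapshot(acquired_at, transactions):
    """Build one training example from (timestamp, amount) pairs."""
    prediction_ts = acquired_at + PREDICTION_DELAY
    horizon_end = acquired_at + HORIZON

    # Features: only events strictly before the prediction timestamp.
    seen = [amt for ts, amt in transactions if acquired_at <= ts < prediction_ts]
    # Target: all revenue realized inside the horizon window.
    target_full = sum(amt for ts, amt in transactions if acquired_at <= ts < horizon_end)

    return {
        "prediction_ts": prediction_ts,
        "features": {
            "revenue_first_week": sum(seen),
            "purchases_first_week": len(seen),
        },
        "target_full": target_full,
        # Remaining-horizon variant: revenue still to come after prediction time.
        "target_remaining": target_full - sum(seen),
    }


acquired_at = datetime(2023, 1, 1)
transactions = [
    (datetime(2023, 1, 3), 10.0),   # inside the first week: usable as a feature
    (datetime(2023, 1, 20), 20.0),  # after prediction time: target only
    (datetime(2024, 2, 1), 40.0),   # outside the 365-day horizon: ignored
]
snapshot = build_snapshot(acquired_at, transactions)
print(snapshot)
```

Deriving both features and target from the same event log with explicit time bounds makes leakage easy to audit: any feature that reads past `prediction_ts` is a bug.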
A first baseline can be simple: historical average LTV by acquisition channel, country, subscription product, campaign, app category and acquisition week. A model baseline can be linear regression or gradient boosting on the same snapshot-safe features. Evaluate on later cohorts, not randomly mixed rows, because marketing, seasonality and product behavior drift over time.
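As a rough illustration, a segment-average baseline with a time-based split might look like this. The rows, the split date, the single `channel` segment, and the fallback to the global mean for unseen segments are all assumptions made for the sketch.

```python
from collections import defaultdict
from datetime import date
from statistics import mean

# (cohort week start, acquisition channel, realized LTV) -- toy data
rows = [
    (date(2023, 1, 2), "ads", 10.0),
    (date(2023, 1, 2), "organic", 20.0),
    (date(2023, 1, 9), "ads", 14.0),
    (date(2023, 1, 9), "organic", 24.0),
    (date(2023, 1, 16), "ads", 18.0),
    (date(2023, 1, 16), "organic", 22.0),
]

# Time-based split: train on earlier cohorts, evaluate on later ones.
SPLIT = date(2023, 1, 16)
train = [r for r in rows if r[0] < SPLIT]
test = [r for r in rows if r[0] >= SPLIT]

# Fit: average LTV per channel, computed on training cohorts only.
by_channel = defaultdict(list)
for _, channel, ltv in train:
    by_channel[channel].append(ltv)
segment_mean = {channel: mean(values) for channel, values in by_channel.items()}
global_mean = mean([ltv for _, _, ltv in train])


def predict(channel):
    # Fall back to the global mean for segments not seen in training.
    return segment_mean.get(channel, global_mean)


# Mean absolute error on the later cohorts.
mae = mean([abs(predict(channel) - ltv) for _, channel, ltv in test])
print(segment_mean, global_mean, mae)
```

Any model baseline can then be scored on the same later cohorts, so the comparison against the segment average is like for like.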
For partially observed cohorts, be explicit about censoring. Either wait until the full target is observed for training labels, use survival/retention modeling, or use a mature-cohort target and communicate that young-cohort estimates have higher uncertainty.
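The "wait until the full target is observed" option can be sketched as a maturity filter. The user IDs, acquisition dates, and the `AS_OF` labeling date below are hypothetical.

```python
from datetime import date, timedelta

HORIZON = timedelta(days=365)  # assumption: 365-day LTV horizon
AS_OF = date(2024, 6, 1)       # hypothetical date on which labels are built

acquired_on = {
    "u1": date(2023, 1, 15),
    "u2": date(2023, 3, 1),
    "u3": date(2023, 12, 1),
}


def is_mature(acquired, as_of=AS_OF, horizon=HORIZON):
    # A user is mature once the full horizon has elapsed by the labeling date.
    return acquired + horizon <= as_of


# Only mature users have a fully observed target and can provide training labels.
trainable = sorted(u for u, d in acquired_on.items() if is_mature(d))

# Censored users: record how many days were observed instead of treating
# their partial revenue as if it were the full-horizon LTV.
censored = {u: (AS_OF - d).days for u, d in acquired_on.items() if not is_mature(d)}
print(trainable, censored)
```

Keeping the observed-days count for censored users leaves the door open for a survival or retention model later, instead of silently discarding or mislabeling them.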
Theory
LTV modeling is mostly a target-definition and leakage-control problem before it is a modeling problem.
Common mistakes
- Training on features that are only known after the prediction timestamp.
- Comparing cohorts with different observation windows as if they were equally mature.
- Mixing revenue LTV and profit after CAC without saying which one is optimized.
- Using a random row split instead of a time/cohort split.
How to answer in an interview
- Ask what horizon and prediction timestamp matter to the business.
- Say how you would create historical snapshots.