Take the interview: Adapty: Technical interview

Question 1 (12 min)

How to set an LTV target and a first cohort baseline

Answer without hints

First say your answer out loud or outline it in bullet points.

Write a draft

Formulas, solution plan, risks and examples.

Compare with the walkthrough

Open the walkthrough only after your own attempt.

Short answer

First define horizon, cohort timestamp and revenue unit. A safe baseline uses only information available at prediction time, then predicts realized future revenue over the chosen horizon.

Detailed walkthrough

Start by fixing the business question. For example, "for users acquired in week W, estimate gross revenue over 365 days from acquisition." Decide whether LTV is gross revenue, net revenue after platform fees, or contribution after acquisition cost. Do not mix those definitions.

Then build training examples as historical snapshots. Each example has a prediction timestamp, features available at that timestamp, and a future target measured over the remaining or full horizon. If the model is meant to predict for new users after one week, it cannot use transaction events from week two onward.
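The snapshot construction can be sketched in a few lines. This is a minimal illustration, not a reference implementation: the function name `build_snapshot_example`, the `(timestamp, revenue)` event layout and the two toy features are assumptions made for the example.

```python
from datetime import datetime, timedelta


def build_snapshot_example(events, acquired_at, snapshot_days, horizon_days, data_cutoff):
    """Return (features, target) for one user, or None if the label is not mature.

    events: list of (timestamp, revenue) payment tuples (hypothetical layout).
    """
    snapshot_at = acquired_at + timedelta(days=snapshot_days)
    horizon_end = acquired_at + timedelta(days=horizon_days)

    # Skip users whose full horizon has not been observed yet.
    if horizon_end > data_cutoff:
        return None

    # Features: only events strictly before the prediction timestamp.
    past = [rev for ts, rev in events if acquired_at <= ts < snapshot_at]
    features = {"revenue_so_far": sum(past), "payments_so_far": len(past)}

    # Target: realized revenue over the full horizon from acquisition.
    target = sum(rev for ts, rev in events if acquired_at <= ts < horizon_end)
    return features, target


acquired = datetime(2023, 1, 1)
events = [
    (datetime(2023, 1, 3), 10.0),  # before the day-7 snapshot
    (datetime(2023, 2, 3), 10.0),  # after the snapshot, inside the horizon
    (datetime(2024, 2, 3), 10.0),  # outside the 365-day horizon
]
example = build_snapshot_example(events, acquired, 7, 365, datetime(2024, 6, 1))
```

The second payment contributes to the target but not to the features, which is exactly the separation the snapshot is meant to enforce.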

A first baseline can be simple: historical average LTV by acquisition channel, country, subscription product, campaign, app category and acquisition week. A model baseline can be linear regression or gradient boosting on the same snapshot-safe features. Evaluate on later cohorts, not randomly mixed rows, because marketing, seasonality and product behavior drift over time.
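As a rough sketch of the historical-average baseline with a time-based split, assuming rows are plain dictionaries with a hypothetical `cohort_week` field and an already matured `ltv` label:

```python
from collections import defaultdict


def fit_segment_means(rows, keys):
    """Average realized LTV per segment, with a global mean as fallback."""
    sums, counts = defaultdict(float), defaultdict(int)
    for row in rows:
        seg = tuple(row[k] for k in keys)
        sums[seg] += row["ltv"]
        counts[seg] += 1
    global_mean = sum(r["ltv"] for r in rows) / len(rows)
    return {seg: sums[seg] / counts[seg] for seg in sums}, global_mean


def predict(model, row, keys):
    """Segment mean if the segment was seen in training, else the global mean."""
    means, global_mean = model
    return means.get(tuple(row[k] for k in keys), global_mean)


rows = [
    {"cohort_week": 1, "channel": "ads", "country": "US", "ltv": 30.0},
    {"cohort_week": 1, "channel": "organic", "country": "US", "ltv": 50.0},
    {"cohort_week": 2, "channel": "ads", "country": "US", "ltv": 34.0},
    {"cohort_week": 2, "channel": "organic", "country": "US", "ltv": 54.0},
    {"cohort_week": 3, "channel": "ads", "country": "US", "ltv": 28.0},
    {"cohort_week": 3, "channel": "organic", "country": "DE", "ltv": 40.0},
]
keys = ("channel", "country")

# Time-based split: train on earlier cohorts, evaluate on the later one.
train = [r for r in rows if r["cohort_week"] <= 2]
test = [r for r in rows if r["cohort_week"] > 2]

model = fit_segment_means(train, keys)
mae = sum(abs(predict(model, r, keys) - r["ltv"]) for r in test) / len(test)
```

The unseen `("organic", "DE")` segment falls back to the global mean, which is one simple way to keep the baseline defined for new segments.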

For partially observed cohorts, be explicit about censoring. Either wait until the full target is observed for training labels, use survival/retention modeling, or use a mature-cohort target and communicate that young-cohort estimates have higher uncertainty.

Common mistakes

Train on features that are only known after the prediction timestamp.
Compare cohorts with different observation windows as if they were equally mature.
Mix revenue LTV and profit after CAC without saying which one is optimized.
Use random row split instead of time/cohort split.

How to say it in the interview

Ask what horizon and prediction timestamp matter to the business.
Say how you would create historical snapshots.

Question 2 (10 min)

LTV metrics when the business needs a conservative estimate

Short answer

MSE optimizes point estimates, but acquisition decisions need downside risk. Add lower-bound or quantile metrics, calibration checks and business-threshold error analysis.

Detailed walkthrough

MSE is useful for average point-prediction accuracy, but it treats overprediction and underprediction symmetrically. Marketing budget decisions are often asymmetric: overestimating LTV can make the company buy unprofitable traffic, while underestimating LTV may only leave some growth on the table.

A better evaluation includes risk-aware metrics. Train or evaluate lower quantiles with pinball loss, build prediction intervals, or report a conservative lower confidence bound for cohort LTV. Then check whether the lower bound is calibrated: for a claimed 10th percentile, roughly 10% of realized outcomes should fall below it.
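A small sketch of the two checks mentioned above, pinball loss and lower-bound calibration. The data is made up for illustration and the helper names are assumptions.

```python
def pinball_loss(y_true, y_pred, q):
    """Mean pinball (quantile) loss for quantile level q in (0, 1)."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        diff = y - p
        # Under-prediction is weighted by q, over-prediction by (1 - q).
        total += q * diff if diff >= 0 else (q - 1) * diff
    return total / len(y_true)


def share_below(y_true, y_pred):
    """Share of realized values that fall below the predicted bound."""
    return sum(y < p for y, p in zip(y_true, y_pred)) / len(y_true)


realized = [10.0, 20.0, 30.0, 40.0, 50.0, 60.0, 70.0, 80.0, 90.0, 100.0]
lower_bound = [12.0, 15.0, 22.0, 30.0, 41.0, 50.0, 55.0, 70.0, 75.0, 85.0]

loss_q10 = pinball_loss(realized, lower_bound, q=0.1)
coverage = share_below(realized, lower_bound)  # compare with the claimed 0.10
```

For a low quantile such as 0.1, predicting above the realized value costs nine times more per unit than predicting below it, which matches the asymmetric cost of overestimating LTV.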

Also evaluate decisions directly. If CAC is known, measure how often the model recommends buying traffic that later turns unprofitable, expected profit under the policy, and performance by channel/country/app segment. This connects model quality to the business action instead of only to a regression score.
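One way to evaluate the decision rather than the regression score is to replay a buying rule on matured cohorts. The rule "buy if predicted LTV exceeds `margin * CAC`", the field names and the numbers below are assumptions for the sketch.

```python
def evaluate_buy_policy(cohorts, margin=1.0):
    """Simulate 'buy traffic if predicted LTV > margin * CAC' on matured cohorts."""
    bought = [c for c in cohorts if c["predicted_ltv"] > margin * c["cac"]]
    profit = sum((c["realized_ltv"] - c["cac"]) * c["users"] for c in bought)
    bad = [c for c in bought if c["realized_ltv"] < c["cac"]]
    return {
        "cohorts_bought": len(bought),
        "unprofitable_share": len(bad) / len(bought) if bought else 0.0,
        "realized_profit": profit,
    }


cohorts = [
    {"channel": "ads", "users": 100, "cac": 20.0, "predicted_ltv": 30.0, "realized_ltv": 28.0},
    {"channel": "ads", "users": 200, "cac": 25.0, "predicted_ltv": 27.0, "realized_ltv": 22.0},
    {"channel": "influencer", "users": 50, "cac": 40.0, "predicted_ltv": 35.0, "realized_ltv": 45.0},
]
point_policy = evaluate_buy_policy(cohorts)
cautious_policy = evaluate_buy_policy(cohorts, margin=1.2)
```

In this toy data the cautious policy skips the cohort that later turned unprofitable. Neither policy buys the influencer cohort, whose LTV was underestimated, so the missed profit there is the kind of growth left on the table that the walkthrough describes.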

Common mistakes

Optimize only MSE and ignore overprediction risk.
Report confidence intervals without calibration checks.
Evaluate rows, but not the acquisition decision the model changes.
Use one global threshold while acquisition economics differ by channel.

How to say it in the interview

Mention quantile regression or lower confidence bounds.
Tie the metric to CAC and expected profit.

Question 3 (10 min)

Features from subscription history for partially observed users

Short answer

Use RFM-style features plus subscription-state features: total paid, number of renewals, tenure, active status, days since last payment, payment regularity, plan price and normalized rates.

Detailed walkthrough

For partially observed users, aggregate the transaction sequence as of the prediction timestamp. Useful features include total revenue so far, number of paid periods, tenure since acquisition, active subscription flag, current plan, last payment date, days since last payment, renewal streak length, gaps between payments and average monthly revenue so far.

The key is to preserve time shape, not only totals. Two users with the same number of payments are different if one paid recently and the other stopped months ago. Recency, active status, gaps and normalized features such as paid months divided by observed tenure help separate those cases.

Avoid leakage by computing all features only from events before the snapshot. Handle young users separately or include exposure time, because a user acquired yesterday has fewer possible renewals than a user observed for six months. For subscription LTV, survival-style features or hazard models can also be useful when churn timing matters.
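The feature ideas above can be illustrated with a short snapshot-time aggregation. The function name, the 30-day billing period and the five-day grace window used for the `looks_active` proxy are assumptions, not part of the original answer.

```python
from datetime import date


def subscription_features(payments, acquired_on, snapshot_on, period_days=30):
    """RFM-style features from payments made strictly before the snapshot date.

    payments: list of (date, amount); period_days is an assumed billing period.
    """
    seen = sorted((d, a) for d, a in payments if acquired_on <= d < snapshot_on)
    tenure_days = (snapshot_on - acquired_on).days
    dates = [d for d, _ in seen]
    gaps = [(b - a).days for a, b in zip(dates, dates[1:])]
    days_since_last = (snapshot_on - dates[-1]).days if dates else None
    return {
        "total_paid": sum(a for _, a in seen),
        "n_payments": len(seen),
        "tenure_days": tenure_days,
        "days_since_last_payment": days_since_last,
        "max_gap_days": max(gaps) if gaps else None,
        # Exposure-normalized frequency: payments per possible billing period.
        "payments_per_period": len(seen) / max(tenure_days / period_days, 1.0),
        # Proxy for active status: last payment within one period plus a grace window.
        "looks_active": days_since_last is not None and days_since_last <= period_days + 5,
    }


acquired, snapshot = date(2024, 1, 1), date(2024, 7, 1)
recent = subscription_features(
    [(date(2024, 4, 5), 10.0), (date(2024, 5, 5), 10.0), (date(2024, 6, 5), 10.0),
     (date(2024, 7, 5), 10.0)],  # the July payment is after the snapshot and is ignored
    acquired, snapshot,
)
churned = subscription_features(
    [(date(2024, 1, 5), 10.0), (date(2024, 2, 5), 10.0), (date(2024, 3, 5), 10.0)],
    acquired, snapshot,
)
```

Both users have the same totals, yet recency and the activity proxy separate them, which is the "time shape" point made above.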

Common mistakes

Use total payments only and ignore when those payments happened.
Compare users with different observation windows without tenure features.
Include future renewals in training features.
Treat currently active and long-churned users with the same totals as identical.

How to say it in the interview

Say “recency, frequency, monetary value, tenure and active state”.
Call out snapshot-time feature generation.

Question 4 (14 min)

Modeling LTV across many apps with embeddings and segments

Short answer

Start with a global model plus app/category features, then compare per-app or clustered variants by time-based validation, app-level residuals and enough-data thresholds.

Detailed walkthrough

A global model is usually the first production baseline because it shares statistical strength across apps and works for small clients. Add app-level features such as category, geography mix, monetization type, price points, marketing channels, product age and coarse app embeddings from text, metadata or visual assets when available.

Per-app models are useful only when an app has enough data and sufficiently different behavior to justify losing shared data. Clustered models are a compromise: apps with similar category, pricing and retention patterns share a model. Hierarchical or shrinkage approaches are often better than hard splits because small apps can borrow signal from the global prior.

The decision should be empirical. Use time-based validation and report metrics by app, app size and category. Compare global, global-plus-app-id, clustered and per-app variants. If splitting improves large apps but hurts small apps, use data-volume gates or blend predictions. Also check operational cost: many models require versioning, monitoring and rollback per segment.
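A data-volume gate of the kind described above could look roughly like this. The error table, the `min_rows` threshold and the variant names are hypothetical; in practice the errors would come from time-based validation on later cohorts.

```python
def choose_variant_per_app(errors, min_rows=1000):
    """Pick a model variant per app from validation errors on later cohorts.

    errors: {app_id: {"rows": int, "global": mae, "per_app": mae}}
    Apps below min_rows stay on the global model regardless of their score.
    """
    choice = {}
    for app_id, e in errors.items():
        if e["rows"] >= min_rows and e["per_app"] < e["global"]:
            choice[app_id] = "per_app"
        else:
            choice[app_id] = "global"
    return choice


validation_errors = {
    "big_app": {"rows": 50000, "global": 6.0, "per_app": 4.5},
    "mid_app": {"rows": 8000, "global": 5.0, "per_app": 5.4},
    "tiny_app": {"rows": 120, "global": 9.0, "per_app": 7.0},
}
variant = choose_variant_per_app(validation_errors)
```

The tiny app stays on the global model even though its per-app error looks lower, because an estimate from 120 rows is too noisy to trust.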

Common mistakes

Create per-app models before proving the global model fails by segment.
Ignore small apps with little data.
Use app id as a magic fix without cold-start handling.
Validate by random rows instead of future cohorts or held-out apps.

How to say it in the interview

Use “global model first, then segment by measured residuals” as the backbone.
Mention shrinkage or blending for low-data apps.

Question 5 (10 min)

Finding slices where the LTV model is wrong

Short answer

Compute residuals, slice them by app/category/channel/tenure/cohort, compare high-error and low-error populations, then inspect data quality, feature distributions and calibration.

Detailed walkthrough

Start with residuals: the signed error (prediction minus realized LTV) and the absolute error. Aggregate them by app, app category, acquisition channel, country, cohort date, subscription product, user tenure and data volume. Look at both mean error and tail errors because a few bad apps can hide inside a good global average.
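A minimal sketch of slicing residuals by one dimension, assuming rows are dictionaries with hypothetical `predicted` and `realized` fields:

```python
from collections import defaultdict


def residuals_by_slice(rows, key):
    """Mean signed error, mean absolute error and worst absolute error per slice."""
    grouped = defaultdict(list)
    for r in rows:
        grouped[r[key]].append(r["predicted"] - r["realized"])
    report = {}
    for value, res in grouped.items():
        report[value] = {
            "n": len(res),
            "mean_signed": sum(res) / len(res),
            "mean_abs": sum(abs(x) for x in res) / len(res),
            "max_abs": max(abs(x) for x in res),
        }
    return report


rows = [
    {"channel": "ads", "predicted": 40.0, "realized": 30.0},
    {"channel": "ads", "predicted": 50.0, "realized": 36.0},
    {"channel": "organic", "predicted": 25.0, "realized": 20.0},
    {"channel": "organic", "predicted": 20.0, "realized": 25.0},
]
report = residuals_by_slice(rows, "channel")
```

Here the `ads` slice is biased upward, while `organic` has zero mean signed error but a nonzero absolute error, which is why both numbers are worth reporting.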

Then compare high-error slices with low-error slices. Check whether feature distributions differ, labels are delayed or censored, some events are missing, prices changed, campaigns shifted, or the app has unusual retention dynamics. For linear models, inspect coefficients and feature contributions. For tree models, use SHAP or similar local explanations, but still validate with raw data plots.

Finally, decide the fix. It might be data repair, a new feature, a segment-specific model, calibrated lower bounds for uncertain slices, reweighting, or simply marking predictions as low-confidence until enough data accumulates.

Common mistakes

Look only at global MSE.
Cluster blindly before checking simple business slices.
Use explainability plots without validating raw data quality.
Fix the model when the issue is delayed or missing labels.

How to say it in the interview

Mention signed residuals and absolute residuals separately.
Name several concrete slices and a concrete fix path.

Question 6 (8 min)

When to try boosting for LTV prediction

Short answer

Try boosting when important effects are nonlinear, categorical, thresholded or interaction-heavy. Keep the linear baseline and promote boosting only if validation and business metrics improve.

Detailed walkthrough

Gradient boosting often helps tabular LTV problems because it can model nonlinear effects, thresholds and feature interactions without manually specifying every transformation. It is especially strong with categorical features, mixed numeric/categorical data, missing values and business rules such as country-channel-plan interactions.

It is less compelling if the dataset is tiny, the signal is mostly linear, latency or explainability requirements are strict, or the labels are too noisy to justify a more flexible model. A more flexible model can also overfit recent campaign artifacts.

The decision should be measured. Keep linear regression as the baseline, use time/cohort validation, compare MSE/MAE plus conservative business metrics, inspect calibration and slice performance, then run an online or shadow evaluation if the offline result is meaningful. Model complexity is justified by stable lift, not by the fact that boosting is fashionable.

Common mistakes

Switch to boosting without a baseline.
Validate randomly and accidentally leak future cohort behavior.
Ignore calibration and business threshold performance.
Assume a more complex model is automatically better for production.

How to say it in the interview

Name nonlinearities and interactions as the reason.
Say how you would validate the model before rollout.

Question 7 (14 min)

Cold start and a smooth LTV transition for a new app

Short answer

Use a prior from similar apps or a global model, then blend it with app-specific evidence using a data-volume or uncertainty-based weight that grows smoothly over time.

Detailed walkthrough

For day zero, use a prior. The prior can come from a global model using app metadata, a nearest-neighbor app cluster, category-level historical LTV, or a hierarchical model. It should also carry uncertainty, not only a point estimate.

As transactions accumulate, compute an app-specific estimate or model prediction from the new app's own data. Blend it with the prior:

$$\widehat{\mathrm{LTV}} = (1 - w_n) \cdot \mathrm{prior} + w_n \cdot \mathrm{app\_estimate}$$

where $w_n$ increases with effective sample size, observed paid users, cohort maturity or inverse uncertainty. A Bayesian shrinkage view is often cleaner: small samples stay close to the prior, while mature apps are allowed to move toward their own measured behavior.

To avoid dashboard jumps, smooth the weight schedule, cap daily movement if the product requires stability, and display confidence. Also monitor whether the prior is biased for particular categories so the cold-start estimate does not create systematic overbuying or underbuying of traffic.
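The blend and the movement cap can be sketched as follows. The weight schedule $w_n = n / (n + k)$ is one common shrinkage-style choice, not the only one; the prior strength `k`, the 5% daily cap and the function names are assumptions for the example.

```python
def blend_weight(n_paid_users, k=200.0):
    """Weight on the app's own estimate; k is an assumed prior strength in users."""
    return n_paid_users / (n_paid_users + k)


def blended_ltv(prior, app_estimate, n_paid_users, k=200.0):
    """Convex blend of the prior and the app-specific estimate."""
    w = blend_weight(n_paid_users, k)
    return (1 - w) * prior + w * app_estimate


def cap_daily_move(previous, proposed, max_rel_change=0.05):
    """Limit the displayed value to +/- max_rel_change of yesterday's value."""
    lo, hi = previous * (1 - max_rel_change), previous * (1 + max_rel_change)
    return min(max(proposed, lo), hi)


prior, app_estimate = 40.0, 60.0
day_zero = blended_ltv(prior, app_estimate, n_paid_users=0)
early = blended_ltv(prior, app_estimate, n_paid_users=200)
mature = blended_ltv(prior, app_estimate, n_paid_users=1800)

# Dashboard value moves toward the new blend but no more than 5% per day.
displayed = cap_daily_move(previous=day_zero, proposed=early)
```

With no data the estimate equals the prior, at `n = k` the two sources are weighted equally, and a mature app is dominated by its own measured behavior.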

Common mistakes

Replace the prior abruptly after the first few transactions.
Use nearest neighbor without validating category-level bias.
Hide uncertainty from the dashboard consumer.
Let one small early cohort dominate the app-level estimate.

How to say it in the interview

Use the words prior, evidence and weight schedule.
Mention effective sample size as a blending knob.

Question 8 (10 min)

Training and deployment pipeline for an LTV model

How do you turn a research notebook with an LTV model into reproducible training, version storage, deployment and an inference/prediction API?

Short answer

Move reusable code from notebooks into versioned pipelines: feature extraction, training, validation, artifact registry, batch or API serving, monitoring and rollback.

Detailed walkthrough

A notebook is good for exploration, but production needs a reproducible pipeline. The usual stages are data extraction, feature generation, train/validation split by time, model training, metric computation, artifact packaging, registry write and deployment or batch publish.

Store the model together with feature schema, preprocessing version, training data snapshot, hyperparameters, metrics and compatibility metadata. A service or batch job should load only approved artifacts. If the product needs online predictions, define an API contract and latency budget. If it needs dashboard values, a scheduled batch table may be simpler and more robust.

Monitoring should cover data freshness, row counts, feature drift, prediction distribution, model latency, error rates, realized LTV when labels mature, and business guardrails. Rollback should be possible by moving the active model pointer back to the previous approved artifact.
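The "active model pointer" idea can be illustrated with a toy in-memory registry. This is a sketch only: `ModelArtifact` and `ModelRegistry` are hypothetical classes, and a real setup would persist artifacts and the pointer in durable storage.

```python
from dataclasses import dataclass, field


@dataclass
class ModelArtifact:
    version: str
    feature_schema: tuple  # ordered feature names the model expects
    metrics: dict
    approved: bool = False


@dataclass
class ModelRegistry:
    artifacts: dict = field(default_factory=dict)
    history: list = field(default_factory=list)  # versions that were active, oldest first

    def register(self, artifact):
        self.artifacts[artifact.version] = artifact

    def promote(self, version):
        """Point serving at an approved artifact."""
        if not self.artifacts[version].approved:
            raise ValueError(f"{version} is not approved for serving")
        self.history.append(version)

    def rollback(self):
        """Move the active pointer back to the previous approved artifact."""
        if len(self.history) < 2:
            raise RuntimeError("no previous approved version to roll back to")
        self.history.pop()

    @property
    def active(self):
        return self.artifacts[self.history[-1]] if self.history else None


registry = ModelRegistry()
schema = ("revenue_so_far", "payments_so_far", "tenure_days")
registry.register(ModelArtifact("v1", schema, {"mae": 5.2}, approved=True))
registry.register(ModelArtifact("v2", schema, {"mae": 4.8}, approved=True))
registry.register(ModelArtifact("v3", schema, {"mae": 4.1}, approved=False))

registry.promote("v1")
registry.promote("v2")
active_before = registry.active.version
registry.rollback()
active_after = registry.active.version
```

Because serving reads only the pointer, a rollback does not require retraining or redeploying code, only moving the pointer to an artifact that was already approved.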

Common mistakes

Deploy code from a notebook without reproducible feature generation.
Store a model without its feature schema and training metadata.
Monitor task success but not data freshness or prediction drift.
Have no rollback path for a bad model publish.

How to say it in the interview

Separate research notebook, training pipeline and serving path.
Say "artifact plus schema plus metrics", not just "a pickle file".

Question 9 (9 min)

Production ML question

When are SQL window functions useful, how are they different from GROUP BY, and what ClickHouse MergeTree details matter when writing analytical queries?

Short answer

Window functions compute row-level analytics over a partition without collapsing rows. In ClickHouse MergeTree, partitioning and ORDER BY keys strongly affect data pruning and scan efficiency.

Detailed walkthrough

GROUP BY collapses many rows into one row per group. Window functions keep the original row granularity while computing values over a related set of rows, such as row number, rank, lag, lead, running sum, rolling average or per-user cumulative revenue.

For example, to rank events inside each user history, use ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY event_time). To compute cumulative revenue, use a window ordered by payment timestamp. These are hard to express cleanly with plain GROUP BY because the output still needs each row.
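The difference is easy to see on a tiny table. The sketch below uses Python's built-in `sqlite3` module as a stand-in engine (window functions need SQLite 3.25 or newer), not ClickHouse; the `payments` table and its columns are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE payments (user_id TEXT, event_time TEXT, revenue REAL)")
conn.executemany(
    "INSERT INTO payments VALUES (?, ?, ?)",
    [
        ("u1", "2024-01-01", 10.0),
        ("u1", "2024-02-01", 10.0),
        ("u1", "2024-03-01", 20.0),
        ("u2", "2024-01-15", 5.0),
    ],
)

# GROUP BY collapses the data to one row per user.
totals = conn.execute(
    "SELECT user_id, SUM(revenue) FROM payments GROUP BY user_id ORDER BY user_id"
).fetchall()

# Window functions keep every payment row and add per-user running values.
running = conn.execute(
    """
    SELECT user_id, event_time,
           ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY event_time) AS payment_no,
           SUM(revenue) OVER (PARTITION BY user_id ORDER BY event_time) AS cumulative_revenue
    FROM payments
    ORDER BY user_id, event_time
    """
).fetchall()
conn.close()
```

`totals` has two rows, one per user, while `running` still has all four payment rows, each annotated with its position and the cumulative revenue for that user.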

In ClickHouse, MergeTree tables are stored in sorted parts. The PARTITION BY expression controls coarse partition pruning, while ORDER BY defines the sorting key, which by default also serves as the primary key behind the sparse index. Queries are much faster when filters use partition keys and leading ORDER BY columns. For ReplacingMergeTree or similar engines, remember that background merges are asynchronous, so duplicates can remain visible until a merge happens, and FINAL can be expensive. Good analytical SQL should match the table's sort order instead of forcing wide full scans.

Common mistakes

Use GROUP BY when row-level output is still needed.
Forget PARTITION BY in a window and compute global ranks by accident.
Filter ClickHouse on columns unrelated to partition/order keys and expect pruning.
Use FINAL casually on large tables.

How to say it in the interview

Give one practical window example such as row number, lag or cumulative sum.
For ClickHouse, say partition pruning and ORDER BY leading columns.