Question
Describe how you would train and validate a transformer-style reranking model for marketplace recommendations.
Answer it yourself
First formulate your answer as you would in an interview, then open the walkthrough and grade yourself.
Short answer
Build candidate lists, derive labels from logged interactions, sample negatives carefully, train a sequence or cross-feature ranking model, validate offline with ranking metrics, and ship only through guarded online experiments.
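Full walkthrough
A reranker operates after candidate generation: it reorders a short list that an earlier stage has already retrieved. Start by defining what it ranks: item-item recommendations, user-item candidates, search results or session-based candidates. Training data usually comes from logged impressions, clicks, orders, favorites and watch events. Split it by time, so that validation and test interactions occur strictly after the training window; a random split lets future behavior leak into training.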
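The time-based split can be illustrated with a minimal sketch. The event schema (`user`, `item`, `ts`, `clicked`) and the cutoff dates are assumptions made for this example, not a real marketplace log format.

```python
from datetime import datetime

def time_based_split(events, train_end, valid_end):
    """Split events by timestamp: train < train_end <= valid < valid_end <= test."""
    train, valid, test = [], [], []
    for event in events:
        if event["ts"] < train_end:
            train.append(event)
        elif event["ts"] < valid_end:
            valid.append(event)
        else:
            test.append(event)
    return train, valid, test

# Hypothetical logged impressions; the field names are illustrative only.
events = [
    {"user": "u1", "item": "i1", "ts": datetime(2024, 1, 5), "clicked": 1},
    {"user": "u1", "item": "i2", "ts": datetime(2024, 1, 20), "clicked": 0},
    {"user": "u2", "item": "i3", "ts": datetime(2024, 2, 3), "clicked": 1},
    {"user": "u2", "item": "i1", "ts": datetime(2024, 2, 18), "clicked": 0},
    {"user": "u3", "item": "i4", "ts": datetime(2024, 3, 2), "clicked": 1},
]

train, valid, test = time_based_split(
    events,
    train_end=datetime(2024, 2, 1),
    valid_end=datetime(2024, 3, 1),
)
```

Any feature computed from history (for example, item popularity) should likewise use only events earlier than the example's own timestamp; otherwise the split is clean but the features still leak.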
Negative sampling matters. Random negatives are cheap to generate but often too easy to separate from positives. Displayed-but-not-clicked items, hard negatives from the retrieval stage and category-aware negatives bring the task closer to what the reranker sees in production. The model can be a sequence model such as SASRec or BERT4Rec, a two-tower model augmented with cross features, or another transformer-style reranker, depending on the latency budget.
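A minimal sketch of mixing the negative types described above for a single request. The function name, the argument names and the per-type counts are hypothetical; a real pipeline would tune the mix and would likely sample category-aware negatives as well.

```python
import random

def sample_negatives(clicked, displayed, retrieved, catalog,
                     n_hard=2, n_random=1, seed=0):
    """Return negatives for one request, grouped by how they were obtained."""
    rng = random.Random(seed)
    positives = set(clicked)

    # 1. Displayed-but-not-clicked: the user saw these and skipped them.
    shown = [item for item in displayed if item not in positives]
    exclude = positives | set(shown)

    # 2. Hard negatives: retrieved by the candidate generator but never shown.
    hard_pool = [item for item in retrieved if item not in exclude]
    hard = rng.sample(hard_pool, min(n_hard, len(hard_pool)))
    exclude |= set(hard)

    # 3. Random negatives: cheap and easy, kept as a small share of the mix.
    random_pool = [item for item in catalog if item not in exclude]
    rand = rng.sample(random_pool, min(n_random, len(random_pool)))

    return {"shown": shown, "hard": hard, "random": rand}

# Hypothetical request: 20-item catalog, 6 retrieved, 3 displayed, 1 clicked.
catalog = [f"i{k}" for k in range(1, 21)]
negatives = sample_negatives(
    clicked=["i2"],
    displayed=["i1", "i2", "i3"],
    retrieved=["i1", "i2", "i3", "i4", "i5", "i6"],
    catalog=catalog,
)
```

Keeping the three groups separate makes it possible to weight them differently in the loss or to report offline metrics per negative type.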
Offline metrics should reflect how the ranked list is consumed: NDCG@K, Recall@K, MRR, MAP or task-specific conversion proxies. Offline wins are not enough, because the logs are biased by previous rankers and by UI exposure: an item that was never shown cannot have been clicked. The final decision needs online A/B metrics such as CTR, conversion, retention or revenue, together with marketplace guardrails.
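For reference, a sketch of three of these metrics for one ranked list. It assumes `relevances` holds the label of each candidate in the order the model ranked it, uses linear gain for NDCG (the 2^rel - 1 variant is also common), and computes the ideal ordering from the candidate list itself.

```python
import math

def dcg_at_k(relevances, k):
    """Discounted cumulative gain with linear gain and log2 position discount."""
    return sum(rel / math.log2(pos + 2) for pos, rel in enumerate(relevances[:k]))

def ndcg_at_k(relevances, k):
    """DCG normalized by the DCG of the ideal ordering of the same candidates."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0

def reciprocal_rank(relevances):
    """1 / rank of the first relevant item, or 0.0 if none is relevant."""
    for rank, rel in enumerate(relevances, start=1):
        if rel > 0:
            return 1.0 / rank
    return 0.0

def recall_at_k(relevances, k):
    """Share of the list's relevant items that appear in the top k."""
    total = sum(1 for rel in relevances if rel > 0)
    if total == 0:
        return 0.0
    return sum(1 for rel in relevances[:k] if rel > 0) / total

# One list where the only clicked item was ranked third.
ranked_labels = [0, 0, 1]
ndcg = ndcg_at_k(ranked_labels, k=3)
rr = reciprocal_rank(ranked_labels)
```

MRR is the mean of `reciprocal_rank` over all evaluated lists; NDCG@K and Recall@K are averaged across lists in the same way.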
Theory
Reranking quality depends as much on logged-data construction and evaluation design as on the model architecture.
Common mistakes
- Train on future interactions by accident.
- Use random negatives only and get a misleadingly easy offline task.
- Optimize NDCG offline but ignore online business guardrails.
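One way to make the guardrail point concrete is a launch check that requires a significant lift on the primary metric and blocks on any significant guardrail drop. This is a simplified sketch: the two-proportion z-test, the `alpha` threshold, the metric names and the counts are all illustrative assumptions, and a real experiment platform would also handle multiple testing, sequential peeking and non-binary metrics such as revenue.

```python
import math

def two_proportion_test(control_successes, control_n, treat_successes, treat_n):
    """Return (absolute lift, two-sided p-value) from a pooled z-test."""
    p_control = control_successes / control_n
    p_treat = treat_successes / treat_n
    pooled = (control_successes + treat_successes) / (control_n + treat_n)
    se = math.sqrt(pooled * (1 - pooled) * (1 / control_n + 1 / treat_n))
    z = (p_treat - p_control) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail
    return p_treat - p_control, p_value

def should_launch(primary, guardrails, alpha=0.05):
    """Ship only on a significant primary lift with no significant guardrail drop."""
    lift, p_value = two_proportion_test(*primary)
    if lift <= 0 or p_value >= alpha:
        return False
    for counts in guardrails.values():
        g_lift, g_p = two_proportion_test(*counts)
        if g_lift < 0 and g_p < alpha:
            return False
    return True

# Hypothetical counts: (control successes, control n, treatment successes, treatment n).
ctr = (1000, 20000, 1150, 20000)
ok = should_launch(ctr, {"conversion": (400, 20000, 395, 20000)})
blocked = should_launch(ctr, {"conversion": (400, 20000, 300, 20000)})
```

In the second call the CTR lift is unchanged, but the conversion guardrail drops significantly, so the launch is blocked.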
How to answer in an interview
- Say what the candidate generator provides and what the reranker changes.
- Name both ranking metrics and the online product metric used for launch.