ML System Design
If you train on feedback from the previous recommender, what biases can appear and how can you reduce them?
Short answer
Logged feedback is biased by what the old model chose to expose, by item popularity and by display position. Mitigate with exploration traffic, exposure-aware labels, propensity weighting, randomized buckets and slice monitoring.
Full breakdown
A recommender only observes feedback for items it exposed. That creates selection bias. Popular items get more exposure, high positions get more clicks, and items the old model never showed may look bad only because nobody saw them.
Mitigations start with logging. Store impressions, positions, candidate source, user/item context and whether the item was actually visible. Treat "no click" differently when there was no exposure.
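As a rough illustration of exposure-aware labels, the sketch below turns logged impressions into training rows. The `Impression` record, its field names and the rule "drop rows that were never visible" are assumptions made for this example, not a prescribed schema.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Impression:
    # Hypothetical log record; field names are illustrative only.
    user_id: str
    item_id: str
    position: int          # rank in the rendered list, 0 = top
    candidate_source: str  # which retriever or model produced the item
    was_visible: bool      # did the item actually reach the viewport?
    clicked: bool


def exposure_aware_label(imp: Impression) -> Optional[int]:
    """Return 1 for a click, 0 for a seen-but-unclicked item, None if unseen."""
    if imp.clicked:
        return 1
    if imp.was_visible:
        return 0
    # No exposure: this is missing data, not a negative.
    return None


def build_training_rows(log: List[Impression]) -> List[Tuple[Impression, int]]:
    """Keep only rows whose label is defined under the rule above."""
    rows = []
    for imp in log:
        label = exposure_aware_label(imp)
        if label is not None:
            rows.append((imp, label))
    return rows
```

Keeping `position` and `candidate_source` on every row is what later makes position-bias correction and per-source analysis possible.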
Add controlled exploration: small random or diversified traffic, epsilon-greedy buckets, interleaving, or candidate-source mixing. Use propensity weighting or debiasing methods when estimating offline value from logged data. Monitor popularity, novelty, category and cold-start slices so the model does not collapse into safe popular recommendations.
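One way to combine exploration with propensity weighting is sketched below. It assumes a single recommendation slot, an epsilon-greedy logging policy that explores uniformly over `n_items`, and a clipped, self-normalized inverse propensity estimator; the function names and the `max_weight` default are hypothetical choices for this example.

```python
from typing import Sequence


def epsilon_greedy_propensity(is_greedy_choice: bool, epsilon: float, n_items: int) -> float:
    """Probability that an epsilon-greedy logger showed this item (single slot)."""
    explore = epsilon / n_items  # uniform exploration mass per item
    if is_greedy_choice:
        return (1.0 - epsilon) + explore
    return explore


def snips_value(
    rewards: Sequence[float],
    logging_propensities: Sequence[float],
    target_probs: Sequence[float],
    max_weight: float = 10.0,
) -> float:
    """Self-normalized, clipped inverse propensity estimate of a new policy's value."""
    weights = []
    for p_log, p_new in zip(logging_propensities, target_probs):
        if p_log <= 0.0:
            raise ValueError("logging propensity must be positive")
        # Clipping trades a little bias for lower variance.
        weights.append(min(p_new / p_log, max_weight))
    total = sum(weights)
    if total == 0.0:
        return 0.0
    return sum(w * r for w, r in zip(weights, rewards)) / total
```

The estimate is only meaningful if the logging policy gave nonzero probability to everything the new policy might show, which is the practical reason to keep a small exploration bucket running.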
For item-to-item recommendations, also separate global item similarity from personalized feedback. If the product wants the same similar items for all users, aggregate feedback at the level of the (anchor item, recommended item) pair; if it wants personalization, introduce user context explicitly.
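For the non-personalized case, pair-level aggregation could look like the sketch below. It assumes events have already been filtered to exposed impressions; the additive smoothing prior is an extra assumption added here so that rarely shown pairs do not get extreme rates.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

Pair = Tuple[str, str]  # (anchor_item, recommended_item)


def aggregate_pair_ctr(
    events: Iterable[Tuple[str, str, bool]],
    prior_clicks: float = 1.0,
    prior_impressions: float = 20.0,
) -> Dict[Pair, float]:
    """Smoothed click rate per (anchor, recommended) pair, pooled over all users.

    `events` yields (anchor_item, recommended_item, clicked) for impressions
    that were actually exposed. The prior values are illustrative defaults.
    """
    clicks: Dict[Pair, float] = defaultdict(float)
    impressions: Dict[Pair, int] = defaultdict(int)
    for anchor, recommended, clicked in events:
        key = (anchor, recommended)
        impressions[key] += 1
        if clicked:
            clicks[key] += 1.0
    return {
        key: (clicks[key] + prior_clicks) / (impressions[key] + prior_impressions)
        for key in impressions
    }
```

A personalized variant would add user or context features to the key or to a model instead of pooling across users.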
Theory
Logged recommender data is not an unbiased sample of user preferences; it is a sample of exposed model decisions.
Common mistakes
- Treating every unclicked item as a true negative without exposure context.
- Ignoring position bias.
- Adding exploration without guardrail metrics for user impact.
How to answer in an interview
- Say "previous model exposure bias" directly.
- Bring up impression logging before fancy debiasing methods.