Назад к подготовке

ВопросСложнаяrankingML System Design из разбора после собеседования · CIAN CIAN

ML System Design

How would you train the ranker for real-estate search, choose negatives, and blend paid monetized listings without destroying relevance?

Ответить самому

Сначала сформулируйте ответ как на собеседовании, затем откройте разбор и оцените себя.

Загрузка

Короткий ответ

Train on exposed query-listing pairs using graded actions, sampled and impression-based negatives, and position-bias correction. Blend monetization with relevance through expected value and hard relevance/quality guardrails.

Полный разбор

The ranker row should represent a listing in a query and user context. Labels can be graded: impression with no action, open, favorite, message, call, qualified lead, booking or purchase. Exposure logs are important because unshown listings are not clean negatives. Negatives can include random listings, shown-but-not-clicked impressions and hard negatives from the same city/price/type that were skipped in favor of clicked listings. Position bias must be handled with features, debiasing, counterfactual logging or cautious interpretation. Pairwise/listwise losses often fit ranking better than pure pointwise classification, but production simplicity matters. Monetization should not be a raw score bonus. Better framing is expected marketplace value: relevance probability times business value, with constraints on relevance, geography, quality, paid quota, user harm and seller fairness. Keep organic and paid candidates eligible only when they satisfy the query; then blend with calibrated scores and A/B-test revenue, lead quality and user guardrails.

Теория

Ranking labels are biased by exposure; monetization must be constrained by relevance to preserve marketplace trust.

Типичные ошибки

Treat every non-click as an equally strong negative.
Add bid directly to relevance score without calibration.
Optimize revenue while ignoring user and seller-side guardrails.

Как отвечать на собеседовании

Use graded labels and exposed impressions.
Say “expected value with relevance guardrails” for paid listings.