Metrics question
How would you evaluate the full search pipeline and its individual components offline and online?
Answer it yourself
First formulate your answer as you would in an interview, then open the breakdown and assess yourself.
Short answer
Evaluate individual components offline with proxy labels, and the whole pipeline online with business A/B metrics. Always slice results by geo, price, property type, query class and paid/organic mix, and track latency as a guardrail.
Full breakdown
Component metrics answer whether each block works: attribute extraction F1, geo parsing accuracy, retrieval recall@K, ranker nDCG/MRR, calibration and coverage. These metrics are useful for debugging but do not prove product value.
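As an illustration of the retrieval and ranking metrics named above, here is a minimal sketch in plain Python. The function names and the choice of linear gain in DCG are assumptions made for this example, not something the card prescribes; a production setup would more likely use an evaluation library.

```python
import math


def recall_at_k(ranked_ids, relevant_ids, k):
    """Share of all relevant items that appear in the top-k results."""
    if not relevant_ids:
        return 0.0
    hits = sum(1 for item in ranked_ids[:k] if item in relevant_ids)
    return hits / len(relevant_ids)


def reciprocal_rank(ranked_ids, relevant_ids):
    """1 / position of the first relevant item, or 0.0 if none is returned."""
    for position, item in enumerate(ranked_ids, start=1):
        if item in relevant_ids:
            return 1.0 / position
    return 0.0


def ndcg_at_k(ranked_ids, gains, k):
    """nDCG@k with linear gain; `gains` maps item id -> graded relevance."""
    def dcg(values):
        # Position i (0-based) is discounted by log2(i + 2).
        return sum(g / math.log2(i + 2) for i, g in enumerate(values))

    actual = dcg([gains.get(item, 0.0) for item in ranked_ids[:k]])
    ideal = dcg(sorted(gains.values(), reverse=True)[:k])
    return actual / ideal if ideal > 0 else 0.0


# Hypothetical single query: listing "b" is highly relevant, "d" marginally.
ranked = ["a", "b", "c", "d"]
gains = {"b": 3.0, "d": 1.0}
print(recall_at_k(ranked, set(gains), k=2))
print(reciprocal_rank(ranked, set(gains)))
print(ndcg_at_k(ranked, gains, k=3))
```

MRR is the mean of `reciprocal_rank` over all evaluation queries; the same averaging applies to recall@K and nDCG.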
The full pipeline needs online evaluation: CTR, listing opens, favorites, calls/messages, qualified leads, bookings/purchases, revenue and retention. Guardrails should include latency, empty results, complaint rate, diversity, long-tail coverage, fairness between paid and organic content and seller-side liquidity.
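One way to read an A/B result for a conversion-style metric such as listing opens or leads per session is a pooled two-proportion z-test combined with an explicit guardrail check. The sketch below assumes this particular test and uses made-up counts, metric names and limits; the card itself does not specify a statistical method.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Pooled two-proportion z-test; returns (z, two-sided p-value)."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 0.0, 1.0
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal: 2 * (1 - Phi(|z|)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


def breached_guardrails(deltas, limits):
    """Names of guardrails whose increase (treatment - control) exceeds its limit.

    Assumes every guardrail is "higher is worse" (latency, empty-result rate).
    """
    return sorted(
        name for name, delta in deltas.items() if delta > limits.get(name, 0.0)
    )


# Hypothetical experiment: control (a) vs treatment (b).
z, p = two_proportion_z_test(conv_a=100, n_a=1000, conv_b=150, n_b=1000)
breached = breached_guardrails(
    deltas={"p95_latency_ms": 35.0, "empty_result_rate": 0.001},
    limits={"p95_latency_ms": 20.0, "empty_result_rate": 0.005},
)
print(z, p, breached)
```

In this made-up case the primary metric improves significantly, but the latency guardrail is breached, so the change should not ship as is.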
Slicing matters because real-estate search is heterogeneous. A model can improve Moscow rentals and hurt regional daily rentals, or improve cheap apartments and hurt premium new builds. Good evaluation reports slice by city/region, price band, property type, transaction type, query intent and paid-content exposure.
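The effect described above, where an overall gain hides a regression in one slice, can be shown with a small per-slice report. This is a sketch under assumptions: the row layout, the field names (`slice`, `variant`, `ndcg`) and the numbers are all hypothetical.

```python
from collections import defaultdict


def slice_deltas(rows, slice_key, metric):
    """Mean(metric) in treatment minus mean(metric) in control, per slice value."""
    # For each slice value keep [sum, count] per variant.
    sums = defaultdict(lambda: {"control": [0.0, 0], "treatment": [0.0, 0]})
    for row in rows:
        cell = sums[row[slice_key]][row["variant"]]
        cell[0] += row[metric]
        cell[1] += 1

    report = {}
    for slice_value, cells in sums.items():
        (c_sum, c_n), (t_sum, t_n) = cells["control"], cells["treatment"]
        if c_n == 0 or t_n == 0:
            continue  # Skip slices that lack data in one of the variants.
        report[slice_value] = t_sum / t_n - c_sum / c_n
    return report


# Hypothetical per-query results: Moscow rentals dominate the traffic.
rows = (
    [{"slice": "moscow_rent", "variant": "control", "ndcg": 0.60}] * 3
    + [{"slice": "moscow_rent", "variant": "treatment", "ndcg": 0.70}] * 3
    + [{"slice": "regional_daily", "variant": "control", "ndcg": 0.50}]
    + [{"slice": "regional_daily", "variant": "treatment", "ndcg": 0.40}]
)
for name, delta in sorted(slice_deltas(rows, "slice", "ndcg").items()):
    print(name, round(delta, 3))
```

Averaged over all rows the treatment looks better (0.625 vs 0.575), while the regional daily-rental slice drops by 0.10, which is exactly the regression a single aggregate number would hide.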
Theory
Component metrics debug the system; online marketplace metrics validate the system.
Common mistakes
- Relying only on offline nDCG and skipping A/B testing.
- Averaging away important city and price-band regressions.
- Ignoring latency and empty-result guardrails.
How to answer in an interview
- Say “components offline, full pipeline online”.
- Name at least five important slices.