Metrics question

How would you evaluate the full search pipeline and its individual components offline and online?

Short answer

Evaluate individual components offline against labeled or proxy-labeled data, and evaluate the whole pipeline online with business metrics in A/B tests. Always slice results by geo, price, property type, query class, paid/organic mix and latency.

Full breakdown

Component metrics answer whether each block works: attribute extraction F1, geo parsing accuracy, retrieval recall@K, ranker nDCG/MRR, calibration and coverage. These metrics are useful for debugging but do not prove product value.
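A few of these component metrics are simple enough to sketch directly. The block below is a minimal illustration, not a reference implementation: the function names, the use of plain item IDs, the linear-gain form of nDCG and the set-based F1 over (attribute, value) pairs are all assumptions made for the example.

```python
import math

def recall_at_k(retrieved, relevant, k):
    """Fraction of relevant items that appear in the top-k retrieved list."""
    if not relevant:
        return 0.0
    hits = len(set(retrieved[:k]) & set(relevant))
    return hits / len(relevant)

def reciprocal_rank(ranked, relevant):
    """1 / rank of the first relevant item, or 0.0 if none is ranked.
    Averaging this over queries gives MRR."""
    relevant = set(relevant)
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            return 1.0 / rank
    return 0.0

def ndcg_at_k(ranked, gains, k):
    """nDCG@k with linear gains; `gains` maps item id -> graded relevance."""
    def dcg(values):
        return sum(g / math.log2(i + 2) for i, g in enumerate(values))
    actual = dcg([gains.get(item, 0.0) for item in ranked[:k]])
    ideal = dcg(sorted(gains.values(), reverse=True)[:k])
    return actual / ideal if ideal > 0 else 0.0

def f1(predicted, gold):
    """Set-based F1 for extracted (attribute, value) pairs (assumed label format)."""
    predicted, gold = set(predicted), set(gold)
    tp = len(predicted & gold)
    if tp == 0:
        return 0.0
    precision = tp / len(predicted)
    recall = tp / len(gold)
    return 2 * precision * recall / (precision + recall)
```

Each function scores a single query; in practice the per-query values would be averaged over an evaluation set, which is where the debugging value of these metrics comes from.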

The full pipeline needs online evaluation: CTR, listing opens, favorites, calls/messages, qualified leads, bookings/purchases, revenue and retention. Guardrails should include latency, empty results, complaint rate, diversity, long-tail coverage, fairness between paid and organic content and seller-side liquidity.
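As a rough sketch of how an online readout might combine a primary metric with guardrails, the block below runs a two-sided two-proportion z-test on a conversion-style metric such as CTR and then checks "lower is better" guardrails against allowed relative increases. The function names, metric names and thresholds are hypothetical, and a real experiment platform would typically add variance reduction, sequential-testing corrections and per-user aggregation.

```python
import math

def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for the difference in conversion rates (B minus A).
    Returns (z, p_value) using the pooled standard error."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 0.0, 1.0
    z = (p_b - p_a) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail
    return z, p_value

def violated_guardrails(control, treatment, max_relative_increase):
    """Names of guardrails whose treatment value rose by more than the allowed
    relative amount. Assumes every guardrail is 'lower is better' and that
    control values are positive."""
    violated = []
    for name, limit in max_relative_increase.items():
        base = control[name]
        change = (treatment[name] - base) / base
        if change > limit:
            violated.append(name)
    return violated

# Hypothetical readout: CTR improves, but one guardrail regresses.
z, p = two_proportion_z_test(conv_a=1000, n_a=10000, conv_b=1100, n_b=10000)
bad = violated_guardrails(
    control={"p95_latency_ms": 400.0, "empty_result_rate": 0.020},
    treatment={"p95_latency_ms": 410.0, "empty_result_rate": 0.025},
    max_relative_increase={"p95_latency_ms": 0.05, "empty_result_rate": 0.10},
)
```

With these made-up numbers the CTR lift is significant at the 5% level, yet the empty-result rate rises by a quarter, so the guardrail check would flag the launch for review despite the win on the primary metric.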

Slicing matters because real-estate search is heterogeneous. A model can improve Moscow rentals and hurt regional daily rentals, or improve cheap apartments and hurt premium new builds. Good evaluation reports slice by city/region, price band, property type, transaction type, query intent and paid-content exposure.
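The effect of averaging can be shown with a small sliced report. The sketch below assumes experiment data arrives as rows with a slice column, a `variant` of `"control"` or `"treatment"`, and click/impression counts; the row schema, function names and numbers are invented for illustration, and the comparison ignores statistical significance.

```python
from collections import defaultdict

def ctr_by_slice(rows, slice_key):
    """Aggregate clicks and impressions per (slice value, variant) and return CTRs."""
    totals = defaultdict(lambda: [0, 0])
    for row in rows:
        key = (row[slice_key], row["variant"])
        totals[key][0] += row["clicks"]
        totals[key][1] += row["impressions"]
    return {key: clicks / imps for key, (clicks, imps) in totals.items() if imps > 0}

def regressed_slices(rows, slice_key, tolerance=0.0):
    """Slice values where treatment CTR fell below control CTR by more than
    `tolerance` (relative). No significance testing is applied here."""
    ctr = ctr_by_slice(rows, slice_key)
    regressed = []
    for value in sorted({row[slice_key] for row in rows}):
        control = ctr.get((value, "control"))
        treatment = ctr.get((value, "treatment"))
        if control is None or treatment is None or control == 0:
            continue
        if (treatment - control) / control < -tolerance:
            regressed.append(value)
    return regressed

# Hypothetical data: the large slice improves, the small slice regresses.
rows = [
    {"region": "moscow", "variant": "control", "clicks": 900, "impressions": 10000},
    {"region": "moscow", "variant": "treatment", "clicks": 1000, "impressions": 10000},
    {"region": "regional", "variant": "control", "clicks": 100, "impressions": 1000},
    {"region": "regional", "variant": "treatment", "clicks": 80, "impressions": 1000},
]

overall_control = (900 + 100) / 11000    # about 0.091
overall_treatment = (1000 + 80) / 11000  # about 0.098
print(regressed_slices(rows, "region"))  # ['regional']
```

Overall CTR goes up because the large slice dominates the average, while the regional slice drops from 10% to 8%. The same function can be pointed at other slice columns (price band, property type, query intent) as long as the rows carry them.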

Theory

Component metrics debug the system; online marketplace metrics validate the system.

Common mistakes

  • Relying only on offline nDCG and skipping A/B testing.
  • Averaging away important city and price-band regressions.
  • Ignoring latency and empty-result guardrails.

How to answer in an interview

  • Say “components offline, full pipeline online”.
  • Name at least five important slices.