Interview practice: inDrive interview, 2025-10-03

inDrive: Final interview

Work from top to bottom: try it yourself first, then open the walkthrough. If a step involves code, write your solution right here and run the checks on the page.

Steps: 1, Questions: 1, Tasks: 0
1. Case (18 min)

ML System Design

Design a semantic search layer for geo/address suggestions where users can type categories like "cafe" and expect restaurants, POIs and relevant addresses across many languages.

Answer without hints

First talk through your answer out loud or as bullet points.

Write a draft

Formulas, a solution plan, risks and examples.

Compare with the walkthrough

Open the walkthrough only after your own attempt.


Short answer

Combine structured geo data, category normalization, multilingual query understanding, lexical/vector retrieval, deduplication, ranking and latency-aware serving around the existing search engine.

Detailed walkthrough

Start from data: OSM, internal address/POI data, partner data and Overture-style datasets. Normalize names, addresses, categories, languages, coordinates, polygons and provenance. Deduplicate across providers and keep freshness pipelines.
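The normalization and deduplication step can be sketched as follows. This is a minimal illustration, not a production pipeline: the `Poi` record, the provider names, the sample coordinates and the 50 m merge radius are all assumptions made for the example, and real pipelines would typically also compare addresses and categories and would block candidates by a spatial cell instead of comparing all pairs.

```python
import math
import unicodedata
from dataclasses import dataclass, field, replace


@dataclass
class Poi:
    """Hypothetical unified POI record; field names are illustrative."""
    name: str
    lat: float
    lon: float
    provider: str
    sources: list = field(default_factory=list)  # provenance of merged records


def normalize_name(name: str) -> str:
    """Casefold, strip combining marks and collapse whitespace.

    Note: stripping diacritics is lossy for some scripts, so a real
    system may need per-language rules instead of this blanket step.
    """
    decomposed = unicodedata.normalize("NFKD", name.casefold())
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(stripped.split())


def haversine_m(lat1, lon1, lat2, lon2) -> float:
    """Great-circle distance in meters between two WGS84 points."""
    radius_m = 6_371_000.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * radius_m * math.asin(math.sqrt(a))


def deduplicate(records, max_distance_m=50.0):
    """Merge records with the same normalized name within max_distance_m.

    Quadratic in the number of records; fine for a sketch only.
    """
    merged = []
    for rec in records:
        key = normalize_name(rec.name)
        for kept in merged:
            same_name = normalize_name(kept.name) == key
            close = haversine_m(kept.lat, kept.lon, rec.lat, rec.lon) <= max_distance_m
            if same_name and close:
                kept.sources.append(rec.provider)  # keep provenance of the duplicate
                break
        else:
            merged.append(replace(rec, sources=[rec.provider]))
    return merged


# Made-up sample data: two providers describe the same cafe, a third record is far away.
RECORDS = [
    Poi("Café Central", 48.21040, 16.36550, "osm"),
    Poi("CAFE  CENTRAL", 48.21041, 16.36552, "partner"),
    Poi("Cafe Central", 48.30000, 16.40000, "osm"),
]
MERGED = deduplicate(RECORDS)
```

Keeping the `sources` list on each merged record is one simple way to retain provenance, which later feeds the provider-confidence signal in ranking.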

For retrieval, keep lexical search for exact addresses and POI names, then add category and semantic matching. Category understanding can be a multilingual classifier or embedding model mapping queries and synonyms to POI categories. Vector retrieval can complement OpenSearch/BM25, but exact address behavior must not regress.
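A small sketch of this retrieval split, under stated assumptions: the in-memory `INDEX`, the `CATEGORY_SYNONYMS` table and the `RELATED` expansion are hypothetical stand-ins. In a real system the lexical part would be the existing search engine (for example OpenSearch/BM25) and the synonym table would be replaced by a multilingual classifier or embedding model. The point it illustrates is the ordering: lexical hits always come first, so exact address behavior does not regress when semantic matches are added.

```python
# Hypothetical multilingual synonym table standing in for a classifier
# or embedding model that maps queries to POI categories.
CATEGORY_SYNONYMS = {
    "cafe": "cafe", "café": "cafe", "coffee": "cafe", "кафе": "cafe",
    "pharmacy": "pharmacy", "аптека": "pharmacy", "farmacia": "pharmacy",
}

# Hypothetical category expansion: a "cafe" query may also surface restaurants.
RELATED = {"cafe": {"cafe", "restaurant"}}

# Toy index with made-up entries.
INDEX = [
    {"id": 1, "name": "Cafe Street 12", "kind": "address", "category": None},
    {"id": 2, "name": "Blue Bottle", "kind": "poi", "category": "cafe"},
    {"id": 3, "name": "Green Pharmacy", "kind": "poi", "category": "pharmacy"},
    {"id": 4, "name": "Trattoria Roma", "kind": "poi", "category": "restaurant"},
]


def lexical_search(query, index):
    """Prefix match: every query token must prefix some token of the name."""
    q_tokens = query.casefold().split()
    if not q_tokens:
        return []
    hits = []
    for doc in index:
        name_tokens = doc["name"].casefold().split()
        if all(any(nt.startswith(qt) for nt in name_tokens) for qt in q_tokens):
            hits.append(doc)
    return hits


def category_search(query, index):
    """Map the query to a category, then return POIs in related categories."""
    category = CATEGORY_SYNONYMS.get(query.casefold().strip())
    if category is None:
        return []
    wanted = RELATED.get(category, {category})
    return [doc for doc in index if doc["category"] in wanted]


def suggest(query, index=INDEX, limit=5):
    """Lexical hits first, then category hits, without duplicates."""
    seen, results = set(), []
    for doc in lexical_search(query, index) + category_search(query, index):
        if doc["id"] not in seen:
            seen.add(doc["id"])
            results.append(doc)
    return results[:limit]
```

With this toy data, `suggest("кафе")` finds no lexical match but still returns the cafe and restaurant POIs through the category path, while `suggest("cafe")` keeps the address "Cafe Street 12" ahead of them.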

Ranking should combine distance, textual match, category confidence, popularity, availability, viewport/city context, safety/geofence constraints and provider confidence. Serving must respect the autocomplete latency budget, so heavy ML usually runs as offline enrichment, with at most a cached, lightweight model call online.
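One way to picture the ranking step is a weighted sum over a few of these signals, with safety/geofence constraints applied as a hard filter before scoring. This is only a sketch: the `Candidate` fields, the weights and the 2 km distance scale are assumptions chosen for illustration, and a production ranker would more likely be a learned model over a richer feature set.

```python
import math
from dataclasses import dataclass


@dataclass
class Candidate:
    """Hypothetical ranking features, each score assumed to lie in [0, 1]."""
    name: str
    distance_m: float
    text_match: float
    category_confidence: float
    popularity: float
    provider_confidence: float
    allowed: bool = True  # False if a safety/geofence rule excludes the place


# Illustrative hand-set weights; in practice these would be tuned or learned.
WEIGHTS = {
    "proximity": 0.30,
    "text_match": 0.35,
    "category_confidence": 0.20,
    "popularity": 0.10,
    "provider_confidence": 0.05,
}


def score(c: Candidate, distance_scale_m: float = 2000.0) -> float:
    """Weighted sum of features; distance decays exponentially with scale."""
    proximity = math.exp(-c.distance_m / distance_scale_m)
    return (
        WEIGHTS["proximity"] * proximity
        + WEIGHTS["text_match"] * c.text_match
        + WEIGHTS["category_confidence"] * c.category_confidence
        + WEIGHTS["popularity"] * c.popularity
        + WEIGHTS["provider_confidence"] * c.provider_confidence
    )


def rank(candidates):
    """Drop disallowed candidates, then sort the rest by descending score."""
    eligible = [c for c in candidates if c.allowed]
    return sorted(eligible, key=score, reverse=True)


# Made-up candidates for a category-style query typed near the first place.
CANDIDATES = [
    Candidate("Far Exact Match", 8000.0, 1.0, 0.0, 0.3, 0.9),
    Candidate("Near Cafe", 200.0, 0.2, 0.9, 0.5, 0.9),
    Candidate("Blocked Place", 50.0, 1.0, 1.0, 1.0, 1.0, allowed=False),
]
RANKED = rank(CANDIDATES)
```

Because every input here is a precomputed number, scoring stays cheap at request time; the expensive parts (category confidence, popularity) can be produced offline and stored with the document, which is consistent with the latency constraint described above.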

Common mistakes

  • Replacing exact address search with embeddings alone.
  • Ignoring multilingual synonyms and cross-provider deduplication.
  • Adding an online Python model call to autocomplete without a latency budget.

How to say it in the interview

  • Separate exact address search from semantic category search.
  • Mention data normalization and deduplication before model choice.