Interview practice · Screening · Focus / Teramind · 2025-09-02

Focus / Teramind: Screening

Work from top to bottom: attempt each question yourself first, then read the breakdown.

Steps: 2 · Questions: 2 · Tasks: 0
1. Question (15 min)

ML System Design

A video analytics product watches kitchen staff and must check whether people follow location-specific safety protocols. The system needs kitchen rules, time of day and staff context. How would you design the approach?

Answer without hints

First talk through your answer aloud or in bullet points.

Write a draft

Formulas, solution plan, risks and examples.

Compare with the breakdown

Read the breakdown only after your own attempt.

Short answer

Model it as a rules-aware video pipeline: detect people/actions/objects, attach time/staff/kitchen policy context, evaluate protocol violations with a deterministic rules layer or constrained LLM, and use human review plus monitoring for edge cases.

Detailed breakdown

Start by separating perception from policy. The perception layer extracts structured events from video: people, uniforms, handwashing, knives, hats, restricted zones, timestamps and object interactions. Use lightweight CV models for frequent online checks and stronger VLMs or humans for bootstrapping labels and hard cases.

The policy layer should not be hidden inside a generic prompt. Store kitchen-specific rules, shift schedules and staff permissions in a structured rules engine or retrieval-backed policy store. Convert perception events into facts, then evaluate rules such as “staff member X must wear item Y in zone Z during shift T” or “knife must be washed after raw-food contact.”
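
The fact-then-rule evaluation described above can be sketched as follows. This is a minimal illustration, not a production rules engine: the `Fact` and `Rule` schemas, the rule ids and the `KITCHEN_A_RULES` list are all hypothetical, and the sketch assumes shifts that do not cross midnight.

```python
from dataclasses import dataclass
from datetime import time


@dataclass(frozen=True)
class Fact:
    """One structured event from the perception layer (hypothetical schema)."""
    staff_id: str
    zone: str
    at: time
    items_worn: frozenset


@dataclass(frozen=True)
class Rule:
    """Staff in `zone` between `start` and `end` must wear `required_item`."""
    rule_id: str
    zone: str
    start: time
    end: time
    required_item: str


def violations(fact, rules):
    """Return the ids of the rules that this fact breaks."""
    broken = []
    for rule in rules:
        applies = fact.zone == rule.zone and rule.start <= fact.at <= rule.end
        if applies and rule.required_item not in fact.items_worn:
            broken.append(rule.rule_id)
    return broken


# Assumption: per-kitchen rules are data loaded from a policy store,
# not constants baked into the model or a prompt.
KITCHEN_A_RULES = [
    Rule("hat-prep", "prep", time(6, 0), time(14, 0), "hat"),
    Rule("gloves-prep", "prep", time(6, 0), time(14, 0), "gloves"),
]

fact = Fact("staff-17", "prep", time(9, 30), frozenset({"hat"}))
print(violations(fact, KITCHEN_A_RULES))  # ['gloves-prep']
```

Because every decision maps back to a rule id and the fact that triggered it, the output is traceable, which is harder to guarantee when the policy lives inside a free-form prompt.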

LLMs can help map natural-language policy documents to structured rules, summarize incidents and assist review, but the production decision path needs traceability. For deployment, sample frames/clips intelligently, keep latency and privacy constraints explicit, log evidence clips for violations, and measure precision/recall by rule type, camera angle, lighting and site.
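
Measuring precision and recall per slice, as suggested above, might look like the sketch below. The record fields (`predicted`, `actual`, `rule`) are assumed names for illustration; the same function could be keyed by camera angle, lighting or site.

```python
from collections import defaultdict


def precision_recall_by_slice(records, key):
    """Compute precision and recall separately for each value of `key`.

    `records` is a list of dicts with boolean `predicted` and `actual`
    fields plus arbitrary slice fields (hypothetical schema).
    """
    counts = defaultdict(lambda: {"tp": 0, "fp": 0, "fn": 0})
    for r in records:
        c = counts[r[key]]
        if r["predicted"] and r["actual"]:
            c["tp"] += 1
        elif r["predicted"]:
            c["fp"] += 1
        elif r["actual"]:
            c["fn"] += 1

    report = {}
    for name, c in counts.items():
        flagged = c["tp"] + c["fp"]
        real = c["tp"] + c["fn"]
        report[name] = {
            # None marks a slice where the metric is undefined.
            "precision": c["tp"] / flagged if flagged else None,
            "recall": c["tp"] / real if real else None,
        }
    return report


records = [
    {"rule": "hat", "predicted": True, "actual": True},
    {"rule": "hat", "predicted": True, "actual": False},
    {"rule": "handwash", "predicted": False, "actual": True},
    {"rule": "handwash", "predicted": True, "actual": True},
]
report = precision_recall_by_slice(records, "rule")
print(report["hat"])  # {'precision': 0.5, 'recall': 1.0}
```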

Common mistakes

  • Running a large VLM on every frame without a compute plan.
  • Treating kitchen-specific rules as hard-coded global assumptions.
  • Ignoring evidence logging and human appeal workflows.
  • Skipping identity, shift and privacy constraints.

How to say it in the interview

  • Say “perception layer” and “policy layer” explicitly.
  • Mention weak labels from VLMs/humans but a cheaper online model for production.
2. Question (12 min)

Metrics question

A retail video analytics model should flag suspicious behavior, but humans do not fully agree on what “suspicious” means. How would you define success and evaluate whether the system is doing a good job?

Short answer

First turn “suspicious” into operational categories and severity levels, measure human agreement, build a labeled review set with adjudication, then optimize risk-calibrated precision/recall and downstream business outcomes.

Detailed breakdown

If humans disagree, the first task is not model selection; it is label design. Define categories of suspicious behavior, severity, required action and non-goals. Measure inter-annotator agreement and keep a third-party or senior-review adjudication process for hard examples.
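
One common way to quantify inter-annotator agreement for two annotators is Cohen's kappa, which corrects raw agreement for agreement expected by chance. The sketch below is a plain-Python illustration; the label names are made up, and the source does not prescribe a specific agreement statistic.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Chance-corrected agreement between two annotators on the same clips."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two equal-length, non-empty label lists")
    n = len(labels_a)
    # Fraction of clips where both annotators chose the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Agreement expected if each annotator labeled at random
    # according to their own label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical labels for six clips from two annotators.
annotator_1 = ["theft", "none", "none", "loiter", "none", "theft"]
annotator_2 = ["theft", "none", "loiter", "loiter", "none", "none"]
kappa = cohens_kappa(annotator_1, annotator_2)
print(round(kappa, 3))  # 0.478
```

Here the annotators agree on 4 of 6 clips, but chance alone would produce agreement on 13/36 of them, so kappa is 11/23, noticeably lower than the raw 67% agreement. A low value on a category is a signal to tighten its definition before training on it.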

Evaluation should mix ML metrics and product metrics. Offline, use a stratified video set across stores, camera positions, time of day and traffic level. Track precision, recall, false alarms per hour, missed high-severity incidents, calibration and performance by subgroup or environment. If labels remain ambiguous, report soft labels or agreement-weighted metrics rather than pretending there is one perfect ground truth.
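
As a rough sketch of agreement-weighted metrics, each clip can carry a soft label equal to the fraction of annotators who marked it suspicious, and precision and recall can be computed over that soft mass instead of a hard 0/1 truth. This is one possible formulation under that assumption, not the only one; the tuple layout is hypothetical.

```python
def soft_precision_recall(clips):
    """Agreement-weighted precision and recall.

    `clips` is a list of (flagged, positive_votes, total_votes) tuples,
    where the soft label of a clip is positive_votes / total_votes.
    """
    flagged_mass = 0.0  # soft positives among clips the model flagged
    total_mass = 0.0    # soft positives across all clips
    flagged_count = 0
    for flagged, positive_votes, total_votes in clips:
        soft_label = positive_votes / total_votes
        total_mass += soft_label
        if flagged:
            flagged_mass += soft_label
            flagged_count += 1
    precision = flagged_mass / flagged_count if flagged_count else None
    recall = flagged_mass / total_mass if total_mass else None
    return precision, recall


def false_alarms_per_hour(false_alarm_count, video_hours):
    """Operational load metric: false alerts per hour of monitored video."""
    return false_alarm_count / video_hours


# Four clips, three annotators each (hypothetical votes).
clips = [(True, 3, 3), (True, 1, 3), (False, 2, 3), (False, 0, 3)]
precision, recall = soft_precision_recall(clips)
print(precision, recall)           # both roughly 0.67
print(false_alarms_per_hour(12, 48))  # 0.25
```

A flagged clip that only one of three annotators considered suspicious counts as one third of a true positive, so contested clips neither fully reward nor fully penalize the model.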

Online, measure analyst workload, alert acceptance rate, time to incident review, customer/store outcomes and appeal rate. Thresholds should be risk-based: high-confidence severe events can alert immediately; uncertain low-severity events can go to passive review or sampling. Monitoring must watch drift by store layout, seasonality, camera changes and policy changes.
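
Risk-based thresholds can be expressed as a small routing function. The threshold values, severity names and route names below are placeholders chosen for illustration; in practice they would be tuned per severity and per store from reviewer capacity and the measured cost of misses.

```python
# Hypothetical thresholds: more severe events alert at lower model confidence.
ALERT_THRESHOLDS = {"high": 0.6, "medium": 0.8, "low": 0.95}
REVIEW_FLOOR = 0.3  # below this score the event is not queued at all


def route(severity, score):
    """Decide what happens to a detected event given its severity and score."""
    if score >= ALERT_THRESHOLDS[severity]:
        return "alert"           # notify an analyst immediately
    if score >= REVIEW_FLOOR:
        return "passive_review"  # queue for batch review or sampling
    return "drop"


print(route("high", 0.7))  # alert
print(route("low", 0.7))   # passive_review
print(route("low", 0.1))   # drop
```

The same score of 0.7 produces an immediate alert for a high-severity event but only passive review for a low-severity one, which is the behavior a single global threshold cannot give.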

Common mistakes

  • Optimizing accuracy on a noisy label without defining the action.
  • Ignoring annotator disagreement.
  • Using one global threshold for all severities and stores.
  • Forgetting false alarms per hour and reviewer workload.

How to say it in the interview

  • Start by defining the label and action.
  • Bring up inter-annotator agreement and adjudication.