Question · Hard · ml-system-design · ML System Design at screening · Focus / Teramind

ML System Design

A video analytics product watches kitchen staff and must check whether people follow location-specific safety protocols. The system needs kitchen rules, time of day and staff context. How would you design the approach?



Short answer

Model it as a rules-aware video pipeline: detect people/actions/objects, attach time/staff/kitchen policy context, evaluate protocol violations with a deterministic rules layer or constrained LLM, and use human review plus monitoring for edge cases.

Full breakdown

Start by separating perception from policy. The perception layer extracts structured events from video: people, uniforms, handwashing, knives, hats, restricted zones, timestamps and object interactions. Use lightweight CV models for frequent online checks and stronger VLMs or humans for bootstrapping labels and hard cases.

The policy layer should not be hidden inside a generic prompt. Store kitchen-specific rules, shift schedules and staff permissions in a structured rules engine or retrieval-backed policy store. Convert perception events into facts, then evaluate rules such as “staff member X must wear item Y in zone Z during shift T” or “knife must be washed after raw-food contact.” LLMs can help map natural-language policy documents to structured rules, summarize incidents and assist review, but the production decision path needs traceability.

For deployment, sample frames/clips intelligently, keep latency and privacy constraints explicit, log evidence clips for violations, and measure precision/recall by rule type, camera angle, lighting and site.
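Measuring precision/recall per slice, as suggested above, can be sketched in a few lines. This is an illustrative sketch only: the record layout (`predicted`, `actual`, plus slice fields such as `rule` or `camera`) and the function name `precision_recall_by_slice` are assumptions, not part of any specific evaluation tool.

```python
from collections import defaultdict


def precision_recall_by_slice(records, slice_key):
    """Compute precision and recall separately for each value of slice_key.

    Each record is a dict (hypothetical layout) with:
      - "predicted": True if the system flagged a violation
      - "actual":    True if a human reviewer confirmed a violation
      - slice fields such as "rule", "camera", "lighting" or "site"
    """
    counts = defaultdict(lambda: {"tp": 0, "fp": 0, "fn": 0})
    for record in records:
        c = counts[record[slice_key]]
        if record["predicted"] and record["actual"]:
            c["tp"] += 1
        elif record["predicted"]:
            c["fp"] += 1
        elif record["actual"]:
            c["fn"] += 1
        # True negatives affect neither precision nor recall.

    report = {}
    for key, c in counts.items():
        flagged = c["tp"] + c["fp"]
        real = c["tp"] + c["fn"]
        report[key] = {
            # None marks an undefined metric (nothing flagged / nothing real).
            "precision": c["tp"] / flagged if flagged else None,
            "recall": c["tp"] / real if real else None,
        }
    return report


records = [
    {"rule": "hat", "predicted": True, "actual": True},
    {"rule": "hat", "predicted": True, "actual": False},
    {"rule": "hat", "predicted": False, "actual": True},
    {"rule": "knife", "predicted": True, "actual": True},
]
report = precision_recall_by_slice(records, "rule")
```

Running the same function with `slice_key` set to a camera, lighting or site field shows whether one aggregate number hides a weak slice, for example a rule that works well in general but fails on one camera angle.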

Theory

Robust video compliance systems work best as structured event extraction plus auditable policy evaluation, not as a single opaque video-to-answer model.
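A minimal sketch of that split, assuming a hypothetical fact schema: the perception layer emits `Fact` records, and a deterministic policy layer evaluates per-kitchen rules of the form "staff must wear item Y in zone Z during shift T". All class names, fields and the rule format here are illustrative assumptions, not a reference design.

```python
from dataclasses import dataclass
from datetime import time


@dataclass(frozen=True)
class Fact:
    """One structured event from the perception layer (hypothetical schema)."""
    staff_id: str
    zone: str
    at: time
    items_worn: frozenset
    clip_id: str  # pointer to the evidence clip kept for review


@dataclass(frozen=True)
class WearRule:
    """Kitchen-specific rule: required_item must be worn in zone during the shift."""
    rule_id: str
    zone: str
    required_item: str
    shift_start: time
    shift_end: time

    def applies(self, fact: Fact) -> bool:
        # Simplification: assumes the shift does not cross midnight.
        return fact.zone == self.zone and self.shift_start <= fact.at < self.shift_end


@dataclass(frozen=True)
class Violation:
    """Auditable result: which rule fired, for whom, and on what evidence."""
    rule_id: str
    staff_id: str
    clip_id: str
    reason: str


def evaluate(facts, rules):
    """Deterministically check every fact against every applicable rule."""
    violations = []
    for fact in facts:
        for rule in rules:
            if rule.applies(fact) and rule.required_item not in fact.items_worn:
                violations.append(Violation(
                    rule_id=rule.rule_id,
                    staff_id=fact.staff_id,
                    clip_id=fact.clip_id,
                    reason=f"missing {rule.required_item} in {fact.zone} at {fact.at}",
                ))
    return violations


rules = [WearRule("hat-in-prep", "prep", "hat", time(6, 0), time(14, 0))]
facts = [
    Fact("s1", "prep", time(10, 0), frozenset({"hat"}), "clip-1"),
    Fact("s2", "prep", time(10, 5), frozenset(), "clip-2"),
    Fact("s3", "prep", time(23, 0), frozenset(), "clip-3"),  # outside the shift
]
violations = evaluate(facts, rules)
```

Because each `Violation` carries the rule ID and the evidence clip, a reviewer can trace any decision back to a specific rule and a specific piece of footage, which an opaque video-to-answer model does not provide.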

Common mistakes

Run a large VLM on every frame without a compute plan.
Treat kitchen-specific rules as hard-coded global assumptions.
Ignore evidence logging and human appeal workflows.
Skip identity, shift and privacy constraints.

How to answer in the interview

Say “perception layer” and “policy layer” explicitly.
Mention weak labels from VLMs/humans but a cheaper online model for production.