ML System Design

Design an automatic system that checks whether a human/agent task result is good enough before delivery to a customer. How do you frame the ML problem?

Short answer

Frame it as quality-control risk estimation plus structured error detection: given the task spec, the worker output and the available evidence, decide accept/reject/manual-review and produce actionable error reasons.

Full breakdown

Start with inputs: task description, customer requirements, worker or agent output, attachments, logs and possibly historical reviewer decisions. Outputs should not be only a scalar score. A practical checker returns accept/reject/manual-review plus structured error reasons and evidence.
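
As a rough illustration of that input/output contract, the checker's interface could be sketched with plain dataclasses. All type names, field names and the reason code "MISSING_FILE" below are hypothetical and chosen only for this example; a real system would derive them from its own task schema.

```python
from dataclasses import dataclass, field
from enum import Enum


class Decision(Enum):
    ACCEPT = "accept"
    REJECT = "reject"
    MANUAL_REVIEW = "manual_review"


@dataclass
class CheckRequest:
    # Inputs listed in the text; field names are assumptions.
    task_description: str
    customer_requirements: list[str]
    worker_output: str
    attachments: list[str] = field(default_factory=list)
    logs: list[str] = field(default_factory=list)


@dataclass
class ErrorReason:
    code: str      # hypothetical machine-readable reason code
    message: str   # human-readable explanation for the worker
    evidence: str  # pointer to what triggered the reason


@dataclass
class CheckResult:
    decision: Decision
    risk_score: float  # estimated probability that the result is bad
    reasons: list[ErrorReason] = field(default_factory=list)


# Example: a rejection that carries a reason and evidence, not only a score.
result = CheckResult(
    decision=Decision.REJECT,
    risk_score=0.91,
    reasons=[ErrorReason("MISSING_FILE", "Required report is not attached", "attachments=[]")],
)
```

Keeping the scalar score alongside the decision and reasons lets downstream policy thresholds change without retraining or re-running the checker.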

Define the decision policy around costs. False accept means a bad result reaches the customer; false reject means extra cost, delay and possibly unfair worker feedback. Manual review is a constrained fallback, so the model should optimize quality under a review-budget constraint.
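
One possible way to turn this into a concrete policy is sketched below. It assumes the model outputs a calibrated probability `p_bad` that a result is bad, that manual review always reaches the correct decision, and that the three cost values are known. The function name, the cost numbers and the greedy budget allocation are illustrative assumptions, not a prescribed method.

```python
def decide_batch(p_bad, cost_false_accept=10.0, cost_false_reject=2.0,
                 cost_review=1.0, review_budget=1):
    """Pick accept/reject by expected cost, then spend the review budget
    on the items where a (assumed perfect) human review saves the most."""
    decisions = []
    gains = []
    for i, p in enumerate(p_bad):
        accept_cost = p * cost_false_accept          # expected cost of auto-accept
        reject_cost = (1.0 - p) * cost_false_reject  # expected cost of auto-reject
        decisions.append("accept" if accept_cost <= reject_cost else "reject")
        # Expected saving if a human reviews this item instead.
        gain = min(accept_cost, reject_cost) - cost_review
        if gain > 0:
            gains.append((gain, i))
    # Hard capacity constraint: only the top `review_budget` items go to humans.
    for _, i in sorted(gains, reverse=True)[:review_budget]:
        decisions[i] = "manual_review"
    return decisions


print(decide_batch([0.02, 0.15, 0.2, 0.95], review_budget=1))
# ['accept', 'accept', 'manual_review', 'reject']
```

With a budget of one review, only the most ambiguous item (`p_bad = 0.2`) goes to a human. Raising the budget to two would also route the `0.15` item there.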

A reasonable first version combines rules and LLM-based checking: validate required files/fields, check format, compare output to task requirements, flag fraud or hallucination signals, and send uncertain/high-risk cases to humans. Later versions can train classifiers or fine-tune models from reviewer labels.
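
A minimal sketch of such a first version might look like the following. The required field names, the `.pdf` format rule, the reason-code strings and the threshold are hypothetical, and `llm_judge` is an injected callable standing in for whatever LLM call the system actually uses. No real LLM API is assumed here.

```python
from typing import Callable

REQUIRED_FIELDS = ("summary", "report_file")  # hypothetical required fields


def rule_checks(output: dict) -> list[str]:
    """Cheap deterministic checks: required fields and file format."""
    reasons = [f"MISSING_FIELD:{name}" for name in REQUIRED_FIELDS if not output.get(name)]
    report = output.get("report_file") or ""
    if report and not report.endswith(".pdf"):
        reasons.append("BAD_FORMAT:report_file")
    return reasons


def check(task: str, output: dict,
          llm_judge: Callable[[str, dict], tuple[float, list[str]]],
          accept_below: float = 0.2) -> tuple[str, list[str]]:
    """Rules first, then the LLM judge; anything not clearly fine goes to humans."""
    reasons = rule_checks(output)
    if reasons:
        return "reject", reasons
    risk, llm_reasons = llm_judge(task, output)
    if risk < accept_below:
        return "accept", []
    return "manual_review", llm_reasons  # uncertain or high-risk case


def fake_judge(task: str, output: dict) -> tuple[float, list[str]]:
    # Stub for illustration only; a real judge would compare output to the task.
    if "guaranteed" in output["summary"]:
        return 0.5, ["UNVERIFIED_CLAIM"]
    return 0.05, []


print(check("Write a report", {"summary": "Done", "report_file": "r.pdf"}, fake_judge))
# ('accept', [])
```

Running the rules before the LLM keeps obvious failures cheap and deterministic, and the reviewer decisions collected from the manual-review branch become training labels for later classifiers.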

Common mistakes

  • Making the output a vague quality score with no reason codes.
  • Ignoring the cost asymmetry between false accepts and false rejects.
  • Forgetting that manual-review capacity is a hard constraint.