ML System Design
Design an automatic system that checks whether a human/agent task result is good enough before delivery to a customer. How do you frame the ML problem?
Short answer
Frame it as quality-control risk estimation plus structured error detection: given the task spec, the worker output and the available evidence, decide accept, reject or manual review, and produce actionable error reasons.
Full breakdown
Start with inputs: task description, customer requirements, worker or agent output, attachments, logs and possibly historical reviewer decisions. Outputs should not be only a scalar score. A practical checker returns accept/reject/manual-review plus structured error reasons and evidence.
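The input and output contract described above can be sketched as plain data types. This is a minimal illustration, not a prescribed schema: every class name, field name and reason code below (`CheckInput`, `CheckResult`, `MISSING_REQUIRED_FILE`, and so on) is a hypothetical choice made for the example.

```python
from dataclasses import dataclass, field
from enum import Enum


class Decision(Enum):
    ACCEPT = "accept"
    REJECT = "reject"
    MANUAL_REVIEW = "manual_review"


@dataclass
class CheckInput:
    # Everything the checker is allowed to look at (field names are assumptions).
    task_description: str
    customer_requirements: list[str]
    worker_output: str
    attachments: list[str] = field(default_factory=list)
    logs: list[str] = field(default_factory=list)


@dataclass
class ErrorReason:
    code: str      # hypothetical reason code, e.g. "MISSING_REQUIRED_FILE"
    message: str   # human-readable explanation for the worker or reviewer
    evidence: str  # pointer to what triggered the reason (quote, file name, log line)


@dataclass
class CheckResult:
    decision: Decision
    risk: float  # estimated probability that the result is bad
    reasons: list[ErrorReason] = field(default_factory=list)


# Example: a rejected result that carries a reason and its evidence.
result = CheckResult(
    decision=Decision.REJECT,
    risk=0.93,
    reasons=[
        ErrorReason(
            code="MISSING_REQUIRED_FILE",
            message="The task requires a summary file, but none was attached.",
            evidence="attachments=[]",
        )
    ],
)
```

Keeping the scalar `risk` next to the structured `reasons` lets downstream consumers route on the number while reviewers and workers read the reasons.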
Define the decision policy around costs. A false accept means a bad result reaches the customer; a false reject means extra cost, delay and possibly unfair feedback to the worker. Manual review is a limited-capacity fallback, so the system should minimize the expected cost of wrong decisions subject to a review-budget constraint.
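One way to make this policy concrete is an expected-cost rule with a capped review queue. The sketch below assumes the model outputs a calibrated probability `p_bad` per item, that a human review always reaches the correct decision, and that the three cost values are known; the function name `route` and all parameter names are hypothetical.

```python
def route(p_bad, cost_false_accept, cost_false_reject, cost_review, review_budget):
    """Pick accept / reject / manual_review per item under a review budget.

    p_bad: list of estimated probabilities that each result is bad.
    review_budget: maximum number of items humans can review in this batch.
    """
    scored = []
    for i, p in enumerate(p_bad):
        expected_cost_accept = p * cost_false_accept        # pay only if the item is bad
        expected_cost_reject = (1 - p) * cost_false_reject  # pay only if the item is good
        auto = "accept" if expected_cost_accept <= expected_cost_reject else "reject"
        # Expected saving from reviewing instead of deciding automatically
        # (assumes the reviewer is always right).
        saving = min(expected_cost_accept, expected_cost_reject) - cost_review
        scored.append((i, auto, saving))

    # Spend the limited review capacity where it saves the most.
    ranked = sorted(scored, key=lambda item: item[2], reverse=True)
    to_review = {i for i, _, saving in ranked[:review_budget] if saving > 0}

    return ["manual_review" if i in to_review else auto for i, auto, _ in scored]


decisions = route(
    p_bad=[0.02, 0.5, 0.95, 0.4],
    cost_false_accept=10.0,
    cost_false_reject=2.0,
    cost_review=1.0,
    review_budget=1,
)
```

With these illustrative numbers, only the item with `p_bad=0.4` is worth a review slot: its cheapest automatic decision still carries more expected cost than a review. Confident items are decided automatically, which is the cost asymmetry and the budget constraint acting together.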
A reasonable first version combines rules and LLM-based checking: validate required files/fields, check format, compare output to task requirements, flag fraud or hallucination signals, and send uncertain/high-risk cases to humans. Later versions can train classifiers or fine-tune models from reviewer labels.
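A first version along these lines could be wired up as deterministic rules followed by a pluggable judge. The sketch below is illustrative only: `llm_judge` stands for any callable that returns a risk score in [0, 1] (for example a wrapper around an LLM prompt), and the dictionary keys, reason codes and thresholds are assumptions made for the example rather than part of any real API.

```python
from typing import Callable


def rule_checks(task: dict, output: dict) -> list[str]:
    """Cheap deterministic checks that run before any model call."""
    reasons = []
    for name in task.get("required_fields", []):
        if not output.get(name):
            reasons.append(f"MISSING_FIELD:{name}")
    return reasons


def check(
    task: dict,
    output: dict,
    llm_judge: Callable[[dict, dict], float],
    low: float = 0.2,
    high: float = 0.8,
) -> tuple[str, list[str]]:
    """Return (decision, reason codes) for one task result."""
    reasons = rule_checks(task, output)
    if reasons:
        # Hard rule failures are rejected without spending a model call.
        return "reject", reasons

    risk = llm_judge(task, output)  # assumed to be in [0, 1]; higher means riskier
    if risk >= high:
        return "reject", ["LLM_HIGH_RISK"]
    if risk <= low:
        return "accept", []
    # Uncertain middle band goes to humans.
    return "manual_review", ["LLM_UNCERTAIN"]


# Usage with a stub judge standing in for a real model call.
task = {"required_fields": ["summary", "report_file"]}
stub_judge = lambda task, output: 0.5

print(check(task, {"summary": "done"}, stub_judge))
print(check(task, {"summary": "done", "report_file": "r.pdf"}, stub_judge))
```

Because the judge is passed in as a callable, the same routing code can later take a trained classifier or a fine-tuned model without changing the rules or the thresholds' role.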
Common mistakes
- Returning a vague quality score with no reason codes.
- Ignoring the cost asymmetry between false accepts and false rejects.
- Forgetting that manual-review capacity is a hard constraint.