ML System Design
How would you design an LLM-agent loop that checks a task output using tools such as file reading, web access or document inspection?
Short answer
Use a planner-checker loop: parse the task spec and the worker output, generate hypotheses to verify, call tools with constrained permissions, accumulate evidence, then produce a structured verdict that states its uncertainty and escalates to a human when the evidence is inconclusive.
Full breakdown
The first step is context packaging: task spec, worker output, attachments and customer requirements must be normalized into a representation the checker can use. If artifacts are large, retrieve or summarize relevant parts instead of dumping everything into one prompt.
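A minimal sketch of context packaging, assuming attachments are already extracted to plain text. The `CheckContext` schema, the fixed-size chunking and the keyword-overlap score are all illustrative assumptions; a real system would more likely use an embedding retriever or a summarizer for the selection step.

```python
from dataclasses import dataclass, field


@dataclass
class CheckContext:
    """Normalized checker input (hypothetical schema, not a standard one)."""
    task_spec: str
    worker_output: str
    requirements: list
    excerpts: dict = field(default_factory=dict)  # attachment name -> chunks


def split_chunks(text, size=400):
    """Split text into fixed-size character chunks."""
    return [text[i:i + size] for i in range(0, len(text), size)]


def relevance(chunk, requirements):
    """Count distinct requirement words (longer than 3 chars) found in the chunk."""
    words = {w.lower() for req in requirements for w in req.split() if len(w) > 3}
    lowered = chunk.lower()
    return sum(1 for w in words if w in lowered)


def package_context(task_spec, worker_output, requirements, attachments,
                    top_k=2, chunk_size=400):
    """Keep only the top_k most relevant chunks of each attachment."""
    excerpts = {}
    for name, text in attachments.items():
        chunks = split_chunks(text, chunk_size)
        # sorted() is stable, so ties keep their original document order.
        ranked = sorted(chunks, key=lambda c: relevance(c, requirements),
                        reverse=True)
        excerpts[name] = ranked[:top_k]
    return CheckContext(
        task_spec=task_spec.strip(),
        worker_output=worker_output.strip(),
        requirements=[r.strip() for r in requirements],
        excerpts=excerpts,
    )
```

The point of the design is that the prompt size is bounded by `top_k * chunk_size` per attachment, regardless of how large the original artifact is.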
Then run an explicit loop. The model proposes concrete checks: does the file exist, does a table value match the requirement, is the link accessible, is the claim supported by a source, does the image contain the requested object. Tools execute those checks with constrained permissions. Results are appended as evidence, and the model either asks for another check or decides.
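The loop can be sketched as follows. This is an illustration under stated assumptions, not a reference implementation: `propose` stands in for the LLM call, the action dictionaries are a made-up format, and the two read-only file tools are hypothetical examples of an allow-list.

```python
import os
import tempfile


def file_exists(path):
    return os.path.isfile(path)


def file_contains(path, needle):
    with open(path, encoding="utf-8") as fh:
        return needle in fh.read()


# Allow-list: the loop can only call tools registered here (all read-only).
TOOLS = {"file_exists": file_exists, "file_contains": file_contains}


def run_check_loop(propose, max_steps=5):
    """Ask `propose` for the next action until it decides or the budget ends."""
    evidence = []
    for _ in range(max_steps):
        action = propose(evidence)
        if action["type"] == "decide":
            return {"verdict": action["verdict"], "evidence": evidence}
        record = {"tool": action["tool"], "args": action["args"]}
        tool = TOOLS.get(action["tool"])
        if tool is None:
            record["error"] = "tool not allowed"
        else:
            try:
                record["result"] = tool(**action["args"])
            except Exception as exc:  # a failed check is still evidence
                record["error"] = str(exc)
        evidence.append(record)
    # Step budget exhausted without a decision: hand off to a human.
    return {"verdict": "escalate", "evidence": evidence}


def scripted_planner(path):
    """Stand-in for the model: two fixed checks, then a decision."""
    def propose(evidence):
        if len(evidence) == 0:
            return {"type": "check", "tool": "file_exists",
                    "args": {"path": path}}
        if len(evidence) == 1:
            return {"type": "check", "tool": "file_contains",
                    "args": {"path": path, "needle": "Total"}}
        ok = all(e.get("result") is True for e in evidence)
        return {"type": "decide", "verdict": "pass" if ok else "fail"}
    return propose


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "report.txt")
        with open(path, "w", encoding="utf-8") as fh:
            fh.write("Total: 42\n")
        return run_check_loop(scripted_planner(path))
```

In this sketch the model never touches the file system directly. It only names a tool and its arguments, and the loop decides whether that call is permitted.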
Production guardrails matter: limit tool calls, make tool outputs deterministic, log every check, keep a timeout budget and escalate uncertain cases. The final answer should be a structured verdict grounded in tool evidence, not just the model’s unsupported opinion.
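One way to sketch these guardrails, assuming a per-task budget object and a simple verdict record. The names `Budget`, `Verdict`, `guarded_call` and `finalize`, and the 0.8 confidence threshold, are illustrative choices rather than part of any particular framework.

```python
import json
import logging
import time
from dataclasses import dataclass, field

logger = logging.getLogger("checker")


class BudgetExceeded(Exception):
    """Raised when the call limit or the time budget is used up."""


@dataclass
class Budget:
    max_calls: int
    timeout_s: float
    calls: int = 0
    started: float = field(default_factory=time.monotonic)

    def charge(self):
        """Account for one tool call, or raise if the budget is spent."""
        if self.calls >= self.max_calls:
            raise BudgetExceeded("tool-call limit reached")
        if time.monotonic() - self.started > self.timeout_s:
            raise BudgetExceeded("timeout budget exhausted")
        self.calls += 1


@dataclass
class Verdict:
    status: str        # "pass", "fail", or "escalate"
    confidence: float  # model-reported or calibrated score in [0, 1]
    evidence: list
    reason: str = ""


def guarded_call(budget, name, fn, *args):
    """Run one tool call under the budget and log it for later audits."""
    budget.charge()
    result = fn(*args)
    record = {"tool": name, "args": list(args), "result": result}
    logger.info(json.dumps(record, default=str))
    return record


def finalize(status, confidence, evidence, min_confidence=0.8):
    """Escalate when there is no tool evidence or confidence is too low."""
    if not evidence:
        return Verdict("escalate", confidence, evidence, "no tool evidence")
    if confidence < min_confidence:
        return Verdict("escalate", confidence, evidence,
                       "confidence below threshold")
    return Verdict(status, confidence, evidence)
```

For example, `finalize("pass", 0.5, evidence)` returns an `escalate` verdict even though the model wanted to pass, which keeps low-confidence decisions out of the automated path.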
Common mistakes
- Making a single LLM call and calling it an agent.
- Giving the agent unbounded tools and no budget.
- Failing to log evidence for later reviewer audits.