What to do when the product team wants a model but there is no data

Short answer

Clarify the business decision first. Then build a baseline or a pretrained-model prototype, estimate the value, and design a labeling or weak-supervision plan. Do not commit to a heavy custom model until data shows it is useful.

Full breakdown

First clarify which decision the model should improve, which metric matters, and how much error the business can tolerate. Often the first deliverable should be a baseline, a set of rules, analytics, or a pretrained-model prototype rather than a full custom model.
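To make the "baseline first" step concrete, here is a minimal sketch of scoring a non-ML rule on a small hand-labeled sample and reporting a confidence interval, so that nobody over-reads a handful of examples. The rule `keyword_rule`, the ticket texts, and the labels are all hypothetical and exist only for illustration. The interval is the standard Wilson score interval.

```python
import math
from typing import Callable, Sequence, Tuple


def wilson_interval(successes: int, n: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a binomial proportion (z=1.96 is roughly 95%)."""
    if n == 0:
        raise ValueError("need at least one labeled example")
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return max(0.0, center - half), min(1.0, center + half)


def evaluate_baseline(
    rule: Callable[[str], bool],
    samples: Sequence[Tuple[str, bool]],
) -> Tuple[float, float, float]:
    """Return (accuracy, ci_low, ci_high) for a rule on labeled (text, label) pairs."""
    correct = sum(1 for text, label in samples if rule(text) == label)
    low, high = wilson_interval(correct, len(samples))
    return correct / len(samples), low, high


# Hypothetical heuristic: flag a support ticket as a refund request.
def keyword_rule(text: str) -> bool:
    return "refund" in text.lower()


# Hypothetical hand-labeled sample; a real check would use more examples.
SAMPLE = [
    ("I want a refund now", True),
    ("Refund my order please", True),
    ("Where is my parcel?", False),
    ("Thanks, great service", False),
    ("Money back, this is broken", True),  # the keyword rule misses this one
]

accuracy, low, high = evaluate_baseline(keyword_rule, SAMPLE)
print(f"accuracy={accuracy:.2f}, 95% CI=({low:.2f}, {high:.2f})")
```

With only a few labeled examples the interval is very wide, which is itself a useful message for the product team: the number of labels needed depends on how much error the business can tolerate.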

If the domain has usable pretrained models, test them on a small real sample. If labels are missing, design a labeling loop: human annotation, product events as proxy labels, weak supervision, active learning, or LLM/VLM-assisted pre-labeling with human review. At the same time, estimate the value of collecting the data and how long it will take to learn whether the idea works.
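One way to picture a labeling loop that combines weak supervision with human review is sketched below. It assumes a binary task and three hypothetical labeling functions (`lf_mentions_refund`, `lf_says_thanks`, `lf_has_angry_punctuation`) that either vote for a class or abstain. Votes are combined by a simple majority, and anything with no votes or a tie is routed to a human. This is a plain majority-vote sketch, not the learned label model that dedicated weak-supervision libraries provide.

```python
from collections import Counter
from typing import Callable, List, Optional, Sequence, Tuple

ABSTAIN = None  # a labeling function returns this when it has no opinion
LabelingFn = Callable[[str], Optional[int]]


# Hypothetical labeling functions for "is this ticket a complaint?" (1 = yes, 0 = no).
def lf_mentions_refund(text: str) -> Optional[int]:
    return 1 if "refund" in text.lower() else ABSTAIN


def lf_says_thanks(text: str) -> Optional[int]:
    return 0 if "thank" in text.lower() else ABSTAIN


def lf_has_angry_punctuation(text: str) -> Optional[int]:
    return 1 if "!!" in text else ABSTAIN


LFS: List[LabelingFn] = [lf_mentions_refund, lf_says_thanks, lf_has_angry_punctuation]


def weak_label(text: str, lfs: Sequence[LabelingFn], min_margin: int = 1) -> Optional[int]:
    """Majority vote over non-abstaining functions; None means 'send to a human'."""
    votes = Counter(v for v in (lf(text) for lf in lfs) if v is not ABSTAIN)
    if not votes:
        return None  # every function abstained
    ranked = votes.most_common(2)
    top_label, top_count = ranked[0]
    runner_up = ranked[1][1] if len(ranked) > 1 else 0
    if top_count - runner_up < min_margin:
        return None  # tie or too close to call
    return top_label


def split_for_review(
    texts: Sequence[str], lfs: Sequence[LabelingFn]
) -> Tuple[List[Tuple[str, int]], List[str]]:
    """Return (auto_labeled, needs_human) so annotators see only the unclear cases."""
    auto_labeled: List[Tuple[str, int]] = []
    needs_human: List[str] = []
    for text in texts:
        label = weak_label(text, lfs)
        if label is None:
            needs_human.append(text)
        else:
            auto_labeled.append((text, label))
    return auto_labeled, needs_human


auto, manual = split_for_review(
    ["I want a refund!!", "Thanks, all good", "Thanks, but I want a refund", "Hello"],
    LFS,
)
print(len(auto), "auto-labeled;", len(manual), "sent to human review")
```

Auto-assigned labels are still noisy, so a sample of them should also be spot-checked by a human before they are used for training or evaluation.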

The key senior-level point is to control investment. Sometimes the right answer is not to build ML yet, because a dashboard, a heuristic, or a product change delivers business value faster while data collection ramps up.
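Controlling investment can start with a back-of-the-envelope calculation. The sketch below assumes a deliberately simple model: the project succeeds with some probability, success yields a fixed monthly gain over a fixed horizon, and the costs are labeling plus engineering. The `ProjectEstimate` fields and every number in the usage example are hypothetical placeholders, not benchmarks.

```python
from dataclasses import dataclass


@dataclass
class ProjectEstimate:
    p_success: float         # assumed chance that the model reaches the target metric
    monthly_gain: float      # assumed business gain per month if it succeeds
    horizon_months: int      # how long the gain is counted
    labeling_cost: float     # cost of collecting and labeling data
    engineering_cost: float  # cost of building and maintaining the model


def total_cost(e: ProjectEstimate) -> float:
    return e.labeling_cost + e.engineering_cost


def expected_net_value(e: ProjectEstimate) -> float:
    """Expected gain minus cost under the simple success/failure model above."""
    if not 0.0 <= e.p_success <= 1.0:
        raise ValueError("p_success must be between 0 and 1")
    expected_gain = e.p_success * e.monthly_gain * e.horizon_months
    return expected_gain - total_cost(e)


def break_even_probability(e: ProjectEstimate) -> float:
    """Minimum success probability at which the project pays for itself."""
    full_gain = e.monthly_gain * e.horizon_months
    if full_gain <= 0:
        return float("inf")  # no gain means no probability can justify the cost
    return total_cost(e) / full_gain


# Hypothetical numbers, for illustration only.
estimate = ProjectEstimate(
    p_success=0.3,
    monthly_gain=10_000,
    horizon_months=12,
    labeling_cost=5_000,
    engineering_cost=20_000,
)
print(f"expected net value: {expected_net_value(estimate):.0f}")
print(f"break-even success probability: {break_even_probability(estimate):.2f}")
```

The break-even probability doubles as a stop criterion: if the early baseline and the first labeled batch push the realistic chance of success below it, pause the ML work and ship the dashboard or heuristic instead.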

Theory

No-data ML starts with problem framing and value validation, not model architecture.

Common mistakes

  • Promising a custom model before labels and success criteria exist.
  • Ignoring non-ML baselines.
  • Collecting data without knowing whether it can change a business decision.

How to answer in an interview

  • Talk about the labeling plan and the first baseline.
  • State explicitly when you would stop the project.