Reviewing a PyTorch training loop for multiclass classification
Short answer
Check the architecture, batching, labels, the optimizer's zero_grad/step calls, train/eval modes, loss inputs, device transfer, validation, batch size, number of epochs, and metrics.
Full breakdown
A code-review answer should separate correctness from style. Correctness issues include whether Dataset returns objects that the default collate function can batch, whether the model receives tensors rather than custom records, whether labels are passed to the loss, and whether the optimizer is created, zeroed, stepped and tied to model.parameters().
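The correctness checks above can be sketched in a few lines. This is a minimal illustration, not the code under review: ToyImageDataset, the 1x28x28 input shape, the 10 classes, and the SGD settings are all assumptions made for the example.

```python
import torch
from torch import nn
from torch.utils.data import Dataset, DataLoader


class ToyImageDataset(Dataset):
    """Hypothetical dataset that returns (image_tensor, label_tensor) pairs.

    Returning plain tensors (not custom record objects) is what lets the
    default collate function stack samples into a batch.
    """

    def __init__(self, n_samples=32, n_classes=10):
        g = torch.Generator().manual_seed(0)
        # Assumed shapes: 1x28x28 images, integer class labels.
        self.images = torch.rand(n_samples, 1, 28, 28, generator=g)
        self.labels = torch.randint(0, n_classes, (n_samples,), generator=g)

    def __len__(self):
        return len(self.images)

    def __getitem__(self, idx):
        return self.images[idx], self.labels[idx]


loader = DataLoader(ToyImageDataset(), batch_size=8, shuffle=True)
images, labels = next(iter(loader))  # batched by the default collate function

model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))
# The optimizer must be built from the parameters of the model being trained.
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
```

In a review, a quick way to confirm the data contract is to pull one batch exactly like this and inspect its shapes and dtypes before reading the rest of the loop.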
For multiclass classification with CrossEntropyLoss, the model should normally return raw logits, not softmax probabilities, because the loss applies log-softmax internally. Labels should be class indices: a tensor of dtype torch.long and shape (N,), with every value in the range [0, C-1] for C classes. The training loop should run for multiple epochs, call model.train() at the start of each training pass, move each batch to the target device inside the loop, and have a separate validation loop that runs under model.eval() and torch.no_grad().
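The lifecycle described above might look like the following sketch. The synthetic data, the small MLP, the Adam optimizer, and the choice of 3 epochs are assumptions made only to keep the example self-contained.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Synthetic stand-in data (assumption): 20 features, 3 classes.
n_classes = 3
X = torch.randn(120, 20)
y = torch.randint(0, n_classes, (120,))  # class indices, dtype torch.long
train_loader = DataLoader(TensorDataset(X[:96], y[:96]), batch_size=16, shuffle=True)
val_loader = DataLoader(TensorDataset(X[96:], y[96:]), batch_size=16)

# The last layer outputs raw logits; there is no softmax in the model.
model = nn.Sequential(nn.Linear(20, 32), nn.ReLU(), nn.Linear(32, n_classes)).to(device)
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

val_accuracy = []
for epoch in range(3):
    model.train()  # enable training behavior (dropout, batch norm updates)
    for xb, yb in train_loader:
        xb, yb = xb.to(device), yb.to(device)  # device transfer inside the loop
        optimizer.zero_grad()                  # clear gradients from the previous step
        logits = model(xb)
        loss = criterion(logits, yb)           # (logits, labels), in that order
        loss.backward()
        optimizer.step()

    model.eval()  # switch to inference behavior for validation
    correct = total = 0
    with torch.no_grad():  # no gradient tracking during validation
        for xb, yb in val_loader:
            xb, yb = xb.to(device), yb.to(device)
            preds = model(xb).argmax(dim=1)
            correct += (preds == yb).sum().item()
            total += yb.numel()
    val_accuracy.append(correct / total)
```

Because the labels here are random, the accuracy values carry no meaning; the point is the order of calls and the switch between train and eval modes.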
Design issues include a weak linear-only image architecture, missing batch_size, no metrics/logging, no validation split, no seed/reproducibility strategy and unclear input-shape assumptions.
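Several of these design gaps are cheap to close. The sketch below shows one possible approach, assuming a hypothetical seed_everything helper, an 80/20 split, a batch size of 32, and 1x28x28 inputs; none of these values come from the code under review.

```python
import random

import torch
from torch.utils.data import DataLoader, TensorDataset, random_split


def seed_everything(seed: int) -> None:
    """Hypothetical helper: seed the Python and PyTorch random generators."""
    random.seed(seed)
    torch.manual_seed(seed)


seed_everything(42)

# Synthetic stand-in for the full labeled dataset (assumption).
full = TensorDataset(torch.randn(100, 1, 28, 28), torch.randint(0, 10, (100,)))

# Hold out a validation split with a dedicated, seeded generator.
split_gen = torch.Generator().manual_seed(42)
train_ds, val_ds = random_split(full, [80, 20], generator=split_gen)

# An explicit batch_size; shuffle only the training data.
train_loader = DataLoader(train_ds, batch_size=32, shuffle=True)
val_loader = DataLoader(val_ds, batch_size=32, shuffle=False)

# Make the input-shape assumption explicit instead of implicit.
xb, yb = next(iter(train_loader))
assert xb.shape[1:] == (1, 28, 28), f"unexpected input shape: {tuple(xb.shape)}"
```

Seeding alone does not guarantee bit-identical runs on every device, so this is a starting point for a reproducibility strategy rather than a complete one.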
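Theory
A training-loop review covers three things: the data contract (what the Dataset and DataLoader yield), the model/loss contract (what the model outputs and what the loss expects), and the optimization lifecycle (zero_grad, backward, step).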
Common mistakes
- Applying softmax before CrossEntropyLoss.
- Forgetting the labels, or passing the images as targets.
- Moving data to the GPU inside Dataset workers instead of in the training loop.
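The first mistake in the list is easy to demonstrate numerically. The sketch below uses a made-up single-sample batch with three classes and compares the loss on raw logits with the loss on softmax outputs.

```python
import math

import torch
import torch.nn.functional as F

# One confident, correct prediction for class 0 (illustrative values).
logits = torch.tensor([[4.0, 0.0, 0.0]])
target = torch.tensor([0])  # class index, dtype torch.long

# Correct usage: pass raw logits.
loss_correct = F.cross_entropy(logits, target)

# cross_entropy on logits matches log_softmax followed by nll_loss.
loss_manual = F.nll_loss(F.log_softmax(logits, dim=1), target)

# Mistake: softmax first, so softmax is effectively applied twice.
loss_double = F.cross_entropy(F.softmax(logits, dim=1), target)

# Softmax outputs lie in [0, 1], so with 3 classes the doubled loss
# cannot fall below log(1 + 2/e), however confident the model is.
floor_3_classes = math.log(1 + 2 / math.e)

print(loss_correct.item(), loss_double.item(), floor_3_classes)
```

The doubled version still runs without error, which is why it tends to survive until someone notices that the loss plateaus and the gradients are weak.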
How to answer in an interview
- Start with runtime-breaking bugs before architecture opinions.
- Mention the raw-logits contract for CrossEntropyLoss.