Question · Medium · model-optimization · Production ML question from a post-interview debrief · CIAN

Model compression and catastrophic forgetting

Short answer

Compression options include quantization, pruning, and distillation; adaptation can use LoRA or adapters. Forgetting is detected on held-out general benchmarks and reduced with replay data, regularization toward the base model, and parameter-efficient fine-tuning.
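To make "quantization reduces numeric precision" concrete, here is a minimal sketch of symmetric per-tensor int8 quantization. The scheme and the function names (`quantize_int8`, `dequantize`) are assumptions chosen for illustration; production toolchains typically add per-channel scales and calibration data.

```python
def quantize_int8(weights):
    """Symmetric per-tensor quantization: floats -> int8 codes plus one scale."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    # Round to the nearest code and clamp to the symmetric int8 range.
    codes = [max(-127, min(127, round(w / scale))) for w in weights]
    return codes, scale


def dequantize(codes, scale):
    """Map int8 codes back to approximate float values."""
    return [c * scale for c in codes]


# Illustrative weights, not taken from any real model.
weights = [0.5, -1.27, 0.003, 0.9]
codes, scale = quantize_int8(weights)
restored = dequantize(codes, scale)

# Rounding error per weight is bounded by half a quantization step.
max_error = max(abs(w - r) for w, r in zip(weights, restored))
```

Note that the small weight 0.003 collapses to code 0. Precision loss like this is why a quantized model should go through the same regression checks as a fine-tuned one.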

Full breakdown

Compression and adaptation are different levers. Quantization reduces numeric precision, pruning removes individual weights or whole structures, distillation trains a smaller student from a stronger teacher, and LoRA or adapters train a small number of parameters while keeping the base weights fixed or nearly fixed.

Catastrophic forgetting shows up when fine-tuning improves the new domain but degrades general capabilities or old-domain benchmarks. Detecting it requires an evaluation suite with both new-domain and old-domain tasks, plus slices for safety-critical or business-critical behavior.

Common mitigations are replaying a controlled mixture of old-domain data, training against teacher logits or reference answers, regularizing the new model toward the base model, lowering the learning rate, early stopping, and using LoRA or adapters instead of full fine-tuning. Measure the trade-off: do not preserve old capabilities so aggressively that the model fails the target domain.
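Two of the mitigations above, replay mixing and regularization toward the base model, can be sketched in a few lines. This is a sketch under assumptions: `mix_with_replay`, `l2_to_base_penalty`, and the 20% replay fraction are hypothetical, and a squared-distance penalty is only one possible regularizer (a KL term on output distributions is another).

```python
import random


def mix_with_replay(new_examples, old_examples, replay_fraction, seed=0):
    """Build a training set where about `replay_fraction` of items are old-domain."""
    if not 0 <= replay_fraction < 1:
        raise ValueError("replay_fraction must be in [0, 1)")
    rng = random.Random(seed)
    # Solve n_old / (n_new + n_old) = replay_fraction for n_old.
    n_old = round(len(new_examples) * replay_fraction / (1 - replay_fraction))
    n_old = min(n_old, len(old_examples))  # cannot replay more than we have
    mixed = list(new_examples) + rng.sample(list(old_examples), n_old)
    rng.shuffle(mixed)
    return mixed


def l2_to_base_penalty(weights, base_weights, strength):
    """Penalty that grows as fine-tuned weights drift from the base weights."""
    return strength * sum((w - b) ** 2 for w, b in zip(weights, base_weights))


# Placeholder examples tagged by domain; real data would be text or features.
new_data = [("new", i) for i in range(80)]
old_data = [("old", i) for i in range(100)]
train_set = mix_with_replay(new_data, old_data, replay_fraction=0.2)

# Added to the task loss during fine-tuning; zero while weights equal the base.
penalty = l2_to_base_penalty([1.0, 2.0], [1.0, 1.0], strength=0.5)
```

Both `replay_fraction` and `strength` are knobs for the trade-off described above: raising either protects old capabilities at the cost of slower progress on the target domain.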

Theory

Forgetting is an evaluation problem first: without old-domain checks, fine-tuning regressions are invisible.
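A regression check of this kind can be as simple as comparing per-task scores for the base and fine-tuned models. The sketch below assumes higher-is-better metrics; the task names, scores, and 0.01 tolerance are hypothetical.

```python
def find_regressions(base_scores, tuned_scores, tolerance=0.01):
    """Return tasks where the fine-tuned model drops by more than `tolerance`.

    Both arguments map task name -> metric, where higher is better.
    """
    missing = set(base_scores) - set(tuned_scores)
    if missing:
        # A task that was silently skipped is exactly how regressions hide.
        raise ValueError(f"fine-tuned model was not evaluated on: {sorted(missing)}")
    regressions = {}
    for task, base in base_scores.items():
        drop = base - tuned_scores[task]
        if drop > tolerance:
            regressions[task] = drop
    return regressions


# Hypothetical numbers, for illustration only.
base = {"new_domain_qa": 0.61, "general_qa": 0.78, "safety_refusals": 0.95}
tuned = {"new_domain_qa": 0.83, "general_qa": 0.70, "safety_refusals": 0.95}
regressions = find_regressions(base, tuned)
```

Here the new-domain gain would look like a clean win if `general_qa` were not in the suite; with it, the check flags the old-domain drop.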

Common mistakes

Only evaluate on the new domain.
Treat LoRA as a guarantee against all forgetting.
Distill without checking that the teacher is reliable on the target data.
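Why LoRA is not a guarantee can be seen from its forward pass. The toy sketch below uses pure-Python matrices with made-up values: the base weight `W` stays frozen, but once the low-rank factor `B` moves away from its zero initialization, the output on an old-domain input changes as well.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def lora_forward(W, A, B, x, scale=1.0):
    """y = W x + scale * B (A x); W is frozen, only A and B would be trained."""
    base_out = matvec(W, x)
    low_rank_out = matvec(B, matvec(A, x))
    return [b + scale * l for b, l in zip(base_out, low_rank_out)]


W = [[1.0, 0.0], [0.0, 1.0]]   # frozen base weight, 2x2
A = [[1.0, 1.0]]               # rank-1 down-projection, 1x2
B_init = [[0.0], [0.0]]        # zero init: the adapter starts as a no-op
B_trained = [[0.5], [-0.5]]    # hypothetical values after fine-tuning

x_old = [2.0, 1.0]             # stands in for an old-domain input
before = lora_forward(W, A, B_init, x_old)     # identical to the base model
after = lora_forward(W, A, B_trained, x_old)   # shifted by the adapter
```

The low-rank constraint limits how far the effective weights can move, which usually reduces forgetting, but the old-domain regression suite is still needed to confirm it.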

How to answer in an interview

Say “new-domain metrics plus old-domain regression suite”.
Name replay data and parameter-efficient tuning as mitigations.