LLM Fine-tuning and Post-training
SFT, PEFT, LoRA/QLoRA, preference optimization, and knowing when not to fine-tune at all because RAG, prompting, or eval gates may serve better.
What the candidate should be able to do
- Distinguish full fine-tuning, PEFT/LoRA, SFT, DPO, and model merging at a practical level (see the sketch after this list).
- Design a post-training plan with a dataset spec, eval gates, and rollback criteria.
- Understand the storage and loading benefits of adapters without overclaiming quality gains.
- Know when RAG or prompt changes are safer than changing model weights.
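A minimal sketch of the PEFT/LoRA vs. full fine-tuning distinction and of the adapter save/load path, assuming the Hugging Face transformers and peft libraries and facebook/opt-125m as an illustrative small base model:

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, PeftModel, get_peft_model

base = AutoModelForCausalLM.from_pretrained("facebook/opt-125m")

# LoRA trains small low-rank matrices injected into the chosen modules;
# the base weights stay frozen, unlike full fine-tuning.
config = LoraConfig(
    r=8,                                  # rank of the update matrices
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # attention projections in OPT
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, config)
model.print_trainable_parameters()        # typically well under 1% trainable

# The adapter is saved and shipped separately from the base model...
model.save_pretrained("opt125m-lora-adapter")

# ...and loaded back onto a fresh copy of the base at serve time.
base = AutoModelForCausalLM.from_pretrained("facebook/opt-125m")
serving_model = PeftModel.from_pretrained(base, "opt125m-lora-adapter")

# Model merging (one flavor of it): fold the adapter into the base weights
# so inference needs no peft wrapper.
merged = serving_model.merge_and_unload()
```

The point to internalize is the artifact shape: the adapter directory holds only the low-rank matrices, so it is typically megabytes rather than gigabytes, which is what makes per-task adapters cheap to store and swap.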
What interviewers ask
- When would you use RAG instead of fine-tuning?
- What can go wrong with LoRA target module selection?
- How do you validate an instruction-tuned model?
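On the target-module question: module names differ across architectures, so a LoRA config copied from one model can miss or mis-target layers on another (peft raises an error when no named module matches at all). A minimal inspection sketch, again assuming transformers and facebook/opt-125m:

```python
import torch.nn as nn
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("facebook/opt-125m")

# Enumerate the Linear module names actually present before writing
# target_modules; LLaMA/OPT-style models expose q_proj/v_proj, while
# other families use different names (e.g. a fused query-key-value
# projection), so a copied config may not transfer.
linear_names = sorted({
    name.split(".")[-1]
    for name, module in model.named_modules()
    if isinstance(module, nn.Linear)
})
print(linear_names)
```

Typical failure modes worth discussing: adapting only a subset of attention projections when the task also needs MLP adaptation, or matching an unintended module such as the output head.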
Practical task
Build a PEFT LoRA plan: dataset, base model, target modules, eval set, rollback criteria, and adapter deployment path. A sketch of such a plan follows.
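One way to make the deliverable concrete is to encode the plan as data. Every value below is a hypothetical placeholder chosen for illustration, not a recommendation:

```python
lora_plan = {
    "base_model": "facebook/opt-125m",        # pin an exact revision in practice
    "dataset": {
        "source": "instruction_pairs.jsonl",  # hypothetical dataset file
        "train_size": 20_000,
        "held_out_eval": 1_000,               # never seen during training
    },
    "lora": {
        "r": 8,
        "alpha": 16,
        "dropout": 0.05,
        "target_modules": ["q_proj", "v_proj"],
    },
    "eval_gates": {
        "task_metric_min_delta": 0.0,         # no regression vs. the base model
        "safety_suite": "must pass",
        "format_compliance": "must pass",
    },
    "rollback": "if any gate fails, serve the base model without the adapter",
    "deployment": "ship the adapter directory; load it onto the pinned base at serve time",
}
```

Writing the plan as data forces the answer to name concrete gates and a rollback path instead of gesturing at "we would evaluate it".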
Source-grounded rule
Avoid claiming that LoRA always preserves quality; adaptation results depend on the data, target modules, rank, and eval coverage. The gate sketch below makes that hedge operational.
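A minimal gate sketch that turns the rule into a testable condition rather than an assumption; the scores and tolerance are hypothetical:

```python
def passes_gate(base_score: float, adapted_score: float,
                min_delta: float = 0.0) -> bool:
    """Ship the adapter only if it does not regress past the agreed tolerance."""
    return adapted_score - base_score >= min_delta

# Hypothetical numbers purely for illustration: here the adapter regresses,
# so the rollback path (serving the unmodified base model) is taken.
base_score, adapted_score = 0.82, 0.79
if not passes_gate(base_score, adapted_score):
    print("gate failed: keep the adapter out of serving, keep the base model")
```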