Full Program
The detailed program is now displayed as a roadmap: work through the topics stage by stage, open the materials, and mark your progress.
Advanced ML Engineering
An advanced roadmap for MLE and Research Engineer roles: distributed training, LLM engineering, GenAI/multimodal systems, inference optimization, and foundation-model data pipelines.
Progress
0 of 21 topics
🧩 Distributed Training
0/4
How to train large models on multi-GPU and multi-node infrastructure without magical thinking.
Distributed Training Foundations
DDP, gradient synchronization, effective batch size, communication cost, NCCL and basic multi-node failure modes.
FSDP, DeepSpeed ZeRO and Sharding
Why optimizer states dominate memory, how FSDP/ZeRO shard params, gradients and optimizer state, and when sharding pays off.
Parallelism and Memory Engineering
Tensor, pipeline and sequence parallelism; activation checkpointing; gradient accumulation; BF16/FP16; memory budget accounting.
Training Stability and Checkpointing
NaNs, loss spikes, mixed precision instability, sharded checkpoints, resume semantics and reproducibility for long training runs.
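As a back-of-envelope illustration for the sharding topic in this block, the sketch below estimates per-GPU memory for model states under each ZeRO stage. It assumes mixed-precision Adam with the commonly cited accounting of 16 bytes per parameter (2 for low-precision weights, 2 for gradients, 12 for FP32 master weights plus two Adam moments). The function name and the 7.5B-parameter, 64-GPU example are hypothetical, and activations, buffers and fragmentation are ignored.

```python
def model_state_gb_per_gpu(num_params: float, num_gpus: int, stage: int) -> float:
    """Rough per-GPU memory (GB, 1e9 bytes) for model states only.

    Assumption: mixed-precision Adam, 2 + 2 + 12 bytes per parameter.
    stage 0 = plain data parallel, 1 = shard optimizer state,
    2 = also shard gradients, 3 = also shard parameters.
    """
    weights = 2.0 * num_params    # BF16/FP16 weights
    grads = 2.0 * num_params      # BF16/FP16 gradients
    optimizer = 12.0 * num_params # FP32 master weights + momentum + variance

    if stage >= 1:
        optimizer /= num_gpus
    if stage >= 2:
        grads /= num_gpus
    if stage >= 3:
        weights /= num_gpus
    return (weights + grads + optimizer) / 1e9


if __name__ == "__main__":
    # Hypothetical example: 7.5B parameters on 64 GPUs.
    for stage in range(4):
        gb = model_state_gb_per_gpu(7.5e9, 64, stage)
        print(f"stage {stage}: {gb:.1f} GB per GPU")
```

Under these assumptions the optimizer state is three quarters of the unsharded total, which is why sharding it first gives the largest single saving.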
🧠 LLM Engineering
0/4
Engineering layer for large language models: scaling, post-training, serving, KV-cache, evaluation and cost.
LLM Scaling and Architecture
Decoder-only transformers, MoE, long context, KV-cache implications, scaling laws and practical architecture trade-offs.
LLM Fine-tuning and Post-training
LoRA, QLoRA, PEFT, SFT, preference optimization and practical risk management for domain adaptation.
LLM Serving: KV-Cache and Batching
How an LLM produces its answer token by token: prefill/decode, KV-cache, continuous batching, latency metrics, and choosing among vLLM, SGLang, TensorRT-LLM and TGI.
LLM Evaluation, Latency and Cost
Offline evals, human preference, LLM-as-judge limits, hallucination checks, token economics and latency/cost trade-offs.
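To make the KV-cache cost mentioned in this block concrete, here is a minimal sizing sketch. It assumes a standard decoder-only transformer that stores one key and one value vector per layer, per KV head, per token. The function name and the model shape in the example (32 layers, 32 KV heads, head dimension 128, 2-byte elements) are hypothetical and not tied to any specific model.

```python
def kv_cache_bytes(num_layers: int, num_kv_heads: int, head_dim: int,
                   seq_len: int, batch_size: int = 1,
                   bytes_per_elem: int = 2) -> int:
    """Approximate KV-cache size in bytes.

    The leading factor 2 accounts for storing both keys and values.
    Assumption: no paging overhead, no cache quantization or compression.
    """
    return (2 * num_layers * num_kv_heads * head_dim
            * seq_len * batch_size * bytes_per_elem)


if __name__ == "__main__":
    # Hypothetical model shape, FP16/BF16 cache.
    per_token = kv_cache_bytes(32, 32, 128, seq_len=1)
    full = kv_cache_bytes(32, 32, 128, seq_len=4096)
    print(f"per token: {per_token / 2**20:.2f} MiB")
    print(f"4096-token sequence: {full / 2**30:.2f} GiB")
```

Because the size grows linearly with both sequence length and batch size, cache memory rather than compute often limits how many requests can be batched together during decode.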
🎨 Generative and Multimodal Models
0/5
Diffusion, flow matching, image/video/audio generation, conditioning and evaluation for modern GenAI systems.
Generative Modeling Foundations
GAN/VAE context, diffusion fundamentals, latent spaces, conditioning and why production GenAI is not just sampling pretty images.
Diffusion, Flow Matching and DiT
DDPM/DDIM/SDE vocabulary, rectified flow, flow matching, Diffusion Transformers and few-step sampling quality/latency trade-offs.
Multimodal Conditioning
Text, image, video, pose, depth, segmentation, audio and reference conditioning via cross-attention, adapters and ControlNet-style branches.
Video and Audio Generation
Text-to-video, image-to-video, temporal modeling, identity preservation, audio generation and modality-specific failure modes.
GenAI Evaluation
FID, FVD, CLIPScore, VBench, temporal consistency, identity preservation, human preference and safety regression suites.
⚡ Inference Optimization and High-Load Serving
0/4
How to make large ML systems fast, reliable and economically survivable in production.
Inference Optimization Foundations
Latency, throughput, memory, cost, profiling, bottleneck attribution, batching trade-offs and hardware-aware thinking.
Runtime Optimization Stack
ONNX Runtime, TensorRT, Triton, torch.compile, quantization and when each layer of the stack is worth the complexity.
High-Load Serving Patterns
Async APIs, queues, streaming, cancellation, continuous batching, GPU scheduling, autoscaling and graceful degradation.
Latency, Cost and Observability
p50/p95/p99, queue depth, GPU utilization, cost per request, model regressions and product-facing reliability metrics.
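As a small illustration of why the p50/p95/p99 metrics above are tracked instead of the mean alone, the sketch below computes nearest-rank percentiles over a list of request latencies. The helper name and the synthetic latency values are hypothetical, and production systems usually rely on histogram-based estimates rather than sorting raw samples.

```python
import math


def percentile(samples: list[float], p: int) -> float:
    """Nearest-rank percentile: the smallest sample such that at least
    p percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < p <= 100:
        raise ValueError("p must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(p * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


if __name__ == "__main__":
    # Synthetic data: 98 fast requests and 2 slow outliers (milliseconds).
    latencies_ms = [100.0] * 98 + [2000.0] * 2
    mean = sum(latencies_ms) / len(latencies_ms)
    print(f"mean={mean:.0f} ms")
    for p in (50, 95, 99):
        print(f"p{p}={percentile(latencies_ms, p):.0f} ms")
```

In this synthetic sample the mean and p95 look healthy while p99 exposes the slow tail, which is the kind of gap that product-facing reliability metrics are meant to surface.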
🗄️ Foundation Model Data Pipelines
0/4
Data curation, filtering, deduplication, sharding and streaming loaders for large-scale foundation-model training.
Foundation Model Data Pipelines
Collection, licensing, preprocessing, metadata, captioning, filtering and reproducible dataset versions for large models.
Data Curation, Deduplication and Filtering
Near-duplicate search, quality scoring, unsafe content filtering, caption quality and why data quality can dominate architecture changes.
Streaming DataLoaders and Storage
WebDataset, object storage, tar shards, shuffle quality, DALI/NVDEC, prefetching and avoiding GPU starvation.
Multimodal Data Governance
Video/audio/image/text governance: consent, PII, likeness abuse, synthetic media provenance and safety filters.