Gradient boosting, residuals, and prediction range

Short answer

Each new tree in gradient boosting is fit to the negative gradient of the loss with respect to the current predictions, not to the raw targets. Because the final prediction is a sum of many such updates, a boosted regressor can produce values outside the range of the training targets.

Full explanation

Gradient boosting builds an additive model. It starts from an initial prediction (for MSE, typically the mean of the targets), then repeatedly fits a weak learner to the negative gradient of the loss with respect to the current predictions. For MSE, that negative gradient is proportional to the residual y - y_hat, so the intuition of "fitting residuals" is correct for that loss.
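The MSE case can be written out in one line. This is a sketch that assumes the common convention of a 1/2 factor on the squared error; the symbols L, F and the index i are notational choices made here, not taken from a specific library.

```latex
L\bigl(y_i, F(x_i)\bigr) = \tfrac{1}{2}\,\bigl(y_i - F(x_i)\bigr)^2
\quad\Longrightarrow\quad
-\frac{\partial L}{\partial F(x_i)} = y_i - F(x_i)
```

Without the 1/2 factor the negative gradient becomes 2 (y_i - F(x_i)), which is still proportional to the residual.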

The leaf values in later trees are not averages of the original y values. They are fitted updates: means of the negative gradients (pseudo-residuals) in the leaf, or Newton-style leaf estimates, depending on the implementation and objective. The final prediction is the initial value plus the learning-rate-scaled contributions of all trees.
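The additive structure can be sketched in a few lines of plain Python. This is a toy illustration under simplifying assumptions, not how any particular library implements boosting: the data, the helper names `fit_stump` and `stump_predict`, and the fixed split thresholds are all made up for the example, and the loss is MSE, so the negative gradient is the plain residual.

```python
# Toy 1D data (illustrative only).
xs = [1.0, 2.0, 3.0, 4.0]
ys = [1.0, 2.0, 3.0, 6.0]
learning_rate = 0.5


def fit_stump(xs, targets, threshold):
    """One-split tree: each leaf stores the mean of `targets` that fall in it."""
    left = [t for x, t in zip(xs, targets) if x <= threshold]
    right = [t for x, t in zip(xs, targets) if x > threshold]
    return threshold, sum(left) / len(left), sum(right) / len(right)


def stump_predict(stump, x):
    threshold, left_value, right_value = stump
    return left_value if x <= threshold else right_value


# Initial prediction: the mean of the targets (the constant that minimizes MSE).
init = sum(ys) / len(ys)
preds = [init] * len(xs)
stumps = []

# Thresholds are fixed by hand to keep the sketch short (an assumption);
# a real tree learner would search for the best split.
for threshold in [2.5, 3.5, 1.5]:
    # For MSE the negative gradient is the residual y - y_hat.
    residuals = [y - p for y, p in zip(ys, preds)]
    stump = fit_stump(xs, residuals, threshold)  # fit to residuals, not to ys
    stumps.append(stump)
    preds = [p + learning_rate * stump_predict(stump, x)
             for x, p in zip(xs, preds)]


def predict(x):
    """Initial value plus learning-rate-scaled contributions from all stumps."""
    return init + learning_rate * sum(stump_predict(s, x) for s in stumps)


print(stumps[0])  # (2.5, -1.5, 1.5): leaf values are residual means
```

The first stump stores -1.5 and 1.5 in its leaves, whereas the averages of the raw targets on the same split would be 1.5 and 4.5. Dropping `learning_rate` from `predict` would no longer reproduce the predictions built up during training.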

A random forest regressor averages target values within each leaf and then averages across trees, so with standard settings every prediction is a weighted average of training targets and stays within their range. Gradient boosting sums updates instead; it can overshoot and predict outside the observed target range, especially with many trees, a high learning rate, or objectives that allow such updates.
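A small worked case makes the contrast concrete. The sketch below is a hand-rolled toy, not a library implementation: the dataset (target equal to x1 AND x2), the helpers `fit_stump` and `best_stump`, and the "forest" built from one stump per feature without bootstrapping are all assumptions chosen to keep the arithmetic exact.

```python
# Four training points with two binary features; the target is x1 AND x2.
X = [(0, 0), (0, 1), (1, 0), (1, 1)]
y = [0.0, 0.0, 0.0, 1.0]


def mean(values):
    return sum(values) / len(values)


def fit_stump(X, targets, feature):
    """Split on a binary feature; each leaf stores the mean of `targets` in it.
    Returns (sse, (feature, left_value, right_value))."""
    left = [t for row, t in zip(X, targets) if row[feature] == 0]
    right = [t for row, t in zip(X, targets) if row[feature] == 1]
    left_value, right_value = mean(left), mean(right)
    sse = (sum((t - left_value) ** 2 for t in left)
           + sum((t - right_value) ** 2 for t in right))
    return sse, (feature, left_value, right_value)


def best_stump(X, targets):
    """Greedy choice: the feature with the lowest squared error (first wins ties)."""
    fits = [fit_stump(X, targets, f) for f in range(len(X[0]))]
    return min(fits, key=lambda fit: fit[0])[1]


def stump_predict(stump, row):
    feature, left_value, right_value = stump
    return left_value if row[feature] == 0 else right_value


# Forest-like baseline: stumps fit on the raw targets, then averaged.
forest = [fit_stump(X, y, f)[1] for f in range(2)]
forest_preds = [mean([stump_predict(s, row) for s in forest]) for row in X]

# Boosting with MSE: each stump is fit to the current residuals and added.
learning_rate = 1.0
boost_preds = [mean(y)] * len(X)
for _ in range(2):
    residuals = [t - p for t, p in zip(y, boost_preds)]
    stump = best_stump(X, residuals)
    boost_preds = [p + learning_rate * stump_predict(stump, row)
                   for p, row in zip(boost_preds, X)]

print(forest_preds)  # [0.0, 0.25, 0.25, 0.5]     all inside [0, 1]
print(boost_preds)   # [-0.25, 0.25, 0.25, 0.75]  first value is below min(y)
```

After only two stumps the boosted model predicts -0.25 for the point (0, 0), below the smallest training target of 0, while the averaged stumps cannot leave the interval [0, 1].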

Theory

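The word "gradient" matters: boosting performs gradient descent on the loss in function space, with each tree approximating one descent step, whereas bagging averages trees that are each fit to the targets themselves.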
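One round of this function-space descent can be written generically as follows. This is a textbook-style sketch; the notation (r for pseudo-residuals, h for the weak learner, \nu for the learning rate) is chosen here for illustration, and real implementations differ in details such as line search or Newton-style leaf values.

```latex
r_{i,m} = -\left[\frac{\partial L\bigl(y_i, F(x_i)\bigr)}{\partial F(x_i)}\right]_{F = F_{m-1}},
\qquad
h_m = \arg\min_{h} \sum_{i} \bigl(r_{i,m} - h(x_i)\bigr)^2,
\qquad
F_m(x) = F_{m-1}(x) + \nu\, h_m(x)
```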

Common mistakes

  • Saying that each boosting leaf stores only averaged y values.
  • Explaining gradient boosting exactly like a random forest.
  • Forgetting the learning rate in the additive prediction.
  • Assuming tree-based regressors can never predict outside the training target range.

How to answer in an interview

  • For MSE, derive the residual as the negative gradient.
  • Contrast boosting with random forest leaf averaging.