When a single decision tree can beat Random Forest
Short answer
If the target depends almost entirely on one strong feature and the Random Forest uses aggressive feature subsampling, many trees rarely get to split on that feature early and cast noisy votes, while a single unrestricted tree can split on it directly and separate the classes cleanly.
Full walkthrough
A concrete example is a small synthetic dataset in which one feature perfectly separates the classes and all other features are noise. An unrestricted decision tree evaluates every feature at every split, so it can choose the separating feature at the root and achieve near-perfect accuracy.
A Random Forest can underperform here because each split considers only a random subset of the features (in random-subspace variants, each whole tree does). When that subset is small relative to the total number of features, many trees do not get the decisive feature near the root. Those trees split on noise first and add bad votes. The ensemble usually reduces variance, but if its randomization systematically hides the only useful signal, it can increase bias or dilute a clean rule.
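The mechanism above can be sketched with a toy simulation in plain Python. This is a deliberately simplified model and not scikit-learn's `RandomForestClassifier`: every "tree" is a depth-1 stump, each stump is restricted to a random subset of `m` features for the whole tree (random-subspace style, rather than per-split sampling), and there is no bootstrap. All names (`make_data`, `fit_stump`, `run`) and the parameter values are assumptions chosen for illustration.

```python
import random

def make_data(n, p, rng):
    """Feature 0 equals the label; features 1..p-1 are random noise bits."""
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)
        X.append([label] + [rng.randint(0, 1) for _ in range(p - 1)])
        y.append(label)
    return X, y

def fit_stump(X, y, features):
    """Return (feature, flip) with the best training accuracy among `features`."""
    best = None
    for f in features:
        agree = sum(1 for row, label in zip(X, y) if row[f] == label)
        # A feature can also be used inverted (predict 1 - x[f]).
        hits, flip = max((agree, False), (len(y) - agree, True))
        if best is None or hits > best[0]:
            best = (hits, f, flip)
    return best[1], best[2]

def predict_stump(stump, row):
    f, flip = stump
    return 1 - row[f] if flip else row[f]

def accuracy(predict, X, y):
    return sum(predict(row) == label for row, label in zip(X, y)) / len(y)

def run(p=50, m=2, n_trees=25, n_train=40, n_test=1000, seed=0):
    rng = random.Random(seed)
    X_train, y_train = make_data(n_train, p, rng)
    X_test, y_test = make_data(n_test, p, rng)

    # Single unrestricted "tree": it may inspect every feature.
    tree = fit_stump(X_train, y_train, range(p))

    # Toy "forest": each stump only sees m randomly chosen features.
    forest = [fit_stump(X_train, y_train, rng.sample(range(p), m))
              for _ in range(n_trees)]

    def forest_predict(row):
        votes = sum(predict_stump(s, row) for s in forest)
        return 1 if 2 * votes > n_trees else 0  # majority vote, n_trees is odd

    tree_acc = accuracy(lambda r: predict_stump(tree, r), X_test, y_test)
    forest_acc = accuracy(forest_predict, X_test, y_test)
    informed = sum(1 for f, _ in forest if f == 0)  # stumps that found the signal
    return tree_acc, forest_acc, informed

if __name__ == "__main__":
    tree_acc, forest_acc, informed = run()
    print(f"single tree: {tree_acc:.3f}")
    print(f"toy forest:  {forest_acc:.3f} ({informed} informed stumps)")
```

Setting `m` equal to `p` removes the feature subsampling, so every stump finds feature 0 and the toy forest matches the single tree. That contrast is the point of the sketch: the gap comes from the subsampling, not from ensembling as such.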
Other practical cases include very small datasets, strict interpretability requirements, temporal leakage where a forest overfits many unstable proxy features, and monotonic or business-rule settings where one simple rule is closer to how the model behaves in deployment.
Theory
Ensembles help when diverse weak learners each carry some useful signal; they can hurt when the randomization removes the only signal.
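A rough way to quantify "removes the only signal" is to ask how often the decisive feature is even a candidate. The derivation below is a sketch under stated assumptions: the symbols $p$, $m$, $T$, $K$ and $N$ are introduced here for illustration, and the voting model (informed trees are always right, all other trees vote as independent fair coins) is a simplification rather than a description of any particular library.

```latex
% p: total number of features
% m: number of candidate features sampled (without replacement) for a split
\[
  P(\text{decisive feature is a candidate})
    = \frac{\binom{p-1}{m-1}}{\binom{p}{m}}
    = \frac{m}{p}
\]
% Simplified voting model (assumption): K of T trees are informed and always
% correct; the remaining T - K trees vote as independent fair coins. T is odd.
\[
  P(\text{majority vote is correct})
    = P\!\left(K + N > \tfrac{T}{2}\right),
  \qquad
  N \sim \mathrm{Binomial}\!\left(T - K, \tfrac{1}{2}\right)
\]
```

For example, with $p = 100$ and $m = 10$ the decisive feature is a candidate at a given split only 10% of the time. In the voting model with $T = 25$ and $K = 1$, the vote is correct when $N \ge 12$ out of 24, which has probability about 0.58: far below the accuracy of a single tree that splits on the decisive feature directly.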
Common mistakes
- Claiming that "Random Forest is always better".
- Giving a vague "small data" answer without explaining the mechanism.
- Ignoring feature subsampling and noisy votes.
How to answer in an interview
- Use the one-perfect-feature example.
- Tie the answer to bias, variance and feature subsampling.