Feature importance in linear models under multicollinearity
Short answer
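Coefficient magnitudes are comparable only after the features are put on a common scale, and only when the features are not strongly correlated with each other. Correlated or linearly dependent features can split a weight between them or flip its sign, so rely on diagnostics, regularization, removal of redundant features, or transformed features, and interpret the resulting coefficients with care.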
Full breakdown
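A coefficient says how much the prediction changes when that feature increases by one unit while all other features are held fixed. The value therefore depends on the unit: the same distance feature gets a coefficient 1000 times larger when it is measured in kilometres instead of metres, although the model is unchanged. Comparing raw coefficient magnitudes across features is for that reason usually invalid. Standardize numerical features before using coefficient magnitude as a rough importance signal.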
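A minimal sketch of the scaling point, using only the standard library. The data, the feature names `rooms` and `area_cm2`, and the helper `fit_two_features` are hypothetical and chosen for illustration; the helper solves the two-feature least-squares problem directly and is not meant as a general solver.

```python
from statistics import mean, pstdev


def fit_two_features(x1, x2, y):
    """Least-squares slopes for y ~ b0 + b1*x1 + b2*x2 (centered normal equations)."""
    m1, m2, my = mean(x1), mean(x2), mean(y)
    c1 = [v - m1 for v in x1]
    c2 = [v - m2 for v in x2]
    cy = [v - my for v in y]
    s11 = sum(a * a for a in c1)
    s22 = sum(b * b for b in c2)
    s12 = sum(a * b for a, b in zip(c1, c2))
    s1y = sum(a * t for a, t in zip(c1, cy))
    s2y = sum(b * t for b, t in zip(c2, cy))
    det = s11 * s22 - s12 * s12
    return (s1y * s22 - s2y * s12) / det, (s2y * s11 - s1y * s12) / det


def standardize(x):
    """Rescale a feature to zero mean and unit (population) standard deviation."""
    m, s = mean(x), pstdev(x)
    return [(v - m) / s for v in x]


# Hypothetical toy data: a small-valued feature and a huge-valued one.
rooms = [1, 2, 3, 4, 1, 2, 3, 4]
area_cm2 = [300000, 300000, 300000, 300000, 600000, 600000, 600000, 600000]
price = [10 * r + 0.0001 * a for r, a in zip(rooms, area_cm2)]

raw = fit_two_features(rooms, area_cm2, price)
scaled = fit_two_features(standardize(rooms), standardize(area_cm2), price)
print("raw coefficients (rooms, area):   ", raw)
print("scaled coefficients (rooms, area):", scaled)
```

On this toy data the raw coefficient for `rooms` is about 10 and the one for `area_cm2` about 0.0001, which suggests that area barely matters. After standardization the ordering reverses (roughly 11.2 versus 15), because the tiny raw coefficient only reflected the unit of measurement.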
Even after scaling, coefficients can mislead under multicollinearity. If features are exactly linearly dependent, infinitely many coefficient combinations produce identical predictions; if they are only strongly correlated, many combinations fit almost equally well, so small changes in the data can move the solution a lot. Weights may become unstable, split across duplicate features, change sign, or depend heavily on regularization.
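The extreme case, an exact duplicate feature, can be sketched in a few lines. The data and the helper names `predict` and `squared_error` are hypothetical; the point is only that very different weight pairs fit the same data equally well.

```python
# Hypothetical toy data: x_copy is an exact duplicate of x (perfect collinearity).
x = [1.0, 2.0, 3.0, 4.0]
x_copy = list(x)
y = [3.0 * v for v in x]  # the underlying signal is y = 3 * x


def predict(w1, w2):
    """Predictions of the model w1 * x + w2 * x_copy."""
    return [w1 * a + w2 * b for a, b in zip(x, x_copy)]


def squared_error(w1, w2):
    """Sum of squared residuals for the given weight pair."""
    return sum((p - t) ** 2 for p, t in zip(predict(w1, w2), y))


# Every pair with w1 + w2 == 3 reproduces y exactly:
# all weight on one copy, an even split, or a large weight offset by a negative one.
candidates = [(3.0, 0.0), (0.0, 3.0), (1.5, 1.5), (10.0, -7.0)]
for w1, w2 in candidates:
    print(f"w1={w1:+.1f}  w2={w2:+.1f}  squared error={squared_error(w1, w2):.1f}")
```

Only the sum of the two weights is determined by the data here. Which individual pair a solver returns depends on the algorithm and on any regularization, not on how important either copy is.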
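Mitigations include checking pairwise correlations and the variance inflation factor (VIF), removing redundant features, applying regularization, grouping related features, or transforming the space with PCA. The two common penalties behave differently: L2 (ridge) shrinks coefficients and tends to spread weight across correlated features, while L1 (lasso) tends to keep one feature from a correlated group and zero out the rest, and which one it keeps can be fairly arbitrary. PCA components are uncorrelated by construction, which removes collinearity, but it also reduces direct interpretability because the coefficients now apply to components rather than to the original business features.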
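A rough sketch of the VIF diagnostic for the special case of exactly two features, where the VIF of either feature reduces to 1 / (1 - r^2) with r the Pearson correlation between them. With more features, each feature would instead be regressed on all the others. The data and the names `pearson` and `vif_two_features` are hypothetical.

```python
from statistics import mean


def pearson(a, b):
    """Pearson correlation coefficient between two equally long sequences."""
    ma, mb = mean(a), mean(b)
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    var_a = sum((u - ma) ** 2 for u in a)
    var_b = sum((v - mb) ** 2 for v in b)
    return cov / (var_a * var_b) ** 0.5


def vif_two_features(a, b):
    """VIF in a two-feature model: 1 / (1 - r^2)."""
    r = pearson(a, b)
    return 1.0 / (1.0 - r * r)


# Hypothetical toy data: area and rooms move together, floor does not.
area_m2 = [30, 45, 50, 62, 75, 90]
rooms = [1, 2, 2, 3, 3, 4]
floor = [3, 1, 5, 2, 4, 1]

print("VIF (area, rooms):", round(vif_two_features(area_m2, rooms), 1))
print("VIF (area, floor):", round(vif_two_features(area_m2, floor), 2))
```

A VIF near 1 indicates little overlap with the other feature. Thresholds such as 5 or 10 are common rules of thumb rather than hard limits; on this toy data the area and rooms pair lands well above 10, while area and floor stay close to 1.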
Theory
Linear models are interpretable only relative to feature scale, feature dependence and the chosen regularization.
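The dependence on regularization can be illustrated with a small ridge sketch on two nearly identical features. It assumes a model without an intercept and solves the two-feature system (X^T X + lam * I) w = X^T y directly; the data and the helper `ridge_two_features` are hypothetical.

```python
def ridge_two_features(x1, x2, y, lam):
    """Solve (X^T X + lam * I) w = X^T y for two features, no intercept."""
    s11 = sum(a * a for a in x1) + lam
    s22 = sum(b * b for b in x2) + lam
    s12 = sum(a * b for a, b in zip(x1, x2))
    s1y = sum(a * t for a, t in zip(x1, y))
    s2y = sum(b * t for b, t in zip(x2, y))
    det = s11 * s22 - s12 * s12
    return (s1y * s22 - s2y * s12) / det, (s2y * s11 - s1y * s12) / det


# Hypothetical toy data: x2 is x1 plus a tiny perturbation, y is 3 * x1 plus small noise.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2 = [1.01, 1.99, 3.01, 3.99, 5.01, 5.99]
noise = [0.1, -0.1, -0.1, 0.1, 0.1, -0.1]
y = [3.0 * a + e for a, e in zip(x1, noise)]

ols = ridge_two_features(x1, x2, y, lam=0.0)    # lam = 0 is ordinary least squares
ridge = ridge_two_features(x1, x2, y, lam=1.0)  # a mild L2 penalty
print("OLS weights:  ", ols)
print("ridge weights:", ridge)
```

On this data the unregularized fit puts a negative weight on `x1` and a weight above 3 on `x2`, even though `y` was generated from `x1`. The mild L2 penalty splits the weight almost evenly between the two features. In both fits the sum of the weights stays close to 3, which is the quantity the data actually pins down.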
Common mistakes
- Comparing coefficients before scaling the features.
- Assuming a large coefficient always means high business importance.
- Forgetting that correlated features make individual coefficients unstable.
- Applying PCA and then still interpreting the coefficients as if they belonged to the original features.
How to answer in an interview
- Mention scaling first.
- Use multicollinearity as the main counterexample.