What happens to the bias and variance of a linear regression model as the regularization parameter (lambda) increases?
Bias decreases, Variance increases
Bias decreases, Variance decreases
Bias increases, Variance increases
Bias increases, Variance decreases
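
A minimal sketch of this trade-off, assuming scikit-learn and NumPy are available (the data and alpha values below are purely illustrative): as the Ridge penalty grows, the fitted coefficients shrink toward zero (more bias) and fluctuate less across resampled training sets (less variance).

```python
import numpy as np
from sklearn.linear_model import Ridge   # lambda is called `alpha` in scikit-learn

rng = np.random.default_rng(0)
true_coef = np.array([3.0, -2.0])         # hypothetical true coefficients

for alpha in [0.01, 1.0, 100.0]:
    coefs = []
    for _ in range(200):                  # refit on many resampled training sets
        X = rng.normal(size=(30, 2))
        y = X @ true_coef + rng.normal(scale=1.0, size=30)
        coefs.append(Ridge(alpha=alpha).fit(X, y).coef_)
    coefs = np.array(coefs)
    print(f"alpha={alpha:6.2f}  mean coef={coefs.mean(axis=0).round(2)}  "
          f"spread across fits={coefs.std(axis=0).round(3)}")
```
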
Why might centering or scaling independent variables be insufficient to completely resolve multicollinearity?
It doesn't address the fundamental issue of high correlations between the variables.
It requires a large sample size to be effective.
It can make the model more complex and harder to interpret.
It only works for linear relationships between variables.
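
A quick illustrative check, assuming NumPy (the simulated predictors below are hypothetical): standardizing each variable changes its scale but leaves the pairwise correlation, the root cause of multicollinearity, untouched.

```python
import numpy as np

rng = np.random.default_rng(1)
x1 = rng.normal(size=500)
x2 = 0.95 * x1 + 0.05 * rng.normal(size=500)   # strongly collinear with x1

def standardize(v):
    # Center to mean 0 and scale to unit variance.
    return (v - v.mean()) / v.std()

print("correlation before scaling:", np.corrcoef(x1, x2)[0, 1].round(3))
print("correlation after scaling: ",
      np.corrcoef(standardize(x1), standardize(x2))[0, 1].round(3))
```
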
When using Principal Component Analysis (PCA) as a remedy for multicollinearity, what is the primary aim?
To remove all independent variables from the model
To create new, uncorrelated variables from the original correlated ones
To increase the sample size of the dataset
To introduce non-linearity into the model
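
A brief sketch using scikit-learn's PCA on simulated data (all numbers are illustrative): the original predictors are highly correlated, while the principal components are uncorrelated with each other.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(2)
x1 = rng.normal(size=300)
X = np.column_stack([x1,
                     x1 + 0.1 * rng.normal(size=300),   # nearly a copy of x1
                     rng.normal(size=300)])

components = PCA(n_components=3).fit_transform(X)
print("original correlations:\n", np.corrcoef(X, rowvar=False).round(2))
print("component correlations:\n", np.corrcoef(components, rowvar=False).round(2))
```
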
If a linear regression model has an Adjusted R-squared value of 0.85, what does it indicate about the goodness of fit?
The model is overfitting the data.
The model explains 85% of the variation in the dependent variable, accounting for the number of predictors.
The model's predictions will be accurate 85% of the time.
The model explains 15% of the variation in the dependent variable.
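
For reference, a small worked example of the adjustment (the R-squared, n, and p values below are made up): adjusted R-squared rescales R-squared by sample size n and number of predictors p, so a value of 0.85 means roughly 85% of the variation in the response is explained after that correction.

```python
def adjusted_r2(r2, n, p):
    # Adjusted R-squared = 1 - (1 - R^2) * (n - 1) / (n - p - 1)
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)

print(adjusted_r2(r2=0.88, n=100, p=5))   # hypothetical inputs -> about 0.874
```
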
How do GLMs handle heteroscedasticity, a situation where the variance of residuals is not constant across the range of predictor values?
They implicitly account for it by allowing the variance to be a function of the mean.
They require data transformations to stabilize variance before analysis.
They use non-parametric techniques to adjust for heteroscedasticity.
They ignore heteroscedasticity as it doesn't impact GLM estimations.
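
A minimal sketch, assuming statsmodels is installed (the data are simulated): in a Poisson GLM the variance equals the mean, so residual variance that grows with the fitted values is part of the model rather than something to transform away beforehand.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(3)
x = rng.uniform(0, 2, size=200)
mu = np.exp(0.5 + 1.2 * x)        # the mean grows with x ...
y = rng.poisson(mu)               # ... and so does the variance (Var = mean)

X = sm.add_constant(x)
model = sm.GLM(y, X, family=sm.families.Poisson()).fit()
print(model.params)               # estimates should land near [0.5, 1.2]
```
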
What does a high Cook's distance value indicate?
The observation has both high leverage and high influence
The observation has low leverage but high influence
The observation has high leverage but low influence
The observation is not an outlier
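
An illustrative sketch with statsmodels (the planted outlier is hypothetical): a point that sits far from the other predictor values and far from the fitted line gets a large Cook's distance, because dropping it would noticeably change the coefficients.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(4)
x = rng.normal(size=50)
y = 2 * x + rng.normal(scale=0.5, size=50)
x[0], y[0] = 5.0, -10.0           # plant a high-leverage, high-influence point

fit = sm.OLS(y, sm.add_constant(x)).fit()
cooks_d = fit.get_influence().cooks_distance[0]
print("largest Cook's distance at index:", int(np.argmax(cooks_d)))
```
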
What is a primary risk of using high-degree polynomials in Polynomial Regression?
It can lead to overfitting, where the model learns the training data too well but fails to generalize to new data.
It makes the model too simple and reduces its ability to capture complex patterns.
It reduces the computational cost of training the model.
It always improves the model's performance on unseen data.
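
A short sketch of that risk, assuming scikit-learn (the sine data and degrees are illustrative): a high-degree polynomial drives the training error toward zero while the test error grows.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(5)
x_train = np.sort(rng.uniform(-3, 3, 20)).reshape(-1, 1)
x_test = np.sort(rng.uniform(-3, 3, 20)).reshape(-1, 1)
y_train = np.sin(x_train).ravel() + rng.normal(scale=0.2, size=20)
y_test = np.sin(x_test).ravel() + rng.normal(scale=0.2, size=20)

for degree in [3, 15]:
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    model.fit(x_train, y_train)
    print(f"degree={degree:2d}  "
          f"train MSE={mean_squared_error(y_train, model.predict(x_train)):.3f}  "
          f"test MSE={mean_squared_error(y_test, model.predict(x_test)):.3f}")
```
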
Which regularization technique adds a penalty term proportional to the sum of the squared values of the coefficients to the cost function?
Elastic Net
Linear Regression
Lasso Regression
Ridge Regression
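
A side-by-side sketch with scikit-learn (coefficients and penalty strengths are illustrative): Ridge adds lambda times the sum of squared coefficients and only shrinks them, while Lasso adds lambda times the sum of absolute values and can set some coefficients exactly to zero.

```python
import numpy as np
from sklearn.linear_model import Lasso, Ridge

rng = np.random.default_rng(6)
X = rng.normal(size=(100, 5))
y = X @ np.array([3.0, 0.0, 0.0, 1.5, 0.0]) + rng.normal(size=100)

print("Ridge:", Ridge(alpha=1.0).fit(X, y).coef_.round(2))   # shrunk, all nonzero
print("Lasso:", Lasso(alpha=0.5).fit(X, y).coef_.round(2))   # some exactly zero
```
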
Which metric is in the same units as the dependent variable, making it easier to interpret directly?
RMSE
R-squared
MAE
Adjusted R-squared
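
A quick illustration with scikit-learn metrics (the price-like numbers are made up): error metrics such as RMSE and MAE come out in the units of the response, whereas R-squared is a unitless proportion.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

y_true = np.array([100.0, 150.0, 200.0, 250.0])   # e.g. prices in dollars
y_pred = np.array([110.0, 140.0, 190.0, 270.0])

print("RMSE:", np.sqrt(mean_squared_error(y_true, y_pred)))   # in dollars
print("MAE: ", mean_absolute_error(y_true, y_pred))           # in dollars
print("R^2: ", r2_score(y_true, y_pred))                      # unitless
```
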
What is the primary advantage of using Adjusted R-squared over R-squared when evaluating linear regression models?
Adjusted R-squared always increases when new predictors are added.
Adjusted R-squared penalizes the inclusion of irrelevant variables.
Adjusted R-squared is easier to interpret than R-squared.
Adjusted R-squared is less sensitive to outliers compared to R-squared.
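
An illustrative check with statsmodels (the simulated noise predictor is hypothetical): adding an irrelevant variable can only raise R-squared, while adjusted R-squared applies a penalty for the extra predictor.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(7)
x = rng.normal(size=80)
y = 2.0 * x + rng.normal(size=80)
noise = rng.normal(size=80)                   # irrelevant predictor

fit1 = sm.OLS(y, sm.add_constant(x)).fit()
fit2 = sm.OLS(y, sm.add_constant(np.column_stack([x, noise]))).fit()
print(f"1 predictor : R2={fit1.rsquared:.4f}  adj R2={fit1.rsquared_adj:.4f}")
print(f"2 predictors: R2={fit2.rsquared:.4f}  adj R2={fit2.rsquared_adj:.4f}")
```
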