Tutorial 4
Multiple Linear Regression
Assumed knowledge
Downloads
qfsall.sav
Introduction
General steps
The general recommended strategy for tackling an MLR analysis is:
- Conceptualise model (e.g., draw diagram - Path or Venn)
- Recode predictors (if necessary) as dummy variables
- Check assumptions
- Conduct standard, hierarchical, stepwise, forward, or backward MLR
- Interpret statistical findings and psychological meaning of results,
taking into account:
- R, R², Adjusted R², and the significance of R
- Change (∆) in R² and its significance (i.e., if there is more than one model)
- Standardised (β, beta) and unstandardised (B) regression coefficients
- Zero-order and partial correlations for each IV in each model
- If relevant and useful, interpret Y-intercept and write a regression equation
for predicting Y
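As a rough illustration of the statistics listed above, here is a minimal sketch in Python with NumPy (not the SPSS procedure used in this tutorial). The function name `fit_mlr`, the dictionary keys, and the simulated data are all assumptions made up for the example.

```python
import numpy as np

def fit_mlr(X, y):
    """Least-squares MLR with an intercept. X is (n cases, k predictors)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, k = X.shape
    design = np.column_stack([np.ones(n), X])          # intercept column + predictors
    coefs, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ coefs
    ss_res = float(residuals @ residuals)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    r2 = 1.0 - ss_res / ss_tot
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)
    slopes = coefs[1:]
    # Standardised slope = unstandardised slope * SD(predictor) / SD(outcome)
    std_slopes = slopes * X.std(axis=0, ddof=1) / y.std(ddof=1)
    return {
        "intercept": float(coefs[0]),
        "unstandardised": slopes,
        "standardised": std_slopes,
        "R": float(np.sqrt(r2)),
        "R2": r2,
        "adj_R2": adj_r2,
    }

# Simulated (made-up) data: 100 cases, 2 predictors
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
y = 1.0 + 2.0 * X[:, 0] - 0.5 * X[:, 1] + rng.normal(scale=0.5, size=100)

model = fit_mlr(X, y)
# Regression equation: predicted Y = intercept + sum of (slope * predictor)
predicted = model["intercept"] + X @ model["unstandardised"]
```

The last line is the regression equation for predicting Y, applied to every case at once.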
Checking assumptions
- IVs: Two or more interval/ratio or dichotomous variables
- DV: One interval/ratio variable
- Do you have enough data?
(min. 5 cases per predictor, ideally 20 cases per predictor, with an
overall N of at least 100; this allows sufficient power to detect a
medium ES of R square of .13 (Francis, p. 128))
- Are the variables normally distributed?
(Check histograms of all variables in an analysis;
MLR is reasonably robust to violations of this assumption)
- Are the bivariate relationships linear?
(Check scatterplots and correlations between Y and each of Xs)
- Is there homoscedasticity?
(Check scatterplots between Y and each of Xs and/or
check scatterplot of the residuals (ZRESID) and predicted values
(ZPRED))
- Is there multicollinearity between Xs?
(Check scatterplots and correlations between each of the Xs
and/or check the collinearity statistics in
the Coefficients table. The Variance Inflation Factor (VIF)
should be < 3 and tolerance should be > .3.)
- Are there multivariate outliers?
(Check Mahalanobis' Distance and Cook's D)
- See also Francis 5.1.4 Practical Issues and Assumptions (pp.
126-128)
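The collinearity statistics mentioned above can be sketched by hand: tolerance for a predictor is 1 minus the R² obtained when that predictor is regressed on the other predictors, and VIF is 1 / tolerance. The Python/NumPy sketch below assumes made-up data and a hypothetical helper name, `collinearity_stats`; it is not SPSS output.

```python
import numpy as np

def collinearity_stats(X):
    """Return a list of (tolerance, VIF) pairs, one per predictor column."""
    X = np.asarray(X, dtype=float)
    n, k = X.shape
    stats = []
    for j in range(k):
        target = X[:, j]
        # Regress predictor j on an intercept plus all the other predictors
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        coefs, *_ = np.linalg.lstsq(others, target, rcond=None)
        resid = target - others @ coefs
        r2 = 1.0 - (resid @ resid) / ((target - target.mean()) ** 2).sum()
        tolerance = 1.0 - r2
        stats.append((float(tolerance), float(1.0 / tolerance)))
    return stats

# Simulated (made-up) predictors: x2 is nearly a copy of x1, x3 is unrelated
rng = np.random.default_rng(1)
x1 = rng.normal(size=200)
x2 = x1 + 0.1 * rng.normal(size=200)
x3 = rng.normal(size=200)
results = collinearity_stats(np.column_stack([x1, x2, x3]))

for name, (tol, vif) in zip(["x1", "x2", "x3"], results):
    flag = "check" if vif >= 3 or tol <= 0.3 else "ok"
    print(f"{name}: tolerance={tol:.3f}, VIF={vif:.2f} ({flag})")
```

The flagging rule in the loop simply applies the cut-offs given in the checklist (VIF < 3, tolerance > .3).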
Checking for Multivariate Outliers
- 5.1.4.2 Screening for influential case
To check whether there are influential multivariate outlying cases using
Mahalanobis distance & Cook’s D:
- Linear Regression - Save - Mahalanobis and Cook's D - OK
- SPSS will create new variables called mah_1 and coo_1.
- Check the Residuals Statistics
table in the output for the maximum Mahalanobis and Cook’s distances.
- The maximum Mahalanobis distance should not be greater than the critical chi-square value with degrees of freedom equal to the number of predictors, at a critical alpha of .001.
- Cook’s D should not be greater than 1.
- If outliers are detected, check each case, and consider removing the case from the analysis.
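To see what these two saved variables contain, here is a minimal Python/NumPy sketch, assuming made-up data with one planted extreme case. The function name `outlier_diagnostics` is hypothetical, and the small lookup table holds standard chi-square critical values at alpha = .001 for 1 to 5 predictors.

```python
import numpy as np

# Critical chi-square values at alpha = .001, keyed by df (number of predictors)
CHI2_CRIT_001 = {1: 10.828, 2: 13.816, 3: 16.266, 4: 18.467, 5: 20.515}

def outlier_diagnostics(X, y):
    """Return (Mahalanobis distances, Cook's distances), one value per case."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, k = X.shape
    # Squared distance of each case from the centroid of the predictors
    centred = X - X.mean(axis=0)
    cov_inv = np.linalg.inv(np.atleast_2d(np.cov(X, rowvar=False)))
    mahal = np.einsum("ij,jk,ik->i", centred, cov_inv, centred)
    # Cook's distance combines each case's residual with its leverage
    design = np.column_stack([np.ones(n), X])
    hat = design @ np.linalg.pinv(design)
    leverage = np.diag(hat)
    resid = y - hat @ y
    p = k + 1                                   # parameters incl. intercept
    mse = (resid @ resid) / (n - p)
    cooks = (resid ** 2 / (p * mse)) * leverage / (1.0 - leverage) ** 2
    return mahal, cooks

# Simulated (made-up) data with one planted extreme case at the end
rng = np.random.default_rng(2)
X = rng.normal(size=(50, 2))
y = 1.0 + 2.0 * X[:, 0] - 0.5 * X[:, 1] + rng.normal(scale=0.5, size=50)
X = np.vstack([X, [8.0, 8.0]])
y = np.append(y, -20.0)

mahal, cooks = outlier_diagnostics(X, y)
critical = CHI2_CRIT_001[X.shape[1]]
flagged = np.where((mahal > critical) | (cooks > 1.0))[0]
print("Cases to inspect:", flagged)
```

Flagged cases are candidates for inspection, not automatic deletion.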
Francis exercises
- SCHL8.sav
- 5.1.1 Standard - worked example (pp. 116-120)
- 5.1 Standard - Exercise (p. 120, 256)
- 5.1.2 Hierarchical - worked example (pp. 120-122)
- 5.2 Hierarchical - Exercise (p. 122, 256)
- 5.1.3 Stepwise - worked example (pp. 123-125)
- 5.3 Stepwise - Exercise (p. 125, 257)
- 5.4 MLR - Exercise (p. 128, 257)
Design your Own MLR
- Follow the general steps to conduct a meaningful MLR analysis of your own design
(qfsall.sav - or any other data set).
- Interpret the analysis - explain it to someone next to
you or your tutor.
Design your Own MLR with Dummy Coding
- Conduct a meaningful MLR analysis which requires you to recode one
or more of the variables as dummy variables.
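As a reminder of what dummy coding does, here is a minimal Python sketch: a categorical variable with k levels becomes k - 1 dichotomous (0/1) variables, with the omitted level acting as the reference group. The variable `faculty`, its levels, and the helper `dummy_code` are all made up for illustration.

```python
def dummy_code(values, reference):
    """Return {level: [0/1, ...]} for every level except the reference level."""
    levels = sorted(set(values) - {reference})
    return {level: [1 if v == level else 0 for v in values] for level in levels}

# Hypothetical categorical predictor with three levels
faculty = ["arts", "science", "law", "science", "arts"]
dummies = dummy_code(faculty, reference="arts")

for level, codes in dummies.items():
    print(level, codes)
# law [0, 0, 1, 0, 0]
# science [0, 1, 0, 1, 0]
```

Each dummy variable's regression coefficient is then interpreted relative to the reference level ("arts" here).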
MLR Methods
- Re-do any of your previous MLRs using each of Direct, Forward,
Backward, Stepwise, and Hierarchical methods.
- Why are there, or why aren't there, differences between the results for each
method?
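For the hierarchical method in particular, the key statistic is the change in R² when a block of predictors is added. A minimal Python/NumPy sketch of that calculation follows, using simulated data; the function names `r_squared` and `r2_change` are assumptions for this example. The F for the change is ((R²₂ - R²₁) / m) / ((1 - R²₂) / (n - k - 1)), where m is the number of predictors added and k is the total number of predictors at the second step.

```python
import numpy as np

def r_squared(X, y):
    """R-square for a least-squares regression of y on X (with intercept)."""
    design = np.column_stack([np.ones(len(y)), X])
    coefs, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coefs
    return float(1.0 - (resid @ resid) / ((y - y.mean()) ** 2).sum())

def r2_change(X_step1, X_step2, y):
    """R-square change, its F statistic, and df when X_step2 joins X_step1."""
    X_step1 = np.asarray(X_step1, dtype=float)
    X_step2 = np.asarray(X_step2, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(y)
    k1 = X_step1.shape[1]                     # predictors entered at step 1
    m = X_step2.shape[1]                      # predictors added at step 2
    r2_1 = r_squared(X_step1, y)
    r2_2 = r_squared(np.column_stack([X_step1, X_step2]), y)
    df2 = n - (k1 + m) - 1
    f_change = ((r2_2 - r2_1) / m) / ((1.0 - r2_2) / df2)
    return r2_2 - r2_1, f_change, (m, df2)

# Simulated (made-up) data: y depends on the 1st and 3rd predictors
rng = np.random.default_rng(3)
X = rng.normal(size=(100, 3))
y = 0.5 * X[:, 0] + 0.8 * X[:, 2] + rng.normal(size=100)

delta_r2, f_change, df = r2_change(X[:, [0]], X[:, [1, 2]], y)
print(f"R2 change = {delta_r2:.3f}, F change{df} = {f_change:.2f}")
```

The F change value would then be compared against a critical F with the reported degrees of freedom.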
Partial Correlations & Venn Diagrams
- For any previous MLR analysis, create stand-alone correlations and depict the
relationships in a Venn Diagram.
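One way to connect the correlations to the Venn diagram is to compute them from residuals, as in the minimal Python/NumPy sketch below. The data are simulated and the helper names (`residualise`, `correlation_summary`) are assumptions. The squared semi-partial correlation (called "Part" in SPSS output) corresponds to the area of Y that overlaps uniquely with one predictor.

```python
import numpy as np

def residualise(v, controls):
    """Remove from v whatever can be linearly predicted from the controls."""
    design = np.column_stack([np.ones(len(v)), controls])
    coefs, *_ = np.linalg.lstsq(design, v, rcond=None)
    return v - design @ coefs

def correlation_summary(X, y, j):
    """Zero-order, partial, and semi-partial correlation of predictor j with y."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    others = np.delete(X, j, axis=1)
    x_res = residualise(X[:, j], others)      # predictor j with other Xs removed
    y_res = residualise(y, others)            # y with other Xs removed
    zero_order = np.corrcoef(X[:, j], y)[0, 1]
    partial = np.corrcoef(x_res, y_res)[0, 1]
    semipartial = np.corrcoef(x_res, y)[0, 1]
    return float(zero_order), float(partial), float(semipartial)

# Simulated (made-up) data with two correlated predictors
rng = np.random.default_rng(4)
x1 = rng.normal(size=150)
x2 = 0.6 * x1 + rng.normal(size=150)
y = 0.5 * x1 + 0.4 * x2 + rng.normal(size=150)
X = np.column_stack([x1, x2])

zero_order, partial, semipartial = correlation_summary(X, y, 0)
print(f"x1: r = {zero_order:.2f}, partial = {partial:.2f}, "
      f"semi-partial = {semipartial:.2f}, unique variance = {semipartial ** 2:.3f}")
```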
In action: Life expectancy
- If you're curious about how multiple regression can be used for prediction, then try using a Life Expectancy Calculator (this one takes about 15 minutes):
- What could you do to improve your life expectancy?
- What do you estimate the unstandardised regression coefficients
to be for each of the variables the calculator recommends you could
change in order to increase your life expectancy?
- What do you think an 80% confidence interval would be for
your life expectancy estimate? (This can be estimated via this very short
Life Expectancy Calculator, which takes <1 minute.)
See also