Linear regression, in one form or another, is one of the most widely used statistical techniques for examining the effects of a group of explanatory (predictor) variables on an outcome variable. We use multiple linear regression when the dependent variable is continuous, that is, a scale or interval-level variable. The independent variables can be at any level of measurement (continuous, scale, interval level, or categorical). However, any categorical independent variables have to be coded in a specific way called dummy coding.
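To make dummy coding concrete, here is a minimal Python sketch. It assumes the usual convention of one 0/1 indicator per category, with one category left out as the reference group. The function name `dummy_code` and the region values are hypothetical and are not taken from the class data file.

```python
def dummy_code(values, reference):
    """Return one 0/1 indicator column per non-reference category."""
    categories = sorted(set(values) - {reference})
    return {
        f"is_{cat}": [1 if v == cat else 0 for v in values]
        for cat in categories
    }


# Hypothetical data: region for five made-up cases.
region = ["South", "West", "South", "Northeast", "Midwest"]

# "Northeast" is the reference group, so it gets no column of its own.
dummies = dummy_code(region, reference="Northeast")
for name, column in dummies.items():
    print(name, column)
```

A categorical variable with k categories yields k - 1 dummy variables. Each slope is then interpreted relative to the reference group.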
One of the most important parts of regression analysis is choosing the variables that should be in the model. Always use theory to develop your regression models. The variables in our models are also influenced by our literature review (i.e., which variables have been used in the past).
Here’s the basic drop-down menu path to run a regression model.
Analyze > Regression > Linear > choose your dependent variable and your independent variables (if there is a reason to enter the independent variables in a particular order or in steps, use the Next button) > Statistics > check R squared change > Continue > Paste or OK.
Class example using the states data (STATES07withSouth.sav). The dependent variable is crime rate (CRS28). Enter the independent variables in two steps: in the first step enter the variable South, and in the second step enter poverty rate (PVS493) and high school dropout rate (EDS130). We often enter independent variables in two or more steps to test what we call mediated models (which are beyond the scope of this class) or to see the added effect of an additional variable. Here the two steps demonstrate what happens to the slopes (b and beta) when multicollinearity is present, that is, when independent variables are correlated with each other (they almost always are).
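For readers who want to see the logic of the two-step entry outside SPSS, the sketch below fits the step 1 and step 2 models by ordinary least squares in plain Python and reports the change in R squared. The numbers are invented for illustration and are not from STATES07withSouth.sav. The helper names `solve` and `ols` are hypothetical, and this is only an approximation of what the SPSS procedure reports.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def ols(y, predictors):
    """Regress y on an intercept plus the predictor columns; return (slopes, R squared)."""
    rows = [[1.0] + [col[i] for col in predictors] for i in range(len(y))]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    b = solve(xtx, xty)  # normal equations: (X'X) b = X'y
    fitted = [sum(bi * xi for bi, xi in zip(b, r)) for r in rows]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return b, 1 - ss_res / ss_tot


# Invented illustration data (NOT the class data set).
south = [1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0]
poverty = [18.0, 16.0, 17.0, 15.0, 12.0, 10.0, 11.0, 13.0]
dropout = [9.0, 7.0, 8.0, 8.5, 5.0, 6.0, 4.5, 6.5]
crime = [520.0, 480.0, 505.0, 450.0, 400.0, 350.0, 380.0, 420.0]

# Step 1: South only. Step 2: South plus poverty and dropout rates.
b_step1, r2_step1 = ols(crime, [south])
b_step2, r2_step2 = ols(crime, [south, poverty, dropout])
r2_change = r2_step2 - r2_step1

print("Step 1 slope for South:", round(b_step1[1], 3))
print("Step 2 slope for South:", round(b_step2[1], 3))
print("R squared change:", round(r2_change, 3))
```

Comparing the slope for South across the two steps shows how a coefficient can shift once correlated predictors enter the model. With a single dummy predictor, the step 1 slope is simply the difference between the two group means.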