Lesson 6: Logistic Regression
Thus far, our focus has been on describing interactions or associations between two or three categorical variables, mostly via single summary statistics and significance testing. From this lesson on, we will focus on modeling. Models can handle more complicated situations and analyze the simultaneous effects of multiple variables, including mixtures of categorical and continuous variables. In Lesson 6 and Lesson 7, we study binary logistic regression, which we will see is an example of a generalized linear model.
Binary logistic regression is a special type of regression in which a binary response variable is related to a set of explanatory variables, which can be discrete and/or continuous. The important point to note is that in linear regression, the expected value of the response variable is modeled based on the combination of values taken by the predictors. In logistic regression, the probability or odds of the response taking a particular value is modeled based on the combination of values taken by the predictors. As in linear regression (and unlike the loglinear models that we will see later), we make an explicit distinction between a response variable and one or more predictor (explanatory) variables. We begin with two-way tables, then progress to three-way tables, where all explanatory variables are categorical. Then we introduce binary logistic regression with continuous predictors as well. In the last part, we focus on model diagnostics and model selection.
Logistic regression is applicable, for example, if:
 we want to model the probabilities of a response variable as a function of some explanatory variables, e.g. "success" of admission as a function of gender.
 we want to perform descriptive discriminant analyses, such as describing the differences between individuals in separate groups as a function of explanatory variables, e.g. students admitted and rejected as a function of gender.
 we want to predict probabilities that individuals fall into the two categories of the binary response as a function of some explanatory variables, e.g. what is the probability that a student is admitted given that she is female.
 we want to classify individuals into two categories based on explanatory variables, e.g. classify new students into "admitted" or "rejected" group depending on their gender.
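The admission example in the list above can be sketched as follows. The counts are made up purely for illustration and are not data from this course; the names counts, predict_prob, and classify and the 0.5 cutoff are likewise assumptions. With a single binary predictor, the maximum-likelihood estimates of a logistic regression have a closed form (the baseline log-odds and the log odds ratio), so no model-fitting library is needed for this small case.

```python
import math

# Hypothetical two-way table of counts (invented for illustration only):
# each entry is (number admitted, number rejected).
counts = {
    "male": (60, 40),
    "female": (45, 55),
}

def log_odds(admitted, rejected):
    """Sample log-odds of admission in one group."""
    return math.log(admitted / rejected)

# Code the predictor as x = 1 for female and x = 0 for male.
beta0 = log_odds(*counts["male"])             # log-odds of admission for males
beta1 = log_odds(*counts["female"]) - beta0   # log odds ratio, female vs. male

def predict_prob(gender):
    """Modeled probability of admission for the given gender."""
    x = 1 if gender == "female" else 0
    eta = beta0 + beta1 * x
    return 1.0 / (1.0 + math.exp(-eta))

def classify(gender, threshold=0.5):
    """Assign a group label by comparing the probability to a cutoff."""
    return "admitted" if predict_prob(gender) >= threshold else "rejected"

for g in counts:
    print(g, round(predict_prob(g), 3), classify(g))
```

Because this model has as many parameters as there are groups, the predicted probabilities simply reproduce the observed proportions in the hypothetical table (0.60 for males and 0.45 for females). The 0.5 cutoff used for classification is one common choice, not the only one.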
Key Concepts
Objectives

Useful Links
 SAS online help on PROC LOGISTIC: http://support.sas.com/onlinedoc/913/getDoc/en/statug.hlp/logistic_index.htm
 SAS online help, various logistic regression examples: http://support.sas.com/onlinedoc/913/getDoc/en/statug.hlp/logistic_sect53.htm
 SAS PROC LOGISTIC and overdispersion: http://support.sas.com/onlinedoc/913/getDoc/en/statug.hlp/logistic_sect35.htm#stat_logistic_logisticod
 SAS logistic regression example with PROC GENMOD: http://support.sas.com/onlinedoc/913/getDoc/en/statug.hlp/genmod_sect45.htm
Readings
 Agresti (2007) Ch. 3, Sec. 3.1-3.2, Ch. 4, Ch. 5
 Agresti (2013) Ch. 4, Sec. 4.1-4.2, Ch. 5, Ch. 6; more advanced: Ch. 4, Sec. 4.4-4.7, and Ch. 7
To complete this lesson you should:
 Read the online course material
 Read suggested textbook pages
 Complete the Discussion questions and exercises placed throughout the online Lesson 6 material