9.1 - Poisson Regression Model

Printer-friendly versionPrinter-friendly version

We first introduce a formal model and then look at two specific examples in SAS and then in R.

In an example using data about crabs we are interested in knowing

  • How does the number of satellites, (male crabs residing near a female crab), for a female horseshoe crab depend on the width of her back?, and
  • What is the rate of satellites per unit width?

In an example using data about credit cards we are interested in knowing

  • What is the expected number of credit cards a person may have, given his/her income?, or
  • What is the sample rate of possession of credit cards?


In Poisson regression Response/outcome variable Y is a count. But we can also have Y/t, the rate (or incidence) as the response variable, where t is an interval representing time, space or some other grouping.

Explanatory Variable(s):

  • Explanatory variables, X = (X1, X2, … Xk), can be continuous or a combination of continuous and categorical variables. Convention is to call such a model “Poisson Regression”.
  • Explanatory variables, X = (X1, X2, … Xk), can be ALL categorical. Then the counts to be modeled are the counts in a contingency table, and the convention is to call such a model log-linear model.
  • If Y/t is the variable of interest then even with all categorical predictors, the regression model will be known as Poisson regression, not a log-linear model.

GLM Model for Counts with its assumptions:

\(g(\mu)=\beta_0+\beta_1 x_1+\beta_2 x_2+\ldots+\beta_k x_k=x_{i}^{T}\beta \)

Random component: Response Y has a Poisson distribution that is $y_i\sim Poisson(\mu_i)$ for $i=1, ..., N$ where the expected count of $y_i$ is \(E(Y)=\mu\).

Systematic component: Any set of X = (X1, X2, … Xk) are explanatory variables. For now let’s focus on a single variable X.


Identity link: \(\mu=\beta_0+\beta_1 x_1\)
Sometimes the identity link function is used in Poisson regression. This model is the same as that used in ordinary regression except that the random component is the Poisson distribution. Issue: can yield μ < 0!

Natural log link: \(log(\mu)=\beta_0+\beta_1 x_1\)
The Poisson regression model for counts is sometimes referred to as a “Poisson loglinear model”. We will focus on this one and a rate model for incidences.

For simplicity, with a single explanatory variable, we write: \(log(\mu)=\alpha+\beta_x\). This is equivalent to: \(\mu=exp(\alpha+\beta_x)=exp(\alpha)exp(\beta_x)\)

Interpretation of Parameter Estimates:

exp(α) = effect on the mean of Y, that is μ, when X = 0

exp(β) = with every unit increase in X, the predictor variable has multiplicative effect of exp(β) on the mean of Y, that is μ

  • If β = 0, then exp(β) = 1, and the expected count, μ = E(y) = exp(α), and Y and X are not related.
  • If β > 0, then exp(β) > 1, and the expected count μ = E(y) is exp(β) times larger than when X = 0
  • If β < 0, then exp(β) < 1, and the expected count μ = E(y) is exp(β) times smaller than when X = 0

GLM Model for Rates:

\(\text{GLM:}g(\mu)=\beta_0+\beta_1 x_1+\beta_2 x_2+\ldots+\beta_k x_k \)

Random component: Response Y has a Poisson distribution, and t is index of the time or space; more specifically the expected value of rate Y/t, is \(E(Y/t)=\mu\) that is \(E(Y)=\mu t\)

Systematic component: Any set of X = (X1, X2, … Xk) can be explanatory variables. For now let’s focus on a single variable X.


Log of rate: log(Y/t)

Poisson loglinear regression model for the expected rate of the occurrence of event is:

\(log(\mu/t)=\alpha+\beta x\)

This can be rearranged to:

\(log(\mu)-log(t)=\alpha+\beta x\)
\(log(\mu)=\alpha+\beta x+log(t)\)

The term  log(t)  is referred to as an offset. It is an adjustment term and a group of observations may have the same offset, or each individual may have a different value of t. log(t) is an observation and it will change the value of estimated counts:

\(\mu=exp(\alpha+\beta x+log(t))=(t) exp(\alpha)exp(\beta_x)\)

This means that mean count is proportional to t.

Note that the interpretation of parameter estimates, α and β will stay the same as for the model of counts; you just need to multiply the expected counts by t.

Parameter Estimation

Similar to the case of Logistic regression, the maximum likelihood estimators (MLEs) for (β0, β1 … etc.) are obtained by finding the values that maximizes log-likelihood.  In general, there are no closed-form solutions, so the ML estimates are obtained by using iterative algorithms such as Newton-Raphson (NR), Iteratively re-weighted least squares (IRWLS), etc.


The usual tools from the basic statistical inference and GLM are valid.

  • Confidence Intervals and Hypothesis tests for parameters
    • Wald statistics and asymptotic standard error (ASE)
    • Likelihood ratio tests
    • Score tests
  • Distribution of probability estimates

Model Fit

  • Overall goodness-of-fit statistics of the model are the same as for any GLM:
    • Pearson chi-square statistic, X2
    • Deviance, G2
    • Likelihood ratio test, and statistic, ΔG2
  • Residual analysis: Pearson, deviance, adjusted residuals, etc...
  • Overdispersion
    • Recall that a Poisson random variable has the same mean and variance, e.g., \(E(Y)=Var(Y)=\mu\)
    • Overdispersion means that observed variance is larger than the assumed variance, i.e., \(Var(Y)=\varphi\mu\) where φ is a scale parameter like we saw in logistic regression.
    • Two typical solutions are:
      • Adjust for overdispersion (like in logistic regression) where we estimate \(\varphi=X^2/(N-p)\), and adjust the standard errors and test statistics.
      • Use negative binomial regression instead (see notes on ANGEL), where the response Y is assumed to follow a Negative Binomial distribution, \(E(Y)=\mu\) and \(Var(Y)=\mu+D\mu^2\). The index D is a called a dispersion parameter. Greater heterogeneity in the Poisson means results in a larger value of D. As D approaches 0, Var(Y) will approach μ , and the negative binomial and Poisson regression will give the same inference. 

In the next couple of pages because the explanations are quite lengthy, we will take a look using the Poisson Regression Model for count data first working with SAS, and then in the next page using R.

In SAS we can use  PROC GENMOD which is a general procedure for fitting any GLM.  Many parts of the input and output will be similar to what we saw with PROC LOGISTIC.  In R we can still use glm().