Lesson 10: Discriminant Analysis


Introduction

Discriminant analysis addresses a classification problem: two or more groups (clusters or populations) are known a priori, and one or more new observations are assigned to one of these known populations based on their measured characteristics. Let us look at three different examples.

Example 1 - Swiss Bank Notes:

We have two populations of bank notes, genuine and counterfeit. Six measurements are taken on each note:

  • Length
  • Right-Hand Width
  • Left-Hand Width
  • Top Margin
  • Bottom Margin
  • Diagonal across the printed area

Take a bank note of unknown origin and determine, from these six measurements alone, whether it is genuine or counterfeit. This is not as impractical as it might sound: a modern equivalent is a scanner that measures the notes automatically and makes the decision.

Example 2 - Pottery Data:

Pottery shards are sampled from four sites: L) Llanedyrn, C) Caldicot, I) Ilse Thornes, and A) Ashley Rails. The concentrations of the following chemical constituents were measured at a laboratory:

  • Al: Aluminum
  • Fe: Iron
  • Mg: Magnesium
  • Ca: Calcium
  • Na: Sodium

An archaeologist encounters a pottery specimen of unknown origin. To determine possible trade routes, the archaeologist may wish to classify its site of origin.

Example 3 - Insect Data:

Data were collected on two species of insects in the genus Chaetocnema: (a) Ch. concinna and (b) Ch. heikertingeri. Three variables were measured on each insect:

  • width of the 1st joint of the tarsus (legs)
  • width of the 2nd joint of the tarsus
  • width of the aedeagus (sex organ)

Our objective is to obtain a classification rule for identifying the insect species based on these three variables. An entomologist can identify these two closely related species, but the differences are so subtle that considerable experience is needed to tell them apart. If a classification rule can be developed, it might offer a more accurate way to differentiate between the two species.

Learning objectives & outcomes

Upon completion of this lesson, you should be able to do the following:

  • Determine whether linear or quadratic discriminant analysis should be applied to a given data set;
  • Carry out both types of discriminant analysis using SAS/Minitab;
  • Apply the linear discriminant function to classify a subject by its measurements;
  • Understand how to assess the efficacy of a discriminant analysis.
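
As a preview of what applying a linear discriminant function involves, the following is a minimal sketch in Python with NumPy rather than the SAS/Minitab used in this lesson. It assumes the standard textbook form of the linear score for group i, namely mean_i' S^-1 x - (1/2) mean_i' S^-1 mean_i + log(prior_i), where S is the pooled covariance matrix, and it assigns an observation to the group with the largest score. The function names (fit_lda, linear_scores, classify) and the simulated two-variable data are illustrative assumptions, not part of the lesson's data sets.

```python
import numpy as np

def fit_lda(X, y):
    """Estimate group means, pooled covariance, and priors from training data."""
    labels = np.unique(y)
    n = X.shape[0]
    means = {g: X[y == g].mean(axis=0) for g in labels}
    # Pooled within-group covariance: summed within-group scatter / (n - number of groups)
    scatter = sum((X[y == g] - means[g]).T @ (X[y == g] - means[g]) for g in labels)
    pooled = scatter / (n - len(labels))
    # Priors estimated by the group proportions in the training sample (an assumption)
    priors = {g: np.mean(y == g) for g in labels}
    return labels, means, pooled, priors

def linear_scores(x, model):
    """Linear discriminant score of observation x for each group."""
    labels, means, pooled, priors = model
    inv = np.linalg.inv(pooled)
    return {
        g: means[g] @ inv @ x - 0.5 * means[g] @ inv @ means[g] + np.log(priors[g])
        for g in labels
    }

def classify(x, model):
    """Assign x to the group with the largest linear score."""
    scores = linear_scores(x, model)
    return max(scores, key=scores.get)

# Simulated (hypothetical) training data: two groups, two measurements each
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(loc=[0.0, 0.0], scale=1.0, size=(50, 2)),  # group 0
    rng.normal(loc=[4.0, 4.0], scale=1.0, size=(50, 2)),  # group 1
])
y = np.array([0] * 50 + [1] * 50)

model = fit_lda(X, y)
new_obs = np.array([3.5, 4.2])
print(classify(new_obs, model), linear_scores(new_obs, model))
```

The sketch uses a single pooled covariance matrix, which corresponds to the linear case; the quadratic case would instead use a separate covariance matrix for each group.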