8.2. Discriminant Analysis
Discriminant Analysis is used to determine whether a given classification of cases into a number of groups is an appropriate one. It can be used, for instance, to test whether a particular clustering of cases obtained from a Cluster Analysis is plausible. For each case, it reports whether the group assignment is confirmed (true) or not (false), together with the probability of the case belonging to a particular group.
The variables to be analysed are selected from the data matrix by clicking on [Variable]. A factor column containing the group assignments of cases must also be selected by clicking on [Factor]. This is typically a string variable or numeric variable containing integers. The program will not proceed unless a factor column is selected.
Two types of Discriminant Analysis are provided: Multiple (including Linear and Canonical Discriminant Functions, which can be stepwise) and nonparametric Kth neighbour (also known as K-NN) discriminant analyses.
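The idea behind the Kth neighbour approach can be illustrated with a minimal Python sketch. It is not the program's implementation: the function name knn_predict, the Euclidean distance and the simple majority vote are assumptions made for illustration only.

```python
from collections import Counter
import math


def knn_predict(train_rows, train_groups, query, k=3):
    """Assign `query` to the majority group among its k nearest training cases.

    Returns the winning group and the share of the k neighbours that voted for it.
    Euclidean distance is an assumption; other metrics are possible.
    """
    distances = [
        (math.dist(row, query), group)
        for row, group in zip(train_rows, train_groups)
    ]
    # Sort by distance only, so group labels never need to be comparable.
    distances.sort(key=lambda pair: pair[0])
    votes = Counter(group for _, group in distances[:k])
    # On a tie, Counter returns the group that was encountered first.
    group, count = votes.most_common(1)[0]
    return group, count / k


# Hypothetical data: two variables, two well separated groups.
train_rows = [(1.0, 1.1), (1.2, 0.9), (0.8, 1.0), (5.0, 5.2), (5.1, 4.9), (4.8, 5.0)]
train_groups = ["A", "A", "A", "B", "B", "B"]

predicted, share = knn_predict(train_rows, train_groups, (1.1, 1.0), k=3)
print(predicted, share)
```

The share of neighbours voting for the winning group gives a rough analogue of a membership probability. No distributional assumption is made about the variables, which is why the method is called nonparametric.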
Multicollinearity: Whether a solution exists depends on the number of degrees of freedom available in the data. If there are insufficient degrees of freedom, the program reports Warning: Singular matrix and does not proceed further. The most common cause of a singular matrix is that the number of cases (rows) in the raw data is fewer than the number of variables selected for analysis.
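The following Python sketch shows why too few cases lead to a singular matrix. It assumes the matrix in question is the pooled within-group sums of squares and cross-products matrix, which is the usual quantity in linear discriminant analysis; how the program itself detects singularity is not documented here. The helper names pooled_scatter and matrix_rank, the tolerance and the data are all hypothetical.

```python
def pooled_scatter(rows, groups):
    """Pooled within-group sums of squares and cross-products matrix."""
    p = len(rows[0])
    scatter = [[0.0] * p for _ in range(p)]
    for label in sorted(set(groups)):
        members = [r for r, g in zip(rows, groups) if g == label]
        means = [sum(col) / len(members) for col in zip(*members)]
        for r in members:
            dev = [x - m for x, m in zip(r, means)]
            for i in range(p):
                for j in range(p):
                    scatter[i][j] += dev[i] * dev[j]
    return scatter


def matrix_rank(matrix, tol=1e-9):
    """Rank by Gaussian elimination with partial pivoting (tol is an assumption)."""
    m = [row[:] for row in matrix]
    n_rows, n_cols = len(m), len(m[0])
    rank = 0
    for c in range(n_cols):
        if rank == n_rows:
            break
        pivot = max(range(rank, n_rows), key=lambda i: abs(m[i][c]))
        if abs(m[pivot][c]) < tol:
            continue  # no usable pivot in this column
        m[rank], m[pivot] = m[pivot], m[rank]
        for i in range(rank + 1, n_rows):
            factor = m[i][c] / m[rank][c]
            for j in range(c, n_cols):
                m[i][j] -= factor * m[rank][j]
        rank += 1
    return rank


# 3 cases but 4 variables: the 4 x 4 matrix cannot have full rank.
few_rows = [(1, 2, 3, 4), (2, 1, 5, 3), (6, 5, 1, 7)]
few_groups = ["A", "A", "B"]
few_rank = matrix_rank(pooled_scatter(few_rows, few_groups))

# 8 cases and the same 4 variables: enough degrees of freedom.
more_rows = [(1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1),
             (0, 0, 0, 0), (2, 2, 2, 2), (1, 0, 1, 0), (0, 1, 0, 1)]
more_groups = ["A", "A", "A", "A", "B", "B", "B", "B"]
more_rank = matrix_rank(pooled_scatter(more_rows, more_groups))

print(few_rank, more_rank)
```

A rank below the number of variables means the matrix cannot be inverted, so no discriminant functions can be estimated. Adding cases or selecting fewer variables restores full rank.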
Predictions: The estimated discriminant functions can be applied to test cases to predict their group membership. If all independent variables of a case (row) are non-missing and only its factor (group) variable is missing, the case is treated as a test case. Such cases are not included in the estimation of the discriminant functions, but the estimated coefficients are applied to them. The predicted cases are represented in all plots (by an @ character) and in all relevant tables (by an * character).
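The rule for separating estimation cases from test cases can be sketched in Python as follows. This is an illustration only: missing values are represented by None, a nearest group mean rule stands in for the estimated discriminant functions, and cases with a missing independent variable are simply set aside, which is an assumption since their treatment is not described above. All function names are hypothetical.

```python
import math


def split_cases(rows, groups):
    """Return index lists: estimation cases, test cases, and cases set aside."""
    train, test, skipped = [], [], []
    for index, (row, group) in enumerate(zip(rows, groups)):
        if any(value is None for value in row):
            skipped.append(index)   # assumption: incomplete rows are not used
        elif group is None:
            test.append(index)      # complete row, group missing: test case
        else:
            train.append(index)     # complete row, group known: estimation case
    return train, test, skipped


def group_means(rows, groups, train):
    """Mean vector of each group, computed from estimation cases only."""
    members = {}
    for i in train:
        members.setdefault(groups[i], []).append(rows[i])
    return {g: [sum(col) / len(rs) for col in zip(*rs)] for g, rs in members.items()}


def predict(row, means):
    """Stand-in classifier: the group whose mean is nearest to the row."""
    return min(means, key=lambda g: math.dist(row, means[g]))


# Hypothetical data: rows 4 and 5 have no group, row 6 has a missing value.
rows = [(1.0, 1.2), (0.8, 1.0), (5.0, 5.1), (5.2, 4.9),
        (1.1, 0.9), (4.9, 5.0), (None, 2.0)]
groups = ["A", "A", "B", "B", None, None, "A"]

train, test, skipped = split_cases(rows, groups)
means = group_means(rows, groups, train)
predictions = {i: predict(rows[i], means) for i in test}
print(train, test, skipped, predictions)
```

The point of the sketch is the separation of roles: the test cases never influence the group means, yet they receive a predicted group from the rule estimated on the remaining cases.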
It is possible to use markers other than missing data to designate cases as test cases. Suppose, for instance, you wish the program to interpret cases with -1 in their group variable as test cases. To do this, enter the following line in the [Options] section of the Documents\Unistat10\Unistat10.ini file:
DiscrPredict=-1
If the group variable is a string variable, you can use a string value as a test case marker.
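The effect of such a marker can be mimicked with a short Python sketch. It does not reproduce how the program reads its configuration file: the standard configparser module, the helper names read_marker and is_test_case, and the comparison rules (numeric comparison for numeric groups, exact match for strings) are assumptions for illustration.

```python
import configparser

# The [Options] section with the marker line, as described above.
INI_TEXT = """
[Options]
DiscrPredict=-1
"""


def read_marker(ini_text):
    """Return the DiscrPredict value as text, or None if it is not set."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    return parser.get("Options", "DiscrPredict", fallback=None)


def is_test_case(group, marker):
    """True if the group value matches the marker (assumed comparison rules)."""
    if isinstance(group, str):
        return group == marker           # string group: exact match
    try:
        return float(group) == float(marker)   # numeric group: compare as numbers
    except (TypeError, ValueError):
        return False


marker = read_marker(INI_TEXT)

# Numeric group column: cases coded -1 become test cases.
numeric_groups = [1, 2, -1, 1, -1]
numeric_tests = [i for i, g in enumerate(numeric_groups) if is_test_case(g, marker)]

# String group column with a hypothetical string marker.
string_groups = ["low", "high", "unknown"]
string_tests = [i for i, g in enumerate(string_groups) if is_test_case(g, "unknown")]

print(marker, numeric_tests, string_tests)
```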