In this post I show how to extract discriminant functions from a Linear Discriminant Analysis in Displayr. Such functions are often used in Excel (or elsewhere) to make new predictions based on the LDA. I then show how a simple calculation applies the discriminant functions to make those predictions. This post follows on from my earlier description of how to perform Linear Discriminant Analysis in Displayr.
Recap of performing Linear Discriminant Analysis (LDA)
To set up the Linear Discriminant Analysis:
- Import the example data from this URL: "http://wiki.q-researchsoftware.com/images/c/ce/Glass.csv"
- Add the LDA model from the Insert > More > Machine Learning menu
- Select the variables, then press Calculate
Once you've set up your LDA, you can go ahead and generate the discriminant functions. To do so, start by selecting the LDA output, and then go to Insert > Machine Learning > Diagnostic > Discriminant Functions.
This table tells us that the score of an observation for category 1 is -2115766 + 1725406 * "Refractive Index" + 14604 * Na + 13401 * Mg + 17215 * Al + 17122 * Si + 14500 * K + 11306 * Ca + 12896 * Ba + 7517 * Fe.
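As a rough illustration of how one such function is evaluated, the sketch below computes the category 1 score for a single observation. The coefficients are the rounded values quoted above, so the result is only approximate. The observation values are hypothetical, chosen just to show the arithmetic, and the variable names `intercept`, `coefs`, `obs` and `score.category.1` are mine.

```r
# Coefficients for category 1, copied from the text above (rounded as displayed).
intercept = -2115766
coefs = c("Refractive Index" = 1725406, Na = 14604, Mg = 13401, Al = 17215,
          Si = 17122, K = 14500, Ca = 11306, Ba = 12896, Fe = 7517)

# Hypothetical observation: illustrative values only, not taken from the post.
obs = c("Refractive Index" = 1.52, Na = 13.6, Mg = 4.5, Al = 1.1,
        Si = 71.8, K = 0.06, Ca = 8.75, Ba = 0, Fe = 0)

# Score = intercept + sum of (coefficient * value), matching variables by name.
score.category.1 = intercept + sum(coefs * obs[names(coefs)])
score.category.1
```

Repeating this for each category and picking the largest score gives the predicted category for the observation.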
Evaluating all 6 functions by hand for every data point would be rather tedious, so the table can be exported to Excel via Export > Excel or used by another R calculation. In Excel, a matrix of data can be multiplied by the discriminant functions matrix to calculate scores. I am going to perform the same calculation in R to make predictions for the original data.
Manual Calculation of Predictions in R
First I create a table of the data with Insert > More > Tables > Raw Data. I select the 9 predictor variables in the same order as the table above. Then, in Insert > R Output, I type the following few lines of code to make the predictions.
    raw.data = cbind(rep(1, nrow(raw.data)), raw.data)
    raw.data = as.matrix(raw.data) # convert from data.frame to matrix
    scores = raw.data %*% lda.discriminant.functions
    predictions = colnames(scores)[apply(scores, 1, which.max)]
The first line prepends a column of ones to the data; in the matrix multiplication these ones are multiplied by the intercepts. The second line converts the data frame to a matrix. The third line computes the scores for each case in the data for each category of the outcome variable. The final line chooses the category with the highest score from each row. The first few predictions are shown below.
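Separately from the Glass data, the same four steps can be tried on a small self-contained example. This is a minimal sketch with made-up numbers: `toy.data` and `toy.functions` are hypothetical stand-ins for the raw data table and the discriminant functions table, and I assume, as in the code above, that the first row of the functions matrix holds the intercepts and that each column corresponds to one category.

```r
# Toy data: 3 cases, 2 predictors (hypothetical values).
toy.data = data.frame(x1 = c(1, 4, 2), x2 = c(3, 1, 6))

# Toy discriminant functions (hypothetical values). matrix() fills by column,
# so each group of three numbers below is one category's function:
# intercept, coefficient of x1, coefficient of x2.
toy.functions = matrix(c( 0, 1, 0,   # category A
                         -2, 2, 0,   # category B
                         -3, 0, 1),  # category C
                       nrow = 3,
                       dimnames = list(c("(Intercept)", "x1", "x2"),
                                       c("A", "B", "C")))

# Same steps as above: prepend ones, convert to matrix, multiply, take the row-wise maximum.
toy.matrix = cbind(rep(1, nrow(toy.data)), toy.data)
toy.matrix = as.matrix(toy.matrix)
scores = toy.matrix %*% toy.functions
predictions = colnames(scores)[apply(scores, 1, which.max)]
predictions  # "A" "B" "C"
```

For example, the second case scores 0 + 1 * 4 = 4 for A, -2 + 2 * 4 = 6 for B and -3 + 1 * 1 = -2 for C, so it is assigned to B. Note that `which.max` returns the first maximum, so an exact tie goes to the earlier column.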
Checking Manual Predictions
Unless you specifically want to work with the discriminant functions, you do not need them to make predictions for the training data. The predictions can be extracted from the LDA model directly.
To do this, make a copy of the LDA model by clicking on the original and selecting Home > Copy and then Paste. Move this copy to a new page for clarity, then change the Output to Means. The predictions can then be added to the data tree with Insert > More > Machine Learning > Save Variable(s) > Predicted Values. You can hover over the new variable to see that the first few cases are the same as the table above. More thoroughly, you could compare the vectors with code.
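One way such a comparison might look is sketched below. The two vectors are hypothetical stand-ins: in practice `manual.predictions` would be the predictions computed from the discriminant functions and `model.predictions` would be the saved Predicted Values variable, whose actual name in your document may differ.

```r
# Hypothetical stand-ins for the two sets of predictions.
manual.predictions = c("1", "2", "2", "7")
model.predictions = factor(c("1", "2", "2", "7"))

# Convert both to character so a factor and a character vector are compared by label.
predictions.match = all(as.character(manual.predictions) == as.character(model.predictions))
predictions.match  # TRUE

# Cross-tabulate the two vectors; any disagreement would appear off the diagonal.
comparison = table(manual = manual.predictions, model = model.predictions)
comparison
```

Converting to character first matters because comparing a factor's underlying integer codes with category labels would be misleading.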
My work is saved in this Displayr document. You can replicate the steps or use your own data by clicking the link (just sign into Displayr first).