evaluate classification models

In a previous article we described how to construct basic tools such as the “confusion matrix” and Lift/Gain charts to evaluate classification models used in predictive analytics for business use cases. In this article we describe another common evaluation tool: the Receiver Operating Characteristic (ROC) chart and its Area Under the Curve (AUC).

Figure: Confusion matrix basic structure

ROC graphs have long been used in signal detection theory to depict the trade-off between the hit rates and false alarm rates of classifiers. The hit rate is the ratio of the number of correctly classified targets to the total number of actual targets. In other words, it is the ratio of true positives to all actual positives (true positives plus false negatives). Recall that this is the definition of sensitivity (see confusion matrix above).

Table: Data for an ROC curve

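Similarly, the false alarm rate is the ratio of the number of falsely identified targets to the total number of non-targets (or negatives), that is, false positives divided by the sum of false positives and true negatives. This can be expressed as the term (1-specificity); see the table above.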

  • Hit rates = sensitivity
  • False alarm rates = 1-specificity
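As a minimal sketch of these two definitions, the Python snippet below computes both rates from the four cells of a confusion matrix. The function name and the counts are hypothetical, chosen only for illustration; they are not taken from the table in this article.

```python
def hit_and_false_alarm_rates(tp, fn, fp, tn):
    """Return (hit rate, false alarm rate) from confusion matrix counts."""
    # Hit rate (sensitivity): true positives over all actual positives.
    hit_rate = tp / (tp + fn)
    # False alarm rate: false positives over all actual negatives.
    # This equals 1 - specificity, where specificity = tn / (tn + fp).
    false_alarm_rate = fp / (fp + tn)
    return hit_rate, false_alarm_rate


# Hypothetical counts: 50 actual targets and 50 actual non-targets.
hit_rate, false_alarm_rate = hit_and_false_alarm_rates(tp=40, fn=10, fp=5, tn=45)
print(f"hit rate = {hit_rate:.2f}, false alarm rate = {false_alarm_rate:.2f}")
```

With these made-up counts the classifier finds 40 of the 50 targets and raises a false alarm on 5 of the 50 non-targets.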

A ROC chart is constructed with 1-Specificity (the false positive rate) on the X-axis and Sensitivity (the true positive rate) on the Y-axis. Each point on the curve corresponds to one choice of classification threshold, so the ROC curve shows how much the true positive rate rises for a given rise in the false positive rate as the threshold is relaxed.
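One way to see how such a chart is built is to sweep a cut-off over the classifier's scores and record one (false positive rate, true positive rate) pair per cut-off. The sketch below assumes the classifier outputs a numeric score where higher means "more likely a target"; the function name, labels and scores are hypothetical.

```python
def roc_points(labels, scores):
    """Return (false positive rate, true positive rate) pairs for a threshold sweep."""
    positives = sum(labels)
    negatives = len(labels) - positives
    # Start at (0, 0): a threshold so strict that nothing is flagged.
    points = [(0.0, 0.0)]
    # Relax the threshold one distinct score at a time, highest first.
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


# Hypothetical data: 1 marks a target, 0 a non-target.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

points = roc_points(labels, scores)
for fpr, tpr in points:
    print(f"1-specificity = {fpr:.2f}, sensitivity = {tpr:.2f}")
```

Plotting these pairs with 1-specificity on the X-axis and sensitivity on the Y-axis gives the ROC curve for this toy data set.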

Figure: A ROC curve built using data from the table above

A ROC curve that is a straight line at a 45-degree angle indicates that the true positive rate rises only as fast as the false positive rate: the algorithm flags actual targets no more readily than non-targets. In other words, it is no better than random guessing (a coin toss). Any increase over this “random” performance is considered an improvement. A good classifier therefore bows above this 45-degree line.
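To see why random guessing lands on the 45-degree line, consider a hypothetical classifier that ignores its inputs and flags each case as a target with probability $p$. This is an illustrative assumption, not a model from the article. Because the decision does not depend on the true class, both rates equal $p$:

```latex
\[
\text{sensitivity} = P(\text{flagged} \mid \text{target}) = p,
\qquad
1 - \text{specificity} = P(\text{flagged} \mid \text{non-target}) = p .
\]
% Sweeping p from 0 to 1 traces the diagonal y = x, whose area is
\[
\int_0^1 x \, dx = \tfrac{1}{2} .
\]
```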

How can we measure this improvement?

From the charts it can be seen that the area under a random ROC curve is 0.5, so any AUC above this value is an improvement, up to a maximum of 1.0 for a perfect classifier. The AUC is thus a single-number summary of the classifier's performance across all thresholds. An AUC of 0.7, for example, means that a randomly selected case from the group where the target equals 1 has a higher score than a randomly chosen case from the group where the target equals 0 70% of the time.
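The two readings of AUC, as an area and as a ranking probability, can be checked against each other on a small example. The sketch below is illustrative only: the labels, scores and function names are hypothetical, ties between a target and a non-target are counted as half a win (a common convention), and the ROC points are written out by hand for this toy data.

```python
def auc_pairwise(labels, scores):
    """Fraction of (target, non-target) pairs in which the target scores higher."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    # Count a tie as half a win (an assumed convention).
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def auc_trapezoid(points):
    """Area under a curve given as (x, y) points sorted by x, via the trapezoid rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Hypothetical data: 1 marks a target, 0 a non-target.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

# ROC points (1-specificity, sensitivity) for the data above, one per score cut-off.
roc = [(0.0, 0.0), (0.0, 1/3), (0.0, 2/3), (1/3, 2/3), (1/3, 1.0), (2/3, 1.0), (1.0, 1.0)]

auc_rank = auc_pairwise(labels, scores)
auc_area = auc_trapezoid(roc)
print(f"pairwise AUC = {auc_rank:.3f}, area under ROC = {auc_area:.3f}")
```

Both calculations give 8/9 here: in 8 of the 9 possible target/non-target pairs, the target has the higher score.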

RapidMiner uses ROC and AUC as one of its means of evaluating decision tree performance, as described in this article on using decision trees for several business applications.

Originally posted on Tue, May 03, 2011 @ 11:01 AM
