Part of the training process consists of running tests to evaluate the prediction performance of the classifier.

Before going on, make sure you have read the How do Classifiers Work? reference page.

In the Sandbox/Tree tab, when you select a category node from the Category Tree, you will see the statistics for the selected category.

The stats shown depend on the node (whether it’s the root node, an intermediate node or a leaf node). For an intermediate node, they will look similar to this:

 

There you can see the following metrics:

  • Accuracy
  • Precision
  • Recall
  • Subtree samples
  • Category samples

Performance metrics and samples counter

The circular graphs show the sample count for this category and three performance metrics: Accuracy, Precision & Recall. These last three values are key to evaluating how this category will perform when classifying new input.

Samples

This value shows the sample count for the selected category. As explained above, this also includes all the samples that belong to child categories.

For non-leaf categories, this value should be above 4; that is, each non-leaf category should have more than 4 samples, including the samples of its children (it’s very common for only the leaf categories to have samples). Otherwise, an alternative evaluation method is used when training, and the category statistics won’t be as precise as they would be with more samples.
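The subtree count described above can be sketched as a simple recursive sum. This is only an illustration: the Category class, its field names and the example tree are hypothetical and are not the product's actual data model.

```python
from dataclasses import dataclass, field
from typing import List

RECOMMENDED_MINIMUM = 4  # the text recommends more than 4 samples per non-leaf category


@dataclass
class Category:
    """Hypothetical category node, not the product's actual data model."""
    name: str
    own_samples: int = 0  # samples tagged directly with this category
    children: List["Category"] = field(default_factory=list)


def subtree_samples(node: Category) -> int:
    """Count the node's own samples plus those of all its descendants."""
    return node.own_samples + sum(subtree_samples(child) for child in node.children)


# Made-up tree: only the leaves hold samples, as is common.
sports = Category("Sports", children=[
    Category("Baseball", own_samples=3),
    Category("Boxing", own_samples=2),
])

total = subtree_samples(sports)
above_recommendation = total > RECOMMENDED_MINIMUM
print(total, above_recommendation)
```

In this made-up tree the Sports category has no samples of its own, yet its subtree count is what gets compared against the recommended value.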

More samples don’t always mean a better classifier, but you should definitely try to stay above the recommended value so you get better insight into your classifier’s future performance and can tune it to your needs.

Accuracy

Accuracy is the percentage of samples that were predicted in the correct category. It’s a metric that shows how well a parent category distinguishes between its children. In the previous example, the Sports category has an accuracy of 93% when distinguishing between its 13 children (Athletics, Baseball, Basketball, Boxing, etc.).
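As a rough illustration of that definition, accuracy for a parent category can be computed by comparing, for each test sample, the child category it belongs to with the child category that was predicted. The function name and the label lists below are made up for the example; this is not the tool's internal code.

```python
def accuracy(true_labels, predicted_labels):
    """Fraction of samples whose predicted child category matches the true one."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("both lists must have the same length")
    if not true_labels:
        raise ValueError("at least one sample is required")
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


# Hypothetical test samples under a "Sports" parent category.
true_children = ["Baseball", "Boxing", "Baseball", "Athletics"]
predicted_children = ["Baseball", "Boxing", "Athletics", "Athletics"]

sports_accuracy = accuracy(true_children, predicted_children)
print(f"{sports_accuracy:.0%}")  # 3 of 4 samples are correct
```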

Tips to improve Accuracy for parent category X:

  • Add more training samples to children categories of category X.
  • Retag samples that might be incorrectly tagged into the children categories (see confusion matrix section below).
  • Sometimes sibling categories are too ambiguous; if your use case allows it, you could merge those categories.

Accuracy on its own is not a good metric; you also have to pay attention to precision and recall. For example, a classifier can have very good accuracy and still have categories with bad precision and recall (see the following sections).

If the module is configured to be multilabel, a different metric is used: the Jaccard similarity coefficient.
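For reference, the Jaccard similarity coefficient between two label sets is the size of their intersection divided by the size of their union. The sketch below averages it over samples; that averaging, the handling of two empty sets, and the function names are assumptions made for the example, since this page does not say how the tool aggregates the value.

```python
def jaccard(true_set, predicted_set):
    """Size of the intersection divided by the size of the union."""
    union = true_set | predicted_set
    if not union:
        return 1.0  # assumption: two empty label sets count as a perfect match
    return len(true_set & predicted_set) / len(union)


def mean_jaccard(true_sets, predicted_sets):
    """Average per-sample Jaccard coefficient (assumed aggregation)."""
    if len(true_sets) != len(predicted_sets) or not true_sets:
        raise ValueError("need two non-empty lists of equal length")
    scores = [jaccard(t, p) for t, p in zip(true_sets, predicted_sets)]
    return sum(scores) / len(scores)


# Hypothetical multilabel samples: each sample may carry several categories.
true_sets = [{"Baseball", "Boxing"}, {"Athletics"}]
predicted_sets = [{"Baseball"}, {"Athletics"}]

score = mean_jaccard(true_sets, predicted_sets)
print(score)  # (1/2 + 1/1) / 2
```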

Precision and Recall

The precision for a non-root category is the percentage of the test samples that were classified to this category by its parent and actually belonged to this category. Since the root has no parent this value doesn’t make sense and won’t be displayed.

The recall for a category is the percentage of all the test samples that originally belonged to this category and in the evaluation process were correctly classified to this category by its parent. Since the root has no parent this value doesn’t make sense and won’t be shown.

Precision and Recall are useful metrics for checking how well each child category performs.

If a category X has low precision, that means that samples from sibling categories were predicted as X.

If a category X has low recall, that means that samples from category X were predicted as other sibling categories.
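The two definitions above can be sketched for a single category among its siblings as follows. The function name and the sample labels are hypothetical, and returning None when a value is undefined is a choice made for this example only.

```python
def precision_recall(true_labels, predicted_labels, category):
    """Return (precision, recall) for one category among its siblings.

    A value is None when it is undefined (nothing predicted as the category,
    or no test sample belonging to it).
    """
    pairs = list(zip(true_labels, predicted_labels))
    hits = sum(t == category and p == category for t, p in pairs)
    predicted_as = sum(p == category for _, p in pairs)
    belonging_to = sum(t == category for t, _ in pairs)
    precision = hits / predicted_as if predicted_as else None
    recall = hits / belonging_to if belonging_to else None
    return precision, recall


# Hypothetical test samples for sibling categories X, Y and Z.
true_labels = ["X", "X", "X", "X", "Y", "Z"]
predicted_labels = ["X", "X", "Y", "Z", "Y", "Z"]

precision_x, recall_x = precision_recall(true_labels, predicted_labels, "X")
print(precision_x, recall_x)
```

In this made-up data every sample predicted as X really belongs to X, so precision is high, but half of the X samples were predicted as siblings, so recall is low.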

Usually there’s a trade-off between precision and recall in a particular category: if you try to increase precision, you may do so at the cost of lowering recall, and vice versa.

Tips to improve Precision for category X:

  • Check the samples that were predicted as X but don’t belong to X.
    • If the sample was correctly predicted, move that sample to category X.
    • If the sample was incorrectly predicted as X and belongs to category Y, try to make the classifier learn by adding more samples to category X and category Y (you can check this confusion with the confusion matrix, see below).
    • Check that the keywords associated with categories X and Y are correct (see the Keyword Cloud section to see how to fix that).

Tips to improve Recall for category X:

  • Check the samples that belong to category X but were predicted as another sibling category Y.
    • If the sample was correctly predicted, move that sample to category Y.
    • If the sample was incorrectly predicted as Y and belongs to category X, try to make the classifier learn by adding more samples to category X and category Y (you can check this confusion with the confusion matrix, see below).
    • Check that the keywords associated with categories X and Y are correct (see the Keyword Cloud section to see how to fix that).
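If you have your own list of true and predicted categories for a set of test samples, one way to find which sibling pairs to review first is to count the misclassified (true, predicted) pairs. This is a minimal sketch with made-up data and a hypothetical helper name; it is not how the tool builds its confusion matrix.

```python
from collections import Counter


def confusion_pairs(true_labels, predicted_labels):
    """Count (true, predicted) pairs for misclassified samples only."""
    return Counter(
        (t, p) for t, p in zip(true_labels, predicted_labels) if t != p
    )


# Hypothetical test samples for sibling categories X, Y and Z.
true_labels = ["X", "X", "Y", "X", "Z"]
predicted_labels = ["Y", "Y", "Y", "X", "Z"]

pairs = confusion_pairs(true_labels, predicted_labels)
most_confused = pairs.most_common(1)
print(most_confused)
```

Here the pair (X, Y) comes out on top, which suggests reviewing the samples and keywords of those two categories before the others.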