Evaluating the accuracy of a classification

    The basic idea is to compare the predicted classification (supervised or unsupervised) of each pixel with the actual class as established by ground truth. A good review of methods is given by Congalton (1991).

    Four kinds of accuracy information:

    1. Nature of the errors: what kinds of information are confused?

    2. Frequency of the errors: how often do they occur?

    3. Magnitude of errors: how bad are they? E.g., confusing old-growth with second-growth forest is not as ‘bad’ an error as confusing water with forest.

    4. Source of errors: why did the error occur?

  • The Confusion Matrix

        The analyst selects a sample of pixels and then visits the sites (or vice versa), and builds a confusion matrix (IDRISI module CONFUSE). This is used to determine the nature and frequency of errors.

        columns = ground data (assumed ‘correct’)

        rows = map data (classified by the automatic procedure)

        cells of the matrix = the number of observations for each (ground, map) combination

        diagonal elements = agreement between ground and map; ideal is a matrix with all zero off-diagonals

        errors of omission = incorrect in column / total in column; the complement (correct in column / total in column) is the map producer's accuracy. Measures how well the map maker was able to represent the ground features.

        errors of commission = incorrect in row / total in row; the complement (correct in row / total in row) is the map user's accuracy. Measures how likely the map user is to encounter correct information while using the map.

        Overall map accuracy = total on diagonal / grand total

        A statistical test of the classification accuracy, for the whole map or for individual cells, is possible using the kappa index of agreement. This is like a χ² test except that it accounts for chance agreement; a small numerical sketch of these computations follows.
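
        To make the formulas above concrete, here is a small Python sketch (not part of IDRISI; the class names and counts are invented for illustration) that takes a confusion matrix laid out as above (rows = map data, columns = ground data) and computes producer's accuracy, user's accuracy, overall accuracy, and the kappa index of agreement.

        import numpy as np

        # Hypothetical example: three classes; counts are invented for illustration.
        classes = ["water", "forest", "urban"]

        # Rows = map (classified) data, columns = ground (reference) data,
        # matching the layout described above.
        confusion = np.array([
            [21,  3,  1],   # mapped as water
            [ 2, 30,  4],   # mapped as forest
            [ 1,  2, 16],   # mapped as urban
        ])

        grand_total = confusion.sum()
        row_totals = confusion.sum(axis=1)   # totals per map class
        col_totals = confusion.sum(axis=0)   # totals per ground class
        diagonal = np.diag(confusion)        # agreement between ground and map

        # Producer's accuracy = correct in column / total in column
        # (its complement is the error of omission).
        producers = diagonal / col_totals

        # User's accuracy = correct in row / total in row
        # (its complement is the error of commission).
        users = diagonal / row_totals

        # Overall map accuracy = total on diagonal / grand total.
        overall = diagonal.sum() / grand_total

        # Kappa index of agreement: observed agreement corrected for the
        # agreement expected by chance from the row and column totals.
        p_observed = overall
        p_chance = (row_totals * col_totals).sum() / grand_total**2
        kappa = (p_observed - p_chance) / (1 - p_chance)

        for name, p, u in zip(classes, producers, users):
            print(f"{name:7s}  producer's accuracy {p:.2f}   user's accuracy {u:.2f}")
        print(f"overall accuracy {overall:.2f}   kappa {kappa:.2f}")

        With these invented counts the overall accuracy is about 0.84, while kappa is about 0.75, illustrating how kappa is always lower than overall accuracy once chance agreement is discounted.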

