Validating LULC classes in QGIS
The problem statement
Any land use/land cover (LULC) classification needs to be validated against ground-truth data to measure its accuracy. A key single-valued statistic for judging the effectiveness of a classification is Cohen’s kappa, which expresses the level of agreement between two annotators on a classification problem while correcting for the agreement expected by chance. It has also been fairly widely used for unbalanced classification problems.
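For reference, the usual definition of Cohen’s kappa can be written as below. The symbols are notation introduced here for illustration, not taken from the post: n_{ij} is the number of sample points predicted as class i and labelled as class j, N is the total number of sample points, and k is the number of classes.

```latex
\kappa = \frac{p_o - p_e}{1 - p_e}, \qquad
p_o = \frac{1}{N} \sum_{i=1}^{k} n_{ii}, \qquad
p_e = \frac{1}{N^2} \sum_{i=1}^{k} n_{i+} \, n_{+i}
```

Here p_o is the observed agreement (the overall accuracy), p_e is the agreement expected by chance from the row totals n_{i+} and column totals n_{+i}, and kappa equals 1 for perfect agreement and 0 for chance-level agreement.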
The objective of this quality assessment was to validate a land cover map produced from June 2020 Sentinel-2 imagery with the k-means classification algorithm, and thus to provide a statistical measure of the overall class predictions. The validation used an independent set of about 500 sample points, generated randomly following a stratified random sampling design to capture the variance within each class.
After running the tool, the sample points were manually assigned to the ground-truth class. The ground-truth dataset was taken to be Bing-satellite imagery as a proxy for field data. Each sample point was labelled by visual inspection on the ground-truth dataset.
Step 1: Classify Image
- Load raster Image
- Open K-means clustering for grids under SAGA tools. Select the raster image as grid; in this case we specify 4 classes.
At this stage we have the unsupervised k-means clustering output ready.
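To give a feel for what the clustering step does, here is a minimal sketch of generic k-means (Lloyd's algorithm) in plain Python. It is not SAGA's implementation, which may initialise and iterate differently. The function name `kmeans`, the quantile-style initialisation, and the toy two-band pixel values are all assumptions made for illustration.

```python
def kmeans(pixels, k, max_iter=100):
    """Cluster pixels (tuples of band values) into k classes.

    Generic Lloyd's algorithm, for illustration only.
    Returns (labels, centroids) with labels numbered 1..k.
    """
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    # Deterministic start: pick k pixels spread across the sorted data.
    ordered = sorted(pixels)
    n = len(ordered)
    centroids = [ordered[(2 * i + 1) * n // (2 * k)] for i in range(k)]

    labels = [0] * len(pixels)
    for _ in range(max_iter):
        # Assignment step: each pixel joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c]))
            for p in pixels
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(pixels, labels) if lab == c]
            if members:
                mean = tuple(sum(col) / len(members) for col in zip(*members))
                new_centroids.append(mean)
            else:
                # Keep the old centroid if a cluster ends up empty.
                new_centroids.append(centroids[c])
        if new_centroids == centroids:
            break  # converged: nothing moved
        centroids = new_centroids

    return [lab + 1 for lab in labels], centroids


# Toy two-band reflectance values (made up), clustered into 4 classes.
toy_pixels = [
    (0.05, 0.10), (0.06, 0.11),
    (0.30, 0.40), (0.31, 0.41),
    (0.60, 0.20), (0.61, 0.21),
    (0.90, 0.80), (0.91, 0.81),
]
toy_labels, toy_centroids = kmeans(toy_pixels, k=4)
```

Note that the class numbers coming out of an unsupervised clustering carry no meaning by themselves; they only become land cover classes once someone inspects them and names them.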
Step 2: Convert to polygon (vector format)
- Open the Polygonize (Raster to Vector) tool.
- Select the classified image as input. Leave everything else as default. The output would be a polygon (vector) layer. Note the name of the field (DN here). This will be used later.
- Fix geometries (this step is important here to avoid any error in further steps)
Step 3: Dissolve the layer on DN field
In this step we dissolve the layer based on the DN value. This ensures that each polygon can be evaluated by land class type, which is needed for stratified random sampling.
Make sure to select DN as the dissolve field.
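At the attribute level, dissolving amounts to grouping the polygons by their DN value so that each class ends up as a single feature. The sketch below shows only that grouping, with areas summed per class; it does not merge any geometry, which is the part QGIS handles. The function name `dissolve_by_field`, the `area_m2` key, and the example records are hypothetical.

```python
from collections import defaultdict


def dissolve_by_field(polygons, field="DN"):
    """Group polygon records by a field and total their areas.

    Attribute-level illustration only: no geometry is merged here.
    """
    groups = defaultdict(lambda: {"count": 0, "area_m2": 0.0})
    for poly in polygons:
        group = groups[poly[field]]
        group["count"] += 1
        group["area_m2"] += poly["area_m2"]
    return dict(groups)


# Made-up polygon records, as they might look after polygonizing.
example_polygons = [
    {"DN": 1, "area_m2": 1200.0},
    {"DN": 2, "area_m2": 500.0},
    {"DN": 1, "area_m2": 800.0},
]
dissolved = dissolve_by_field(example_polygons)
```

After this grouping there is one record per class, which is what lets the sampling step treat each class as one stratum.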
Step 4: Create stratified random samples
Go to Vector -> Research Tools -> Random Points inside Polygon and set Sampling Strategy = Points Density, then enter the density in Point count or density.
Note: the value used here gives 1 point for 1/0.001 m2 of area (that is, 1 point per 1000 m2), given that the units are meters.
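The arithmetic behind a density-based strategy can be sketched as follows: the number of points in a polygon is roughly its area times the density, so larger classes receive proportionally more samples. The function name `expected_points` and the class areas are hypothetical, chosen only so that the total lands near the roughly 500 points mentioned earlier. The density of 0.001 points per m2 is what 1 point per 1/0.001 m2 amounts to, and the exact rounding QGIS applies is an assumption.

```python
def expected_points(area_m2, density_per_m2):
    """Approximate point count for one polygon at a given density."""
    return round(area_m2 * density_per_m2)


# Hypothetical dissolved class areas in square meters, keyed by DN.
class_areas_m2 = {1: 150_000, 2: 120_000, 3: 80_000, 4: 150_000}
density = 0.001  # points per m2, i.e. 1 point per 1000 m2

points_per_class = {
    dn: expected_points(area, density) for dn, area in class_areas_m2.items()
}
total_points = sum(points_per_class.values())
```

If the layer's CRS were in degrees rather than meters, the same density value would mean something entirely different, which is why the note about units matters.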
Step 5: Extract raster values to sample layer
We extract the raster value, which is essentially the land cover class of the classified image. We use the Sample Raster Values function here (Fig. 7). The input layer is the random points we generated earlier and the raster layer is the classified image. The output adds a new column to the sample points layer with the predicted land-cover class.
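Conceptually, sampling a raster at a point means converting the point's map coordinates to a row and column and reading the cell value there. The following is a minimal sketch under simplifying assumptions: a north-up raster with square pixels and no rotation, held as a list of rows. The function name `sample_raster`, the toy grid, and the coordinates are hypothetical; only the PREDICTED_1 field name comes from the post.

```python
def sample_raster(grid, origin_x, origin_y, pixel_size, x, y):
    """Return the cell value under point (x, y), or None if outside.

    grid is a list of rows with row 0 at the top (north) edge;
    (origin_x, origin_y) is the top-left corner of the raster.
    """
    col = int((x - origin_x) // pixel_size)
    row = int((origin_y - y) // pixel_size)
    if 0 <= row < len(grid) and 0 <= col < len(grid[0]):
        return grid[row][col]
    return None


# Toy 3x3 classified raster with 10 m pixels, top-left corner at (0, 30).
classified = [
    [1, 1, 2],
    [3, 4, 2],
    [3, 4, 4],
]
sample_points = [
    {"id": 1, "x": 5, "y": 25},
    {"id": 2, "x": 15, "y": 15},
    {"id": 3, "x": 35, "y": 5},  # falls outside the raster
]
for point in sample_points:
    # Store the predicted class in a new attribute, as the tool does.
    point["PREDICTED_1"] = sample_raster(
        classified, 0, 30, 10, point["x"], point["y"]
    )
```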
Step 6: Ground Truth Labelling using Bing maps
At this stage we are ready to validate the image using Bing maps as ground truth. We turn on edit mode and create a new field named Actual class. Then we visually inspect each sample point on the map and note its land-cover class. Once all the sample points have been inspected, we can use Cohen's kappa statistic to determine the validation result. Alternatively, simply calculating the accuracy would also suffice.
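Once both columns are filled in, the two statistics can be computed outside QGIS from the exported attribute table. The sketch below is one plain-Python way to do it, assuming the predicted and actual classes are available as two equal-length lists. The function name `accuracy_and_kappa` and the ten example labels are made up for illustration and are not results from this validation.

```python
from collections import Counter


def accuracy_and_kappa(predicted, actual):
    """Return (overall accuracy, Cohen's kappa) for two label lists."""
    if not predicted or len(predicted) != len(actual):
        raise ValueError("label lists must be non-empty and equal in length")
    n = len(predicted)

    # Observed agreement: share of points where both labels match.
    p_o = sum(p == a for p, a in zip(predicted, actual)) / n

    # Chance agreement: product of the class frequencies, summed over classes.
    pred_counts = Counter(predicted)
    actual_counts = Counter(actual)
    classes = pred_counts.keys() | actual_counts.keys()
    p_e = sum(pred_counts[c] * actual_counts[c] for c in classes) / (n * n)

    if p_e == 1.0:
        # Only one class present in both lists: kappa is undefined.
        return p_o, float("nan")
    return p_o, (p_o - p_e) / (1 - p_e)


# Made-up labels for ten sample points.
predicted_labels = ["Urban", "Urban", "Urban", "Forest", "Forest",
                    "Forest", "Forest", "Bareland", "Bareland", "Bareland"]
actual_labels = ["Urban", "Urban", "Forest", "Forest", "Forest",
                 "Forest", "Bareland", "Bareland", "Bareland", "Urban"]
accuracy, kappa = accuracy_and_kappa(predicted_labels, actual_labels)
```

Kappa comes out lower than the accuracy here because part of the raw agreement would be expected by chance alone, which is the reason for reporting it alongside accuracy when classes are unbalanced.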
Step 7: Add another field to the attribute table with reclassification
We can use the Field Calculator to generate verbose text for each label in our feature class and display labels for the prediction.
-- in field calculator to increase verbosity
CASE
  WHEN PREDICTED_1 is 2 THEN 'Urban'
  WHEN PREDICTED_1 is 1 THEN 'Bareland'
  WHEN PREDICTED_1 is 4 THEN 'Forest'
  WHEN PREDICTED_1 is 3 THEN 'Urban'
END
With this we come to the end of the post. The validation accuracy can now be reported for the k-means classification.