Validating LULC classes in QGIS
The problem statement
Any land-use/land-cover (LULC) classification needs to be validated against ground-truth data to measure its accuracy. A key single-valued statistic for judging the effectiveness of a classification is Cohen’s kappa, which expresses the level of agreement between two annotators on a classification problem, corrected for the agreement expected by chance. Here the two "annotators" are the classifier and the person labelling the ground truth. The metric has also been fairly widely used for unbalanced classification.
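For reference, the standard definition of Cohen’s kappa can be written as below. The symbols are chosen here for illustration and do not come from the original workflow: n_kk are the diagonal entries of the confusion matrix, n_k+ and n_+k are its row and column totals, and N is the number of sample points.

```latex
\kappa = \frac{p_o - p_e}{1 - p_e},
\qquad
p_o = \frac{1}{N}\sum_{k} n_{kk},
\qquad
p_e = \frac{1}{N^{2}}\sum_{k} n_{k+}\, n_{+k}
```

A kappa of 1 means perfect agreement, while a value near 0 means the agreement is no better than chance.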
The objective of this quality assessment was to validate a land cover map produced from June 2020 Sentinel-2 imagery with the k-means classification algorithm, thus providing a statistical measure of the overall class predictions. The validation used an independent set of about 500 sample points generated with a stratified random sampling design, to capture the variance within each class.
After running the sampling tool, the sample points were manually assigned to a ground-truth class. Bing satellite imagery was taken as the ground-truth dataset, as a proxy for field data. Each sample point was labelled by visual inspection of this imagery.
Step 1: Classify Image
- Load the raster image
- Open K-means clustering for grids under SAGA tools. Select the raster image as grid and, in this case, specify 4 classes
- Click Run
At this stage we have the unsupervised k-means clustering output ready (Fig.2).
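To give a feel for what the clustering tool does, here is a minimal sketch of k-means (Lloyd's algorithm) on single-band pixel values. It is not the SAGA implementation: the function name kmeans_1d, the evenly spaced starting centres and the sample pixel values are all assumptions made for illustration, and real Sentinel-2 imagery would have several bands per pixel.

```python
def kmeans_1d(values, k, iterations=20):
    """Cluster scalar pixel values into k classes with Lloyd's algorithm."""
    lo, hi = min(values), max(values)
    # Deterministic start (an assumption): spread the centres evenly over the value range.
    centres = [lo + (hi - lo) * (i + 0.5) / k for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iterations):
        # Assignment step: each value joins its nearest centre.
        labels = [min(range(k), key=lambda c: abs(v - centres[c])) for v in values]
        # Update step: each centre moves to the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            if members:  # keep the old centre if a cluster ends up empty
                centres[c] = sum(members) / len(members)
    return labels, centres


# Hypothetical reflectance-like values for eight pixels.
pixels = [0.05, 0.07, 0.30, 0.32, 0.55, 0.58, 0.85, 0.90]
labels, centres = kmeans_1d(pixels, k=4)
print(labels)
```

The cluster numbers carry no meaning by themselves; which number corresponds to which land cover type has to be decided afterwards by inspection.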
Step 2: Convert to polygon (vector format)
- Select the Polygonize (Raster to Vector) tool under GDAL -> Raster Conversion
- Select the classified image as input. Leave everything else as default. The output will be a Vectorised scratch layer. Note the name of the field (DN here). This will be used later.
- Fix geometries (this step is important to avoid errors in further steps): Vector Geometry -> Fix Geometries
Step 3: Dissolve the layer on DN field
In this step we dissolve the layer based on the DN value. This ensures that each polygon corresponds to a single land cover class, which is needed for stratified random sampling.
Make sure to select DN as the dissolve field.
Step 4: Create stratified random samples
Go to Vector -> Research Tools -> Random Points inside Polygons and set Sampling Strategy = Points Density and Point count or density = 0.001.
Note: The value 0.001 means 1 point per 1/0.001 = 1000 m2 of area, given that the layer units are meters.
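As a rough check on the density setting, the arithmetic can be sketched in a few lines. The class areas below are hypothetical, and the rounding is an assumption about how the point count is derived from the density, so treat the numbers as approximate.

```python
def expected_points(area_m2, density_per_m2):
    """Approximate number of random points for a polygon at a given density."""
    return round(area_m2 * density_per_m2)


# Hypothetical dissolved class areas in square metres, keyed by DN value.
class_area_m2 = {1: 120_000, 2: 80_000, 3: 50_000, 4: 250_000}
density = 0.001  # one point per 1000 m2

points_per_class = {dn: expected_points(a, density) for dn, a in class_area_m2.items()}
total_points = sum(points_per_class.values())
print(points_per_class, total_points)
```

Because the count scales with polygon area, larger classes receive more points, which is what makes the sample stratified by class.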
Step 5: Extract raster values to sample layer
We extract the raster value, which is the land cover class of the classified image, at each sample point. We use the Sample Raster Values function here (Fig.7). The input layer is the random points layer generated earlier and the raster layer is the classified image. The output adds a new column to the sample points layer with the predicted land cover class (Fig.8).
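Conceptually, sampling a raster at a point means converting the point's map coordinates into a row and column of the grid. The sketch below shows this for a north-up raster with square pixels; the function sample_raster, the 3x3 grid and the coordinates are made up for illustration and are not how QGIS implements the tool.

```python
def sample_raster(grid, origin_x, origin_y, pixel_size, x, y):
    """Return the cell value under (x, y) for a north-up raster, or None if outside."""
    col = int((x - origin_x) // pixel_size)  # columns grow eastwards from the left edge
    row = int((origin_y - y) // pixel_size)  # rows grow southwards from the top edge
    if 0 <= row < len(grid) and 0 <= col < len(grid[0]):
        return grid[row][col]
    return None


# Hypothetical 3x3 classified raster, 10 m pixels, top-left corner at (500000, 4000030).
classified = [
    [1, 1, 2],
    [4, 3, 2],
    [4, 4, 2],
]
points = [
    (500005.0, 4000025.0),
    (500015.0, 4000015.0),
    (500029.0, 4000001.0),
    (499990.0, 4000010.0),  # lies west of the raster extent
]
predicted = [sample_raster(classified, 500000.0, 4000030.0, 10.0, x, y) for x, y in points]
print(predicted)
```

Points that fall outside the raster get no value, so it is worth checking the new column for empty entries before validating.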
Step 6: Ground Truth Labelling using Bing maps
At this stage we are ready to validate the image using Bing maps as ground truth. We turn on edit mode and create a new field named Actual class. Then we visually inspect each sample point on the map and record its land cover class. Once all the sample points have been inspected, we can use Cohen's kappa statistic to quantify the validation result. Alternatively, simply calculating the overall accuracy may suffice.
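Once both columns are filled in, the two statistics can be computed outside QGIS, for example after exporting the attribute table. The following is a minimal sketch in plain Python; the function name and the ten example labels are invented for illustration and are not results from this study.

```python
from collections import Counter


def accuracy_and_kappa(predicted, actual):
    """Return (overall accuracy, Cohen's kappa) for two equal-length label lists."""
    n = len(predicted)
    if n == 0 or n != len(actual):
        raise ValueError("label lists must be non-empty and of equal length")
    # Observed agreement: share of points where prediction and ground truth match.
    observed = sum(p == a for p, a in zip(predicted, actual)) / n
    # Chance agreement: product of the marginal class frequencies, summed over classes.
    pred_counts, act_counts = Counter(predicted), Counter(actual)
    classes = set(pred_counts) | set(act_counts)
    expected = sum(pred_counts[c] * act_counts[c] for c in classes) / n ** 2
    if expected == 1.0:
        return observed, float("nan")  # kappa is undefined when only one class occurs
    return observed, (observed - expected) / (1 - expected)


# Hypothetical labels for ten sample points (predicted class vs. Actual class).
predicted = [1, 1, 2, 2, 3, 3, 4, 4, 4, 4]
actual = [1, 1, 2, 3, 3, 3, 4, 4, 4, 2]
accuracy, kappa = accuracy_and_kappa(predicted, actual)
print(f"accuracy={accuracy:.2f}, kappa={kappa:.2f}")  # accuracy=0.80, kappa=0.73
```

Kappa comes out lower than accuracy here because part of the agreement would be expected by chance given how often each class occurs.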
Step 7: Add another field to the attribute table with reclassification
We can use the Field Calculator to generate a descriptive text label for each class in our feature class and display these labels for the predictions.
-- in field calculator to increase verbosity
CASE
  WHEN PREDICTED_1 = 2 THEN 'Urban'
  WHEN PREDICTED_1 = 1 THEN 'Bareland'
  WHEN PREDICTED_1 = 4 THEN 'Forest'
  WHEN PREDICTED_1 = 3 THEN 'Urban'
END
With this we come to the end of the post. The validation accuracy can now be reported for the k-means classification.