Version: V1.0 | Published: 8 May 2026 | Updated: 7 days ago

Cloud probability statistics per small area in 2024 (Version 1.0)

Dataset
DATA SERVICE

Summary

Description:
This dataset provides the annual mean cloud probability (%) for 2024, aggregated to small area geographies across the United Kingdom: Lower layer Super Output Areas (LSOAs) for England and Wales, Data Zones for Scotland, and Small Areas for Northern Ireland. The source data is derived from Sentinel-2 satellite imagery, processed at 20-metre spatial resolution from the individual scenes acquired over the year. Aggregation used 2021 boundaries: each small area value is the weighted mean of all pixels falling within that area, computed with exact pixel-area weighting together with corrections for spatial and temporal artefacts.
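As an illustration of the aggregation step described above, the following is a minimal sketch of an area-weighted mean, assuming each pixel comes with the fraction of its area that falls inside the small area boundary. The function name and inputs are hypothetical and are not taken from the dataset's processing code; the corrections mentioned above are not modelled here.

```python
def area_weighted_mean(values, coverage_fractions):
    """Mean of pixel values weighted by the fraction of each pixel inside the area.

    values: per-pixel cloud probability in %, or None where there is no estimate.
    coverage_fractions: share of each pixel (0.0 to 1.0) inside the boundary.
    """
    weighted_sum = 0.0
    total_weight = 0.0
    for value, fraction in zip(values, coverage_fractions):
        # Skip pixels with no estimate or no overlap with the area.
        if value is None or fraction <= 0:
            continue
        weighted_sum += value * fraction
        total_weight += fraction
    if total_weight == 0:
        return None  # no usable pixel intersects the area
    return weighted_sum / total_weight


# One full pixel at 40 %, one half-covered pixel at 60 %, one pixel outside.
example = area_weighted_mean([40.0, 60.0, 80.0], [1.0, 0.5, 0.0])
```

Here the half-covered pixel contributes half the weight of the fully covered one, so the result is (40 + 30) / 1.5, and the pixel outside the boundary has no influence.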
Contact Point:

Documentation

Documentation:
The dataset contains the annual cloud probability for each small area (LSOAs in England and Wales, Data Zones in Scotland, Small Areas in Northern Ireland). This data is provided in two distinct formats: a CSV file, which contains the tabular data; and a GPKG file, a geospatial format that combines the tabular data with the boundary geometries.
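A minimal sketch of reading the tabular CSV with the Python standard library is shown here. The column names `area_code` and `cloud_probability` and the sample rows are hypothetical placeholders, since this record does not list the actual column names; check the file header before use.

```python
import csv
import io


def read_cloud_table(handle, code_column, value_column):
    """Return {area code: cloud probability in %} from an open CSV handle."""
    table = {}
    for row in csv.DictReader(handle):
        raw = row[value_column].strip()
        # Keep missing values as None rather than guessing a fill value.
        table[row[code_column]] = float(raw) if raw else None
    return table


# Hypothetical two-row sample standing in for the real file.
sample = io.StringIO(
    "area_code,cloud_probability\n"
    "AREA_A,61.2\n"
    "AREA_B,\n"
)
cloud = read_cloud_table(sample, "area_code", "cloud_probability")
```

For the real file, the same function can be called on a handle opened with `open(path, newline="")`. The GPKG file needs a geospatial library to read the boundary geometries and is not covered by this sketch.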

Coverage

Spatial

Spatial Coverage:
United Kingdom
Geographical Levels:
Small area (LSOA / Data Zone / Small Area)

Temporal

Start Date:
01-01-2024 to 31-12-2024
Frequency:
Annual
Date of Latest Release:
08 May 2026
Date of First Release:
23 April 2026

Provenance

Origin

Purpose:
Cloud probability estimates are derived from Sentinel-2 satellite imagery, which provides global coverage; this product focuses on the United Kingdom. Further details on the methodology are available in the upcoming technical report. Note that the underlying input data has a spatial resolution of 20 metres, and that cloud probability estimates are aggregated to small area boundaries, which vary substantially in geographic area. Users should consult the Sentinel-2 Data Quality Report (https://sentinels.copernicus.eu/documents/247904/685211/Sentinel-2-L1C-Data-Quality-Report-September-2020.pdf) for further information on known quality limitations and uncertainties in the source imagery.
Source:
The underlying methods and source information used to construct the dataset are documented in the upcoming technical report and paper. Cloud probability is derived from Sentinel-2 Level-2A Bottom-of-Atmosphere (BOA) surface reflectance imagery, Collection 1, accessed via the Element84 Earth Search STAC API (https://earth-search.aws.element84.com/v1), covering the period 1 January 2024 to 31 December 2024. Data is processed at 20-metre spatial resolution. A composite of the scene classification layer (SCL) (https://custom-scripts.sentinel-hub.com/custom-scripts/sentinel-2/scene-classification/) from individual scenes acquired over the full calendar year was used to generate annual cloud probability estimates. To reduce spatial artefacts arising from urban features and algorithmic misclassification at building boundaries, as well as artefacts from temporal gaps between different satellite orbits, a combination of a reflective surfaces correction and a quantile mapping of high-acquisition into low-acquisition pixels was applied. The small area aggregation was performed using exact pixel-area weighting to account for partial pixel-boundary intersections. All processing was conducted by the Imago Team.
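To illustrate how a stack of SCL scenes can yield a per-pixel cloud probability, the sketch below computes the percentage of valid observations of one pixel that are classified as cloud. It is an assumption-laden illustration rather than the Imago Team's method: the choice of SCL classes counted as cloud (8, 9 and 10) and the exclusion of no-data observations are assumptions, and the reflective surfaces correction and quantile mapping described above are not included.

```python
# SCL class codes treated as cloud in this sketch (an assumption):
# 8 = cloud medium probability, 9 = cloud high probability, 10 = thin cirrus.
CLOUD_CLASSES = {8, 9, 10}
NO_DATA = 0  # SCL code for "no data"


def cloud_probability(scl_series):
    """Percentage of valid observations of one pixel classified as cloud."""
    # Drop acquisitions where the pixel has no data.
    valid = [code for code in scl_series if code != NO_DATA]
    if not valid:
        return None  # pixel never observed in the period
    cloudy = sum(1 for code in valid if code in CLOUD_CLASSES)
    return 100.0 * cloudy / len(valid)


# Hypothetical SCL codes for one pixel across eight acquisitions:
# seven valid observations, three of them cloud.
series = [4, 9, 8, 0, 5, 10, 4, 4]
probability = cloud_probability(series)
```

Because the denominator is the number of valid observations, pixels seen by more orbits have more samples than others, which is the kind of acquisition imbalance the quantile mapping in the source description is intended to address.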
Collection Situation:
Sentinel-2
Collection Status:
V1.0

Author 1

Name Organisation:
Imago: Data Service for Imagery
Family Name Person:
Shaonlee Patranabis

Access and Governance

Usage

Data Use Requirements:
None

Access

Access Rights:
CC-BY-4.0
Licence:
CC-BY-4.0

Format and Standards

Estimated Dataset Size:
CSV: 727 KB, GPKG: 135 MB
Vocabulary Encoding Scheme:
EPSG:27700 (OSGB36 / British National Grid)