Version: V1.0 | Published: 8 May 2026 | Updated: 7 days ago
Cloud probability statistics per small area in 2024 (Version 1.0)
Dataset
Summary
Description:
This dataset provides the annual mean cloud probability (%) for 2024, aggregated to small area geographies across the United Kingdom: Lower layer Super Output Areas (LSOAs) for England and Wales, Data Zones for Scotland, and Small Areas for Northern Ireland. The source data is derived from Sentinel-2 satellite imagery, processed at 20-metre spatial resolution from individual scenes covering the year. Values were aggregated to 2021 boundaries as the weighted mean of all pixels falling within each small area, using exact pixel-area weighting together with corrections for spatial and temporal artefacts.
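The pixel-area weighting described above can be illustrated with a minimal sketch. It assumes that, for one small area, the value of every 20 m pixel touching the boundary and the fraction of that pixel lying inside the boundary are already known. The function name, the sample numbers and the handling of missing pixels are illustrative assumptions, not the Imago processing code.

```python
def area_weighted_mean(values, coverage):
    """Mean of pixel values weighted by the share of each pixel inside the area.

    values   -- cloud probability (%) of each pixel touching the area
    coverage -- fraction (0 to 1) of each pixel's footprint inside the boundary
    Pixels with a missing value (None) or zero coverage are skipped (assumption).
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for value, weight in zip(values, coverage):
        if value is None or weight <= 0:
            continue
        weighted_sum += value * weight
        weight_total += weight
    if weight_total == 0:
        return None  # no usable pixel intersects the area
    return weighted_sum / weight_total


# Hypothetical small area touching four pixels, one of them without data
pixel_values = [60.0, 70.0, 80.0, None]
pixel_coverage = [1.0, 0.5, 0.25, 0.9]
result = area_weighted_mean(pixel_values, pixel_coverage)
```

A pixel that is only a quarter inside the boundary contributes a quarter of the weight of a fully contained pixel, so boundary pixels do not dominate the estimate for very small areas.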
Contact Point:
Documentation
Documentation:
The dataset contains the annual cloud probability for each small area (LSOAs in England and Wales, Data Zones in Scotland, Small Areas in Northern Ireland). This data is provided in two distinct formats: a CSV file, which contains the tabular data; and a GPKG file, a geospatial format that combines the tabular data with the boundary geometries.
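As a rough illustration of how the two formats might be read with the Python standard library alone: the CSV is plain tabular text, and a GeoPackage (GPKG) file is an SQLite database whose layers are listed in its `gpkg_contents` table. The column names, area codes, values and layer name below are hypothetical stand-ins, since the actual field names are not stated in this record.

```python
import csv
import io
import sqlite3


def read_table(text_stream, code_field, value_field):
    """Return {area code: value} from a CSV stream with a header row."""
    reader = csv.DictReader(text_stream)
    return {row[code_field]: float(row[value_field]) for row in reader}


def list_gpkg_layers(connection):
    """List (table_name, data_type) pairs from a GeoPackage's gpkg_contents table."""
    cursor = connection.execute(
        "SELECT table_name, data_type FROM gpkg_contents ORDER BY table_name"
    )
    return cursor.fetchall()


# Stand-in for the CSV file; column names and values are made up
sample_csv = io.StringIO("area_code,cloud_probability\nAREA_A,61.2\nAREA_B,58.9\n")
table = read_table(sample_csv, "area_code", "cloud_probability")

# Stand-in for the GPKG file: an in-memory SQLite database holding only the two
# gpkg_contents columns queried here. For the real file, one would instead call
# sqlite3.connect() with the path to the downloaded .gpkg.
gpkg = sqlite3.connect(":memory:")
gpkg.execute("CREATE TABLE gpkg_contents (table_name TEXT, data_type TEXT)")
gpkg.execute("INSERT INTO gpkg_contents VALUES ('small_areas', 'features')")
layers = list_gpkg_layers(gpkg)
gpkg.close()
```

Reading the boundary geometries themselves requires a geospatial library or GIS software. The SQLite query only shows which layers the file contains.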
Coverage
Spatial
Spatial Coverage:
United Kingdom
Geographical Levels:
Small area (LSOA / Data Zone / Small Area)
Temporal
Start Date:
01-01-2024 to 31-12-2024
Frequency:
Annual
Date of Latest Release:
08 May 2026
Date of First Release:
23 April 2026
Provenance
Origin
Purpose:
Cloud probability estimates are derived from Sentinel-2 satellite imagery, which
provides global coverage; this product focuses on the United Kingdom. Further
details on the methodology will be available in the upcoming technical report.
Note that the input data has a spatial resolution of 20 metres, whereas the
cloud probability estimates are aggregated to small area boundaries, which vary
substantially in geographic area. Users should consult the Sentinel-2 Data
Quality Report
(https://sentinels.copernicus.eu/documents/247904/685211/Sentinel-2-L1C-Data-Quality-Report-September-2020.pdf)
for further information on known quality limitations and uncertainties in the
source imagery.
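To give a sense of what the variation in area means at 20-metre resolution, the short calculation below estimates how many pixels contribute to a small area of a given size. The two areas used are hypothetical round numbers chosen for illustration, not values taken from the dataset.

```python
PIXEL_SIZE_M = 20  # spatial resolution of the input data, in metres


def pixels_per_area(area_km2, pixel_size_m=PIXEL_SIZE_M):
    """Approximate number of whole-pixel equivalents covering an area."""
    area_m2 = area_km2 * 1_000_000
    return area_m2 / (pixel_size_m ** 2)  # each 20 m pixel covers 400 square metres


# Hypothetical compact urban area and large rural area
urban_pixels = pixels_per_area(0.04)
rural_pixels = pixels_per_area(40.0)
```

Under these assumed sizes the urban area is averaged over roughly a hundred pixels and the rural one over roughly a hundred thousand, so estimates for the smallest areas rest on far fewer observations.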
Source:
The methods and source information used to construct the dataset will be documented in the upcoming technical report and paper. Cloud probability is derived from Sentinel-2 Level-2A Bottom-of-Atmosphere (BOA) surface reflectance imagery, Collection 1, accessed via the Element84 Earth Search STAC API (https://earth-search.aws.element84.com/v1), covering the period 1 January 2024 to 31 December 2024. Data is processed at 20-metre spatial resolution. A composite of the scene classification layer (SCL) (https://custom-scripts.sentinel-hub.com/custom-scripts/sentinel-2/scene-classification/) from individual scenes acquired over the full calendar year was used to generate the annual cloud probability estimates. To reduce spatial artefacts arising from urban features and algorithmic misclassification at building boundaries, as well as temporal gaps between different satellite orbits, a reflective surfaces correction was combined with a quantile mapping of high-acquisition into low-acquisition pixels. The small area aggregation used exact pixel-area weighting to account for pixels that only partially intersect a boundary. All processing was conducted by the Imago Team.
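One plausible reading of the SCL composite is a per-pixel cloud frequency: the share of valid observations in the year that the scene classification labels as cloud. The sketch below follows that reading. Which SCL classes count as cloud, and which are treated as invalid, are assumptions made here for illustration; the forthcoming technical report is the authority on the actual definition. The reflective surfaces correction and the quantile mapping are not shown.

```python
# SCL class codes of the Sentinel-2 Level-2A scene classification:
# 8 = cloud medium probability, 9 = cloud high probability, 10 = thin cirrus,
# 0 = no data, 1 = saturated or defective.
CLOUD_CLASSES = {8, 9, 10}  # assumption: these classes count as cloud
INVALID_CLASSES = {0, 1}    # assumption: these observations are excluded


def cloud_frequency(scl_series):
    """Percentage of one pixel's valid observations that fall in a cloud class."""
    valid = [code for code in scl_series if code not in INVALID_CLASSES]
    if not valid:
        return None  # pixel never observed with usable data
    cloudy = sum(1 for code in valid if code in CLOUD_CLASSES)
    return 100.0 * cloudy / len(valid)


# Hypothetical SCL values of a single pixel across eight scenes
series = [4, 9, 8, 0, 5, 10, 4, 9]
pct = cloud_frequency(series)
```

In this made-up series, seven of the eight observations are valid and four of those are cloud classes, giving 4/7 of valid observations, or about 57%.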
Collection Situation:
Sentinel-2
Collection Status:
V1.0
Author 1
Name Organisation:
Imago: Data Service for Imagery
Family Name Person:
Shaonlee Patranabis
Access and Governance
Usage
Data Use Requirements:
None
Access
Access Rights:
CC-BY-4.0
Licence:
CC-BY-4.0
Format and Standards
Estimated Dataset Size:
CSV: 727 KB, GPKG: 135 MB
Vocabulary Encoding Scheme:
EPSG:27700, OSGB36/British National Grid