Version: 0.1.0 | Published: 10 Dec 2025 | Updated: 96 days ago
Summary
Description:
This dataset provides the annual mean cloud probability (%) aggregated to small area geographies across the United Kingdom: Lower layer Super Output Areas (LSOAs) for England and Wales, Data Zones for Scotland, and Small Areas for Northern Ireland. The source data is derived from Sentinel-2 satellite imagery, processed at 20-metre spatial resolution from individual scenes acquired over a full calendar year. Values were aggregated to 2021 boundaries by taking the weighted mean of all pixels falling within each small area, using exact pixel-area weighting together with corrections for spatial and temporal artefacts.
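The exact pixel-area weighting described above can be illustrated with a minimal sketch. It assumes each pixel comes with a coverage fraction (the share of the pixel's area that lies inside the small area boundary); the function name, the input lists, and the example values are hypothetical and are not taken from the Imago processing code.

```python
def area_weighted_mean(values, fractions):
    """Mean of pixel values, each weighted by the fraction (0.0 to 1.0)
    of the pixel's area that falls inside the target boundary."""
    if len(values) != len(fractions):
        raise ValueError("values and fractions must have the same length")
    weighted_sum = 0.0
    total_weight = 0.0
    for value, fraction in zip(values, fractions):
        if value is None or fraction <= 0:
            continue  # skip nodata pixels and pixels outside the boundary
        weighted_sum += value * fraction
        total_weight += fraction
    if total_weight == 0:
        return None  # no valid pixel overlaps the boundary
    return weighted_sum / total_weight


# Hypothetical example: three pixels fully inside, one pixel half inside.
pixel_cloud_probability = [40.0, 50.0, 60.0, 80.0]
coverage_fraction = [1.0, 1.0, 1.0, 0.5]
mean_value = area_weighted_mean(pixel_cloud_probability, coverage_fraction)
```

A pixel that only partly intersects the boundary contributes in proportion to the overlapping area, so edge pixels do not count as much as interior ones.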
**IMPORTANT** - This is a beta version of the data product. We have released it as a preview so that users can explore and experiment with it while we perform final quality assurance and additional checks. If you encounter any issues or have questions, please get in touch with us at [imago@liverpool.ac.uk](mailto:imago@liverpool.ac.uk). We will shortly be releasing a more stable version. Please check [https://imago.ac.uk](https://imago.ac.uk) for news and updates.
Contact Point:
Documentation
Documentation:
The dataset contains the annual cloud probability for each small area (LSOAs in England and Wales, Data Zones in Scotland, Small Areas in Northern Ireland). This data is provided in two distinct formats: a CSV file, which contains the tabular data; and a GPKG file, a geospatial format that combines the tabular data with the boundary geometries.
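As a rough illustration of working with the tabular format, the sketch below reads a small stand-in CSV with the Python standard library. The column names `area_code` and `cloud_probability` and the sample rows are assumptions made for the example; the published file may use different field names, so check its header first.

```python
import csv
import io

# Stand-in for the published CSV. Column names and rows are hypothetical.
sample_csv = io.StringIO(
    "area_code,cloud_probability\n"
    "AREA_A,61.5\n"
    "AREA_B,58.0\n"
    "AREA_C,70.25\n"
)


def load_cloud_probability(handle, code_field="area_code",
                           value_field="cloud_probability"):
    """Return a dict mapping each small area code to its annual value (%)."""
    reader = csv.DictReader(handle)
    return {row[code_field]: float(row[value_field]) for row in reader}


cloud_by_area = load_cloud_probability(sample_csv)
# Code of the small area with the highest annual cloud probability.
cloudiest_area = max(cloud_by_area, key=cloud_by_area.get)
```

For the real file, replace `sample_csv` with an opened file handle. The GPKG file can be opened in a similar spirit with a geospatial library, for example `geopandas.read_file`, which returns the attributes together with the boundary geometries.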
Coverage
Spatial
Spatial Coverage:
United Kingdom
Geographical Levels:
Small area (LSOA / Data Zone / Small Area)
Temporal
Start Date:
01-01-2024 to 31-12-2024
Frequency:
annual
Date of Latest Release:
10 December 2025
Date of First Release:
10 December 2025
Provenance
Origin
Purpose:
Cloud probability estimates are derived from Sentinel-2 satellite imagery, which provides global coverage; this product focuses on the United Kingdom. Further details on the methodology are available in the upcoming technical report. It is important to note that the underlying input data has a spatial resolution of 20 metres, and cloud probability estimates are aggregated to small area boundaries, which vary substantially in geographic area. Users should consult the Sentinel-2 Data Quality Report (https://sentinels.copernicus.eu/documents/247904/685211/Sentinel-2-L1C-Data-Quality-Report-September-2020.pdf) for further information on known quality limitations and uncertainties in the source imagery. **IMPORTANT** - This version is released as beta. This means that, although we have applied our quality standards in its production, we are still assessing its overall quality and identifying minor issues that need to be resolved before a more stable version can be released. As soon as we are ready, we will issue a new release. Please check https://imago.ac.uk for news and updates.
Source:
The underlying methods and source information used to construct the dataset are documented in the upcoming technical report. Cloud probability is derived from Sentinel-2 Level-2A Bottom-of-Atmosphere (BOA) surface reflectance imagery, Collection 1, accessed via the Element84 Earth Search STAC API (https://earth-search.aws.element84.com/v1), covering the period 1 January 2024 to 31 December 2024. Data is processed at 20-metre spatial resolution. A composite of individual scenes from the scene classification layer (SCL) (https://custom-scripts.sentinel-hub.com/custom-scripts/sentinel-2/scene-classification/) acquired over the full calendar year was used to generate annual cloud probability estimates. To reduce spatial artefacts arising from urban features and algorithmic misclassification at building boundaries, as well as temporal gaps between different satellite orbits, spatial smoothing was applied using acquisition-weighted neighbourhood convolution (21×21 pixel kernel), followed by adjustment using linear regression residuals. The small area aggregation was performed using exact pixel-area weighting to account for partial pixel-boundary intersections. All processing was conducted by the Imago Team.
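The smoothing step is not specified in detail here, so the following is only one plausible reading of "acquisition-weighted neighbourhood convolution": each pixel is replaced by the mean of its neighbourhood, with every neighbour weighted by its number of valid acquisitions. The function name, the grids, and the edge handling (windows are clipped at the grid border) are assumptions, and the subsequent adjustment using linear regression residuals is not reproduced.

```python
def acquisition_weighted_smooth(values, counts, radius=10):
    """Smooth a 2D grid of cloud probabilities (%).

    values: per-pixel cloud probability, as a list of rows.
    counts: per-pixel number of valid acquisitions, same shape.
    radius: half-width of the square window; 10 gives a 21x21 kernel.
    """
    rows, cols = len(values), len(values[0])
    smoothed = [[None] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            weighted_sum = 0.0
            weight_total = 0.0
            # Window is clipped at the grid edges (an assumption).
            for rr in range(max(0, r - radius), min(rows, r + radius + 1)):
                for cc in range(max(0, c - radius), min(cols, c + radius + 1)):
                    weight = counts[rr][cc]
                    weighted_sum += weight * values[rr][cc]
                    weight_total += weight
            if weight_total > 0:
                smoothed[r][c] = weighted_sum / weight_total
    return smoothed


# Hypothetical 3x3 grid: the centre pixel is an outlier observed only twice.
values = [[50.0, 50.0, 50.0], [50.0, 90.0, 50.0], [50.0, 50.0, 50.0]]
counts = [[10, 10, 10], [10, 2, 10], [10, 10, 10]]
smoothed = acquisition_weighted_smooth(values, counts, radius=1)
```

Under this reading, a pixel with few acquisitions has little influence on its neighbours and is pulled strongly towards the better-observed values around it. The nested loops are written for clarity; a full 20-metre national grid would need a vectorised implementation.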
Collection Status:
0.1.0
Author 1
Name Organisation:
Imago: Data Service for Imagery
Family Name Person:
Shaonlee Patranabis
Access and Governance
Usage
Data Use Requirements:
None
Access
Access Rights:
CC-BY-NC-4.0
Licence:
CC-BY-NC-4.0
Format and Standards
Vocabulary Encoding Scheme:
EPSG:27700, OSGB36/British National Grid