Version: 1.0.0 | Published: 18 May 2026 | Updated: 1 day ago
SDDS MOSAIC: Digital Life Cohort
Dataset
Summary
Theme:
- Health and Wellbeing
- Digital
Description:
MOSAIC links donated measures of digital activity with baseline and quarterly follow-up self-report measures of factors relating to wellbeing, mental health, and digital behaviour.
Identifier:
Access Tier:
Controlled
Contact Point:
Documentation
Documentation:
MOSAIC is a UK adult digital life cohort created by the Smart Data Donation Service (SDDS). Members of the cohort consent to the linkage of their digital trace data with repeated follow-up self-report measures. At Wave 1, the dataset links donated Google YouTube and Google Play histories to participant survey data, with further donation tools coming online in future waves. The secure data release currently includes participant-level digital trace data from YouTube Activity, YouTube Comments, Google Play Store, Google Play Games, Google Play Books, Google Play Purchases, Google Play App Installs and Google Play Subscriptions. These records are linked through a unique MOSAIC ID to baseline demographics and digital behaviour measures, and to Digital Diets and Digital Wellbeing follow-up survey tiles.
Recruitment was carried out through Prolific among UK adults aged 18 or over who had a relevant Google account. As a cohort, MOSAIC will onboard new members on a quarterly basis, recruiting up to a target cohort size of 10,000. At Wave 1, 6,933 participants were recruited, 5,437 completed the data donation activity, and 4,374 data donors completed a baseline demographic survey. Participants were then offered the opportunity to complete optional follow-up surveys on digital wellbeing and their digital diet. The follow-up surveys were completed by 3,781 participants for Digital Diets and 3,777 for Digital Wellbeing. According to the supplied descriptor, the donated smart data histories span 2005 to 2026, with coverage varying by participant and data type.
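The Wave 1 counts above imply the following retention at each stage. This is a minimal sketch using only the figures stated in this record; the stage labels and the function name `funnel_rates` are illustrative, and every rate is expressed as a share of the 6,933 recruited participants rather than of the preceding stage.

```python
# Wave 1 counts as stated in this catalogue record.
# Stage labels are illustrative, not official variable names.
WAVE_1_FUNNEL = [
    ("recruited", 6933),
    ("completed data donation", 5437),
    ("completed baseline survey", 4374),
    ("completed Digital Diets follow-up", 3781),
    ("completed Digital Wellbeing follow-up", 3777),
]


def funnel_rates(stages):
    """Return (stage, count, percent of recruited) for each stage."""
    recruited = stages[0][1]
    return [
        (name, count, round(100 * count / recruited, 1))
        for name, count in stages
    ]


rates = funnel_rates(WAVE_1_FUNNEL)
for name, count, pct in rates:
    print(f"{name}: {count} ({pct}% of recruited)")
```

Expressing each stage as a share of recruits avoids assuming that follow-up respondents are a strict subset of baseline completers, which the record does not state explicitly.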
The dataset is classified as secure. Participant-level records are not publicly released. Access is available only to approved researchers in an SDDS-approved Trusted Research Environment (TRE) with additional physical access controls, following eligibility, ethics, accreditation, disclosure and Data Use and Access Committee (DUAC) review. Researchers should use the catalogue request link to request an application pack. The pack is intended to include a schema-calibrated synthetic dataset, a full data descriptor, and instructions for the SDDS approval process.
Synthetic Data:
Available as part of the application pack. The synthetic dataset associated with MOSAIC is currently low-fidelity, and is calibrated only to the secure dataset schema. Individual cell values are entirely privacy-preserving and are created using an LLM. This dataset is intended for developing and testing code, workflows and analysis plans before approval. It must not be treated as statistically representative or analytically valid.
Collection:
MOSAIC Digital Life Cohort
Coverage
Spatial
Spatial Coverage:
United Kingdom
Temporal
Start Date:
14 February 2005
End Date:
06 April 2026
Frequency:
QUARTERLY
Date of Latest Release:
18 May 2026
Date of First Release:
18 May 2026
Temporal Aggregation:
- Event-level interaction records
- Quarterly self-report snapshots
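Because event-level records and quarterly survey snapshots sit at different temporal resolutions, analyses that combine them will usually need to bucket events into quarters first. The sketch below shows one way this might be done. It assumes calendar quarters and a hypothetical record layout of (MOSAIC ID, event date) pairs, neither of which is confirmed by this record, and the example records are invented.

```python
from collections import Counter
from datetime import date


def quarter_label(day: date) -> str:
    """Map a date to a calendar-quarter label such as '2026-Q2'."""
    return f"{day.year}-Q{(day.month - 1) // 3 + 1}"


def events_per_quarter(events):
    """Count events per (mosaic_id, quarter).

    `events` is an iterable of (mosaic_id, date) pairs; this layout is an
    assumption for illustration, not the secure dataset schema.
    """
    counts = Counter()
    for mosaic_id, day in events:
        counts[(mosaic_id, quarter_label(day))] += 1
    return counts


# Invented example records; the IDs are placeholders.
example_events = [
    ("M0001", date(2005, 2, 14)),
    ("M0001", date(2005, 3, 1)),
    ("M0001", date(2026, 4, 6)),
    ("M0002", date(2026, 4, 6)),
]
counts = events_per_quarter(example_events)
```

If survey waves do not align with calendar quarters, `quarter_label` would need to be replaced with a mapping based on the actual wave dates in the data descriptor.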
Provenance
Origin
Purpose:
OTHER (Data collected for other purpose)
Source:
- Digital Trace Data: Google Data Portability API
- Participants: Prolific
- Survey responses: Qualtrics
Collection Situation:
Collected online via Prolific.com
Collection Status:
Ongoing
Method of Collection:
Participants were recruited through Prolific, completed Qualtrics survey tiles,
and donated digital trace data through the SDDS secure donation platform using
the Google Data Portability API.
Weighting:
No weights are supplied.
Access and Governance
Usage
Data Use Requirements:
- Research specific restrictions
- Not-for-profit, non-commercial use only
- Ethics approval required
- User specific restriction
- Project specific restriction
Access
Access Service:
Researchers should use the SDR UK catalogue request link to request an application pack from the SDDS. The pack will contain the MOSAIC data descriptor, a schema-calibrated synthetic dataset and instructions for the SDDS approval process.
The synthetic dataset is calibrated only to the secure dataset schema: it is for developing code, workflows and analysis plans only, not for inference about the MOSAIC population.
Secure MOSAIC data are not transferred to researchers. After approval, access is provided only through an SDDS-approved Trusted Research Environment with additional physical controls.
Jurisdiction:
Great Britain
Data Controller:
University of York
Licence:
Other (Not Open)
Delivery Lead Time:
Not fixed. Initial applications are reviewed by the SDDS Data Use and Access
Committee (DUAC). Timing depends on eligibility, ethics approval, accreditation,
disclosure review, DUAC decision, TRE onboarding and safe setting arrangements.
Format and Standards
Language:
English
Format:
.rds
Estimated Dataset Size:
More than 173 million smart data interaction records, plus linked survey records.
Conforms To:
Local SDDS/MOSAIC schema
Enrichment and Linkage
Linkage Opportunity:
Internal linkage is available across surveys and smart data components using each participant's unique MOSAIC ID.
External linkage is not available.
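Internal linkage by MOSAIC ID amounts to a keyed join between survey rows and smart data rows. The following is a minimal sketch of that idea in plain Python. The key name `mosaic_id`, the field names `age_band` and `source`, and all example values are hypothetical and are not taken from the secure dataset schema.

```python
def link_by_mosaic_id(survey_rows, trace_rows, key="mosaic_id"):
    """Inner-join trace rows to survey rows on a shared participant ID.

    Rows are dicts. Trace rows with no matching survey row are dropped.
    On a field-name clash, the trace row's value overrides the survey value.
    """
    survey_by_id = {row[key]: row for row in survey_rows}
    linked = []
    for event in trace_rows:
        survey = survey_by_id.get(event[key])
        if survey is None:
            continue  # no survey record for this participant
        linked.append({**survey, **event})
    return linked


# Invented example rows; IDs and field names are placeholders.
survey = [
    {"mosaic_id": "M0001", "age_band": "25-34"},
    {"mosaic_id": "M0002", "age_band": "45-54"},
]
trace = [
    {"mosaic_id": "M0001", "source": "YouTube Activity"},
    {"mosaic_id": "M0001", "source": "Google Play Store"},
    {"mosaic_id": "M0003", "source": "YouTube Comments"},
]
linked = link_by_mosaic_id(survey, trace)
```

An inner join is used here only for brevity. Whether to keep unmatched participants (for example, donors without a baseline survey) is an analysis decision that depends on the research question.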
Data Distribution
Data Status:
Active
Observations
Name:
Activity Store Interactions
Population Type:
Value:
39713886
Description:
Variable Measured:
Unit Code:
Observation Date:
15 May 2026