Yongdong Ouyang, PhD

Assistant Professor, Roswell Park Comprehensive Cancer Center

Data sharing in stepped-wedge cluster randomized trials: suboptimal data availability despite "data available upon request".


Journal article


Cory E. Goldstein, A. Armond, K. Cobey, E. Voldal, Yutong Chen, Kylie Tingley, Julia F. Shaw, P. Heagerty, K. Hemming, Avi Kenny, Fan Li, Yongdong Ouyang, Fan Xia, David Moher, James P. Hughes, M. Taljaard
Journal of Clinical Epidemiology, 2026

Cite

APA
Goldstein, C. E., Armond, A., Cobey, K., Voldal, E., Chen, Y., Tingley, K., … Taljaard, M. (2026). Data sharing in stepped-wedge cluster randomized trials: suboptimal data availability despite "data available upon request". Journal of Clinical Epidemiology.


Chicago/Turabian
Goldstein, Cory E., A. Armond, K. Cobey, E. Voldal, Yutong Chen, Kylie Tingley, Julia F. Shaw, et al. “Data Sharing in Stepped-Wedge Cluster Randomized Trials: Suboptimal Data Availability despite ‘Data Available upon Request’.” Journal of Clinical Epidemiology (2026).


MLA
Goldstein, Cory E., et al. “Data Sharing in Stepped-Wedge Cluster Randomized Trials: Suboptimal Data Availability despite ‘Data Available upon Request’.” Journal of Clinical Epidemiology, 2026.


BibTeX

@article{cory2026a,
  title = {Data sharing in stepped-wedge cluster randomized trials: suboptimal data availability despite "data available upon request".},
  year = {2026},
  journal = {Journal of Clinical Epidemiology},
  author = {Goldstein, Cory E. and Armond, A. and Cobey, K. and Voldal, E. and Chen, Yutong and Tingley, Kylie and Shaw, Julia F. and Heagerty, P. and Hemming, K. and Kenny, Avi and Li, Fan and Ouyang, Yongdong and Xia, Fan and Moher, David and Hughes, James P. and Taljaard, M.}
}

Abstract

BACKGROUND Data sharing enhances transparency, facilitates reproducibility, and promotes innovation in health research. For statisticians, access to data from real trials is essential to develop, validate, and refine statistical methods.

OBJECTIVES Within a collection of published stepped-wedge cluster randomized trials (SW-CRTs), we aimed to describe the prevalence and types of data sharing statements; the actual availability of data after emailing authors; and factors associated with data obtainment.

METHODS We identified SW-CRTs published between 2016 and 2022 from a previous systematic review and updated that search to include studies published through 31 December 2023. Data sharing statements, when provided, were classified as indicating that data were publicly available, available upon request, or not available. Authors were emailed to request datasets. Associations between trial characteristics and data obtainment were explored using bivariable logistic regression, with results reported as odds ratios (ORs) with 95% confidence intervals (CIs).
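As an illustration of the kind of estimate reported here, the following minimal sketch computes an odds ratio with a Wald 95% confidence interval from a 2x2 table. For a single binary characteristic, this matches what a bivariable logistic regression would give. The function name `odds_ratio_wald_ci` and the cell counts are hypothetical and are not taken from the study; the authors' actual software and model specification are not stated in the abstract.

```python
import math


def odds_ratio_wald_ci(a, b, c, d, z=1.959964):
    """Odds ratio and Wald confidence interval from a 2x2 table.

    a: characteristic present, data obtained
    b: characteristic present, data not obtained
    c: characteristic absent,  data obtained
    d: characteristic absent,  data not obtained
    z: standard normal quantile (default gives a 95% interval)
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cell counts must be positive")
    # Log odds ratio and its large-sample standard error
    log_or = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    # Build the interval on the log scale, then exponentiate
    lower = math.exp(log_or - z * se)
    upper = math.exp(log_or + z * se)
    return math.exp(log_or), lower, upper


# Hypothetical counts for illustration only (NOT data from the study)
or_est, ci_lower, ci_upper = odds_ratio_wald_ci(20, 10, 15, 30)
print(f"OR={or_est:.2f}; 95% CI {ci_lower:.2f}-{ci_upper:.2f}")
```

An interval that contains 1 indicates no clear evidence of an association, which is how the intervals in the results below can be read.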

RESULTS Of 217 SW-CRTs identified, 98 (45%) had no clear data sharing statements, 89 (41%) indicated data were available upon request, 16 (7%) indicated data were not available, and 14 (7%) indicated data were publicly available. Datasets were ultimately obtained for 76 (35%) SW-CRTs. Data obtainment did not differ between studies with no data sharing statement and those indicating data were available upon request (both 34%). The odds of data obtainment were significantly higher among trials conducted in low- and middle-income countries (OR=2.9; 95% CI 1.5-5.4). The odds of data obtainment increased with years since publication (OR=1.13; 95% CI 0.99-1.29) and years since trial initiation (OR=1.11; 95% CI 1.00-1.23), although the confidence intervals included the null value. There was no clear evidence of an association with having positive primary trial results (OR=0.62; 95% CI 0.35-1.10), nor with journal impact factor, trial size, type of design, region of corresponding author, or funding source.

CONCLUSION Data sharing practices in SW-CRTs are suboptimal. The presence of a data sharing statement is not predictive of actual data availability. There is significant regional variation in whether data were obtained, but few other characteristics explain variation in data obtainment. Clear guidance and dedicated resources to facilitate data sharing in research are required.