Which Small-Sample Correction Should Be Used When Analyzing Stepped-Wedge Designs with Time-Varying Treatment Effects?

Journal article

Yongdong Ouyang, M. Taljaard, James P Hughes, Fan Li
2026

Cite

APA
Ouyang, Y., Taljaard, M., Hughes, J. P., & Li, F. (2026). Which Small-Sample Correction Should Be Used When Analyzing Stepped-Wedge Designs with Time-Varying Treatment Effects?

Chicago/Turabian
Ouyang, Yongdong, M. Taljaard, James P Hughes, and Fan Li. “Which Small-Sample Correction Should Be Used When Analyzing Stepped-Wedge Designs with Time-Varying Treatment Effects?” (2026).

MLA
Ouyang, Yongdong, et al. Which Small-Sample Correction Should Be Used When Analyzing Stepped-Wedge Designs with Time-Varying Treatment Effects? 2026.

BibTeX

@article{yongdong2026a,
  title = {Which Small-Sample Correction Should Be Used When Analyzing Stepped-Wedge Designs with Time-Varying Treatment Effects?},
  year = {2026},
  author = {Ouyang, Yongdong and Taljaard, M. and Hughes, James P and Li, Fan}
}

Abstract

Stepped-wedge cluster randomized trials (SW-CRTs) evaluate interventions rolled out across clusters over time. Standard analyses typically use immediate-treatment (IT) models, which assume effects begin at crossover and remain constant thereafter. When effects vary with exposure duration, IT models may misrepresent target effects. Exposure-time indicator (ETI) models address this by allowing treatment effects to differ by time since exposure and by targeting the time-averaged treatment effect (TATE) and long-term effect (LTE). Like IT models, ETI models require specification of a random-effects structure, which is often misspecified, and the performance of robust variance estimators (RVEs) in this setting is not well understood. We review RVEs for ETI models and evaluate them in simulation studies with continuous and binary outcomes under correctly specified (binary only) and misspecified random-effects structures. We compare the classic sandwich, Kauermann-Carroll (KC), Mancl-DeRouen (MD), and Morel-Bokossa-Neerchal (MBN) estimators for inference on the TATE and LTE. Our simulations show that under misspecified random-effects structures, model-based standard errors (SE) produced undercoverage, whereas RVEs improved performance. For continuous outcomes, MD with a t-distribution and degrees of freedom equal to the number of clusters minus two gave the most consistent coverage probabilities. For binary outcomes, MBN was the only consistently reliable option. MD, however, could be unstable in one-cluster-per-sequence designs because of data sparsity. Across scenarios, both model-based SE and RVE for LTE were unstable, indicating that greater caution is needed when targeting LTE under ETI models.
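
To make the estimators named in the abstract concrete, the following is a minimal Python sketch, not the authors' implementation. It fits an ETI-style linear model by ordinary least squares (a working-independence simplification, where the paper uses models with random effects), then computes the classic, KC, and MD cluster-robust variances for the TATE contrast. MBN is omitted. The toy design, the effect sizes, the function name `sandwich_variances`, and the definition of the LTE as the effect at the longest observed exposure time are all assumptions made for illustration.

```python
import numpy as np

def sandwich_variances(X, y, cluster):
    """OLS fit plus classic, KC, and MD cluster-robust variance estimates.

    Assumes a working-independence linear model, so each cluster leverage
    matrix H_i is symmetric and (I - H_i) can be inverted via eigh.
    """
    X, y, cluster = np.asarray(X, float), np.asarray(y, float), np.asarray(cluster)
    bread = np.linalg.inv(X.T @ X)
    beta = bread @ X.T @ y
    resid = y - X @ beta
    p = X.shape[1]
    meat = {name: np.zeros((p, p)) for name in ("classic", "KC", "MD")}
    for g in np.unique(cluster):
        idx = cluster == g
        Xi, ri = X[idx], resid[idx]
        Hi = Xi @ bread @ Xi.T                      # cluster leverage matrix
        w, Q = np.linalg.eigh(np.eye(len(ri)) - Hi)
        inv_sqrt = Q @ np.diag(w ** -0.5) @ Q.T     # (I - H_i)^(-1/2), used by KC
        inv = Q @ np.diag(1.0 / w) @ Q.T            # (I - H_i)^(-1), used by MD
        for name, adj in (("classic", ri), ("KC", inv_sqrt @ ri), ("MD", inv @ ri)):
            score = Xi.T @ adj
            meat[name] += np.outer(score, score)
    return beta, {name: bread @ m @ bread for name, m in meat.items()}

# Hypothetical toy design: 4 sequences, 2 clusters each, 5 periods,
# one observation (e.g. a cluster-period mean) per cluster-period.
rng = np.random.default_rng(0)
n_seq, per_seq = 4, 2
J = n_seq + 1
n_clusters = n_seq * per_seq
# Assumed true values: 5 period effects, then 4 exposure-time effects.
truth = np.array([0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 1.0, 1.5, 2.0])

rows, cluster = [], []
for c in range(n_clusters):
    start = c // per_seq + 1            # 0-indexed period of first treatment
    for j in range(J):
        x = np.zeros(J + n_seq)
        x[j] = 1.0                      # period indicator
        s = j - start + 1               # exposure time; values below 1 mean control
        if s >= 1:
            x[J + s - 1] = 1.0          # exposure-time indicator
        rows.append(x)
        cluster.append(c)
X, cluster = np.array(rows), np.array(cluster)
y = (X @ truth
     + rng.normal(0.0, 0.5, n_clusters)[cluster]   # cluster random intercept
     + rng.normal(0.0, 1.0, len(cluster)))         # residual noise

beta, variances = sandwich_variances(X, y, cluster)

c_tate = np.zeros(J + n_seq)
c_tate[J:] = 1.0 / n_seq                # TATE: average of exposure-time effects
c_lte = np.zeros(J + n_seq)
c_lte[-1] = 1.0                         # LTE (assumed): effect at longest exposure

# 0.975 quantile of t with df = clusters - 2 = 6, taken from standard tables;
# scipy.stats.t.ppf(0.975, 6) returns the same value.
t_crit = 2.447
for name, V in variances.items():
    est, se = c_tate @ beta, np.sqrt(c_tate @ V @ c_tate)
    print(f"{name:8s} TATE = {est:.3f}, SE = {se:.3f}, "
          f"95% CI = ({est - t_crit * se:.3f}, {est + t_crit * se:.3f})")
```

Replacing `c_tate` with `c_lte` in the final loop gives the corresponding intervals for the LTE. In this sketch, setting `per_seq = 1` leaves a single observation at the longest exposure time, which is then fitted exactly; its leverage equals one, so the matrix inverted for KC and MD becomes singular.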