Journal articles
* Student or intern collaborator; + corresponding author
- J. Chu*, W. Lu, and S. Yang+ (2023). Targeted optimal treatment regime learning using summary statistics. Biometrika, doi:10.1093/biomet/asad020. [arxiv]
- C. Gao*, S. Yang+, and J. K. Kim (2023). Soft calibration for correcting selection bias under mixed-effects models. Biometrika, doi:10.1093/biomet/asad016. [arxiv]
- E. Cho* and S. Yang (2023). Variable selection for doubly robust causal inference. Statistics and Its Interface, accepted. [arxiv]
- S. Liu*, S. Yang+, Y. Zhang, and G. Liu (2023). Sensitivity analysis in longitudinal clinical trials via distributional imputation. Statistical Methods in Medical Research, 32, 181–194. [link] [arxiv]
- S. Yang and Y. Zhang (2023). Multiply robust matching estimators of average and quantile treatment effects. Scandinavian Journal of Statistics, 50, 235–265. [arxiv]
- S. Yang, C. Gao*, X. Wang, and D. Zeng (2022). Elastic integrative analysis of randomized trial and real-world data for treatment heterogeneity estimation. Journal of the Royal Statistical Society: Series B, accepted. [arxiv]
- Y. Guan, G. L. Page, B. J. Reich, M. Ventrucci, and S. Yang (2022). A spectral adjustment for spatial confounding. Biometrika, accepted. [arxiv]
- Q. Guan* and S. Yang+ (2022). A unified framework for causal inference with multiple imputation using martingales. Statistica Sinica, doi:10.5705/ss.202021.0404. [arxiv]
- D. Lee*, S. Yang+, and X. Wang (2022). Doubly robust estimators for generalizing treatment effects on survival outcomes from randomized controlled trials to a target population. Journal of Causal Inference, 10, 415–440. [arxiv]
- S. Yang and X. Wang (2022). RWD-integrated randomized clinical trial analysis. 2022 ASA Biopharmaceutical Report Real World Evidence (Editors: Herbert Pang, Ling Wang, Kristi L. Griffiths), 29, 15–21.
- X. Mao, Z. Wang, and S. Yang (2022). Matrix completion under complex survey sampling. Annals of the Institute of Statistical Mathematics, doi:10.1007/s10463-022-00851-5. [arxiv]
- M. Yu*, W. Lu, S. Yang, and P. Ghosh (2022). Multiplicative structural nested mean model for zero-inflated outcomes. Biometrika, doi:10.1093/biomet/asac050.
- D. Kong, S. Yang, and L. Wang (2022). Identifiability of causal effects with multiple causes and a binary outcome. Biometrika, 109, 265–272. doi:10.1093/biomet/asab016. [arxiv]
- Z. Jiang, S. Yang, and P. Ding (2022). Multiply robust estimation of causal effects under principal ignorability. Journal of the Royal Statistical Society: Series B, doi:10.1111/rssb.12538. [arxiv]
- C. Gao*, K. J. Thompson, S. Yang, and J. K. Kim (2022). Nearest neighbor ratio imputation with incomplete multinomial outcome in survey sampling. Journal of the Royal Statistical Society: Series A, 185, 1903–1930.
- J. Y. Wang, R. Wong, S. Yang, and G. Chan (2022). Estimation of Partially Conditional Average Treatment Effect by Hybrid Kernel-covariate Balancing. Electronic Journal of Statistics, doi:10.1214/22-EJS2000. [arxiv]
- A. B. Giffin*, B. J. Reich, S. Yang, and A. Rappold (2022). Generalized propensity score approach to causal inference with spatial interference. Biometrics, doi:10.1111/biom.13745. [arxiv]
** Winner of the 2021 ENAR Distinguished Student Paper Competition Award
- A. B. Giffin*, W. Gong, S. Majumder, A. Rappold, B. J. Reich, and S. Yang (2022). Estimating intervention effects on infectious disease control: the effect of community mobility reduction on Coronavirus spread. Spatial Statistics, 52, 100711. [arxiv]
- H. Zhao*, X. Zhang, and S. Yang (2022). Double score matching in observational studies with multi-level treatments. Communications in Statistics – Simulation and Computation, doi:10.1080/03610918.2022.2118778.
- H. Zhao* and S. Yang (2022). Outcome-adjusted balance measure for generalized propensity score model selection. Journal of Statistical Planning and Inference, 221, 188–200. [arxiv]
** Winner of the 2021 DISS Best Poster Award
- D. Johnson*, K. Pieper, and S. Yang+ (2022). Treatment-specific Marginal Structural Cox Model for the Effect of Treatment Discontinuation. Pharmaceutical Statistics, 21, 988–1004.
- J. W. Yu, D. Bandyopadhyay, S. Yang, L. Kang, and G. Gupta (2022). Propensity score modeling in electronic health records with time-to-event endpoints: application to kidney transplantation. Journal of Data Science, 20, 188–208.
- M. Y. Huang and S. Yang+ (2022). Robust inference of conditional average treatment effects using dimension reduction. Statistica Sinica, 32, 547–567. [arxiv]
- A. Larsen*, S. Yang, A. Rappold, and B. Reich (2022). A spatial causal analysis of wildland fire-contributed PM2.5 using numerical model output. Annals of Applied Statistics, 16, 2714–2731. [arxiv]
- L. Wu* and S. Yang (2022). Transfer learning of individualized treatment rules from experimental to real-world data. Journal of Computational and Graphical Statistics, doi:10.1080/10618600.2022.2141752. [arxiv]
- L. Wu* and S. Yang+ (2022). Integrative R-learner of heterogeneous treatment effects combining experimental and observational studies. Proceedings of the 1st Conference on Causal Learning and Reasoning, PMLR, 140, 1–S5.
- B. J. Reich, S. Yang, and Y. Guan (2022). Discussion on “Spatial+: a novel approach to spatial confounding” by Dupont, Wood and Augustin, Biometrics, doi:10.1111/biom.13651.
- N. Corder* and S. Yang+ (2022). Utilizing stratified generalized propensity score matching to approximate blocked trial designs with multiple treatment levels. Journal of Biopharmaceutical Statistics, doi:10.1080/10543406.2022.2065507.
- Y. Zhang*, S. Yang, W. Ye, D. E. Faries, I. Lipkovich, Z. Kadziola (2021). Best practices of double score matching for estimating causal effects, Statistics in Medicine, 42, 1421–1445.
- D. Lee*, S. Yang+, L. Dong, X. Wang, D. Zeng, J.W. Cai (2021). Improving trial generalizability using observational studies, Biometrics, doi:10.1111/biom.13609. [arxiv]
** Winner of the 2020 ENAR Distinguished Student Paper Competition Award
- S. Yang (2021). Semiparametric efficient estimation of structural nested mean models with irregularly spaced observations. Biometrics, doi:10.1111/biom.13471. [arxiv]
- B. J. Reich, S. Yang, Y. Guan, A. B. Giffin, M. J. Miller, and A. G. Rappold (2021). A review of spatial causal inference methods for environmental and epidemiological applications. International Statistical Review, 89, 605–634. [arxiv]
- S. Yang, J. K. Kim, and Y. Hwang (2021). Integration of data from probability surveys and big found data for finite population inference using mass imputation. Survey Methodology, 47, 29–58.
- F. Cools, D. Johnson, A. J. Camm, J. P. Bassand, F. Verheugt, S. Yang, A. Tsiatis, D. A. Fitzmaurice, S. Z. Goldhaber, G. Kayani, S. Goto, S. Haas, F. Misselwitz, A. Turpie, K. Fox, K. Pieper, A. K. Kakkar (2021). Risks associated with discontinuation of oral anticoagulation in newly diagnosed patients with atrial fibrillation: results from the GARFIELD-AF Registry. Journal of Thrombosis and Haemostasis, doi:10.1111/jth.15415. (Collaboration work)
- S. Yang, Y. Zhang, G. Liu, and Q. Guan (2021). SMIM: a unified framework of Survival sensitivity analysis using Multiple Imputation and Martingale. Biometrics, doi:10.1111/biom.13555. [arxiv]
- S. Yang, J. K. Kim, and R. Song (2020). Doubly robust inference when combining probability and non-probability samples with high-dimensional data, Journal of the Royal Statistical Society: Series B, 82, 445–465.
- S. Yang, K. Pieper, and F. Cools (2020). Semiparametric estimation of structural failure time model in continuous-time processes, Biometrika, 107, 123–136.
- N. Corder* and S. Yang (2020). Estimating average treatment effects utilizing fractional imputation when confounders are subject to missingness, Journal of Causal Inference, 8, 249–271.
- L. Dong*, E. Laber, Y. Goldberg, R. Song, S. Yang (2020). Ascertaining properties of weighting in the estimation of optimal treatment regimes under monotone missingness, Statistics in Medicine, doi:10.1002/sim.8678.
- S. Yang and P. Ding (2020). Combining multiple observational data sources to estimate causal effects, Journal of the American Statistical Association, 115, 1540–1554.
- S. Yang and J. K. Kim (2020). Statistical data integration in survey sampling: a review, Japanese Journal of Statistics and Data Science, 3, 625–650.
- W. Li*, S. Yang+, and P. Han (2020). Robust estimation for moment condition models with data missing not at random, Journal of Statistical Planning and Inference, 207, 246–254.
- S. Yang and J. K. Kim (2020). Asymptotic theory and inference of predictive mean matching imputation using a superpopulation model framework. Scandinavian Journal of Statistics, 47, 839–861.
- S. Chen, S. Yang, and J. K. Kim (2020). Nonparametric mass imputation for data integration. Journal of Survey Statistics and Methodology, doi:10.1093/jssam/smaa036.
- S. Yang (2019). Book reviews: Flexible imputation of missing data, 2nd ed. Journal of the American Statistical Association, 114, 1421–1421.
- S. Yang, L. Wang, and P. Ding (2019). Causal inference with confounders missing not at random, Biometrika, 106, 875–888.
- S. Yang and D. Zeng (2018). Discussion on “penalized spline of propensity methods for treatment comparison” by Zhou, Elliott and Little, Journal of the American Statistical Association, 114, 30–32.
- S. Yang and J. J. Lok (2018). Sensitivity analysis for unmeasured confounding in coarse structural nested mean models, Statistica Sinica, 28, 1703–1723.
- S. Yang (2018). Propensity score weighting for causal inference with clustered data, Journal of Causal Inference, doi:10.1515/jci-2017-0027.
- S. Yang and J. K. Kim (2018). Nearest neighbor imputation for general parameter estimation in survey sampling, Advances in Econometrics, 39, 211–236.
- S. Yang and P. Ding (2018). Asymptotic inference of causal effects with observational studies trimmed by the estimated propensity scores, Biometrika, 105, 487–493. [arxiv]
- Z. Wang, J. K. Kim, and S. Yang (2018). An approximate Bayesian inference under informative sampling, Biometrika, 105, 91–102.
- J. Lok, S. Yang, B. Sharkey, and M. Hughes (2018). Estimation of the cumulative incidence function under multiple dependent and independent censoring mechanisms, Lifetime Data Analysis, 24, 201–223.
- S. Yang, A. A. Tsiatis, and M. Blazing (2018). Modeling survival distribution as a function of time to treatment discontinuation: a dynamic treatment regime approach, Biometrics, 74, 900–909.
- S. Yang and J. K. Kim (2017). A semiparametric inference to regression analysis with missing covariates in survey data, Statistica Sinica, 27, 261–285.
- J. K. Kim and S. Yang (2017). A note on multiple imputation under complex sampling, Biometrika, 104, 221–228.
- S. Yang and J. K. Kim (2017). Discussion: dissecting multiple imputation from a multi-phase inference perspective: what happens when God’s, imputer’s and analyst’s models are uncongenial? by X. Xie and X. L. Meng, Statistica Sinica, 27, 1568–1573.
- S. Yang and J. J. Lok (2016). A goodness-of-fit test for structural nested mean models, Biometrika, 103, 734–741.
- S. Yang and J. K. Kim (2016). Fractional imputation in survey sampling: a comparative review, Statistical Science, 31, 415–432.
- S. Yang, G. Imbens, Z. Cui, D. Faries, and Z. Kadziola (2016). Propensity score matching and stratification in observational studies with multi-level treatments, Biometrics, 72, 1055–1065. R package “multilevelMatching” available.
- S. Yang and J. K. Kim (2016). A note on multiple imputation for method of moments estimation, Biometrika, 103, 244–251.
- S. Yang and J. K. Kim (2015). Likelihood-based inference with missing data under missing-at-random, Scandinavian Journal of Statistics, 43, 436–454.
** Winner of the 2014 JSM Student Paper Competition Award
- L. Peyer, G. Welk, L. B. Davis, S. Yang, and J. K. Kim (2015). Factors associated with parent concern for child weight and parenting behaviors, Childhood Obesity, 11, 269–274. (Collaboration work)
- S. Yang and Z. Zhu (2015). Variance estimation and kriging prediction for a class of non-stationary spatial models, Statistica Sinica, 25, 135–149.
- J. K. Kim and S. Yang (2014). Fractional hot deck imputation for robust estimation under item nonresponse in survey sampling, Survey Methodology, 40, 211–230.
- J. K. Kim, Z. Zhu, and S. Yang (2013). Improved estimation for June Area Survey incorporating several information, Proceedings 59th ISI World Statistics Congress, Hong Kong, China, 199–204.
- S. Yang, J. K. Kim and D. W. Shin (2013). Imputation methods for quantile estimation under missing at random, Statistics and Its Interface, 6, 369–377.
- S. Yang, J. K. Kim and Z. Zhu (2013). Parametric fractional imputation for mixed models with nonignorable missing data, Statistics and Its Interface, 6, 339–347.
Technical Reports
- S. Yang, D. Zeng, X. Wang. Improved Inference for Heterogeneous Treatment Effects Using Real-World Data Subject to Hidden Confounding. [arxiv]
- B. Colnet, I. Mayer, G. Chen, A. Dieng, R. Li, G. Varoquaux, J.P. Vert, J. Josse+, S. Yang+. Causal inference methods for combining randomized trials and observational studies: a review. [arxiv]
- P. Sang, D. Kong, and S. Yang+. Functional principal component analysis for longitudinal observations with sampling at random. [arxiv]
- A. B. Giffin*, B. J. Reich, S. Yang, and A. Rappold. Instrumental variables, spatial confounding and interference. [arxiv]
- L. Wu*, S. Yang, B. J. Reich, and A. Rappold. Estimating spatially varying health effects in app-based citizen science research. [arxiv]
** Winner of the 2021 ASA Section on Statistics in Epidemiology Young Investigator Award
** Winner of the IBM Student Research Award from the 34th New England Statistics Symposium
- S. Tang*, S. Yang+, T. Wang, Z. Cui, L. Li, D. Faries. Causal inference of hazard ratio based on propensity score matching. [arxiv]
** Winner of the 2021 ENAR Distinguished Student Paper Competition Award
- S. Yang and Z. Zhu. Semiparametric estimation of spectral density and variogram with irregular observations. [arxiv]
- C. Gao* and S. Yang. Pretest estimation in combining probability and non-probability samples.
- C. Gao*, S. Yang, and A. Zhang. Self-supervised Single Image Denoising via Low-rank Tensor Approximated Convolutional Neural Network. [arxiv]
- S. Liu*, S. Yang+, Y. Zhang, and G. Liu. Multiply robust estimators in longitudinal studies with missing data under control-based imputation. [arxiv]
- S. Liu*, Y. Zhang, G. T. Golm, G. Liu, and S. Yang+. Robust analyses for longitudinal clinical trials with dropouts and non-normal continuous outcomes. [arxiv]
- S. Xu*, S. Yang, B. J. Reich. A Bayesian non-parametric method for estimating causal quantile effects.
- D. Lee*, S. Ghosh, and S. Yang+. Transporting survival of an HIV clinical trial to the external target populations. [arxiv]
- X. Tan*, S. Yang, W. Ye, D. E. Faries, I. Lipkovich, Z. Kadziola. When doubly robust methods meet machine learning for estimating treatment effects from real-world data: a comparative study. [arxiv]
- B. Smith, S. Yang, A. Apter, and D. Scharfstein. Trials with irregular and informative assessment times: a sensitivity analysis approach. [arxiv]
- Y. Zhang, D. Kong, and S. Yang+. Towards R-learner of conditional average treatment effects with a continuous treatment: T-identification, estimation, and inference. [arxiv]
- C. Cui, S. Yang, B. J. Reich, and D. Gill. Matching estimators of causal effects in clustered observational studies with application to quantifying the impact of marine protected areas on biodiversity. [arxiv]
- P. Zhao, J. Josse, and S. Yang. Efficient and robust transfer learning of optimal individualized treatment regimes with right-censored survival data. [arxiv]
** Winner of the 2023 ASA Section on Nonparametric Statistics Student Paper Award
Thesis
- S. Yang (2014). Fractional imputation method of handling missing data and spatial statistics. Iowa State University. [Link]