Journal articles
* Student or intern collaborator; + corresponding author
- D. Faries, C. Gao, X. Zhang, C. Hazlett, J. Stamey, S. Yang, et al (2024). Real effect or bias? Best practices for evaluating the robustness of real-world evidence through quantitative sensitivity analysis for unmeasured confounding, Pharmaceutical Statistics, doi:10.1002/pst.2457. [Authorea] [arxiv]
- S. Yang, S. Liu, D. Zeng, and X. Wang (2024). Data fusion methods for the heterogeneity of treatment effect and confounding function. Bernoulli [arxiv]
- C. Gao*, S. Yang+, M. Shan, W. Ye, I. Lipkovich, and D. Faries (2024). Improving randomized controlled trial analysis with data-adaptive borrowing. Biometrika [arxiv] [code]
- J. Chu*, S. Yang, W. Lu, and P. Ghosh (2024). A shadow variable approach to policy evaluation and learning with one-sided feedback. NeurIPS.
** Selected as an oral presentation at the NeurIPS 2024 CRL workshop
- P. Sang, D. Kong, and S. Yang+ (2024). Functional principal component analysis for longitudinal observations with sampling at random. Biometrika, doi:10.1093/biomet/asae055. [arxiv]
- B. Smith, Y. Gao, S. Yang, A. Apter, and D. Scharfstein (2024). Semiparametric sensitivity analysis for trials with irregular and informative assessment times. Biometrics [arxiv]
- H. Zhao*, T. Wang, S. Yang, Z. Cui, I. Lipkovich, and D. Faries (2024). Propensity score matching for estimation of pairwise marginal hazard ratios. Communications in Statistics – Theory and Methods, doi.org/10.1080/03610926.2024.2419897.
- L. Wu*, C. Gao*, S. Yang+, B. J. Reich, and A. Rappold (2024). Estimating spatially varying health effects in app-based citizen science research. Journal of the Royal Statistical Society: Series C, doi.org/10.1093/jrsssc/qlae034 [arxiv]
** Winner of the 2021 ASA Section on Statistics in Epidemiology Young Investigator Award
** Winner of the IBM Student Research Award from the 34th New England Statistics Symposium
- J. Coulombe and S. Yang (2024). Multiply robust estimation of marginal structural models in observational studies subject to covariate-driven observations. Biometrics, 80, ujae065. [arxiv]
- C. Gao*, Z. Zhang, and S. Yang+ (2024). Causal Customer Churn Analysis with Low-rank Tensor Block Hazard Model. ICML [arxiv] [code]
- C. Gao*, S. Yang, and A. Zhang. Enhancing convolution neural network generalizability via low-rank weight approximation. IET Image Processing, 10.1049/ipr2.13205. [arxiv]
- Y. Cheng* and S. Yang (2024). Inference for Optimal Linear Treatment Regimes in Personalized Decision-making. UAI (40th Conference on Uncertainty in Artificial Intelligence) [arxiv]
** Selected as an oral presentation
- T. Wang*, H. Zhao*, S. Yang+, S. Tang, Z. Cui, L. Li, and D. Faries (2024). Propensity score matching for estimating a marginal hazard ratio. Statistics in Medicine, doi:10.1002/sim.10103. [arxiv] [code]
** Winner of the 2021 ENAR Distinguished Student Paper Competition Award
- S. Fairfax* and S. Yang (2024). Distributional imputation for the analysis of censored recurrent events. Statistics in Medicine, 43, 2622–2640. [code]
** Winner of the 2023 JSM Poster Award Competition Honorable Mention
- B. Colnet, I. Mayer, G. Chen, A. Dieng, R. Li, G. Varoquaux, J.P. Vert, J. Josse+, S. Yang+ (2024). Causal inference methods for combining randomized trials and observational studies: a review. Statistical Science, 1, 165–191. [arxiv]
- D. Gill, S. Lester, C. Free, A. Pfaff, E. Iversen, B. Reich, S. Yang, et al (2024). A diverse portfolio of marine protected areas can better advance global conservation and equity. Proceedings of the National Academy of Sciences, 10.1073/pnas.2313205121.
- S. Yang+ and X. Zhang (2024). [Letter to Editor] Response to comment on “Transporting survival of an HIV clinical trial to the external target populations by Lee et al. (2024)”. Journal of Biopharmaceutical Statistics, https://doi.org/10.1080/10543406.2024.2373449.
- D. Lee*, C. Gao*, S. Ghosh, and S. Yang+ (2024). Transporting survival of an HIV clinical trial to the external target populations. Journal of Biopharmaceutical Statistics, doi.org/10.1080/10543406.2024.2330216. [arxiv] [code]
- D. Lee*, S. Yang, M. Berry, T. Stinchcombe, H. Cohen, and X. Wang (2024). genRCT: A Statistical Analysis Framework for Generalizing RCT Findings to Real-World Population. Journal of Biopharmaceutical Statistics, doi.org/10.1080/10543406.2024.2333136. [code]
- X. Mao, H. Wang, Z. Wang, and S. Yang (2024). Mixed dataframe matrix completion in survey under heterogeneous missingness. Journal of Computational and Graphical Statistics, doi.org/10.1080/10618600.2024.2319154. [arxiv]
- P. Zhao, J. Josse, A. Chambaz, and S. Yang (2024). Positivity-free Policy Learning with Observational Data. Proceedings of The 27th International Conference on Artificial Intelligence and Statistics (AISTATS), PMLR, 238:1918-192. [arxiv] [code]
** Top 1% selected as an oral presentation
- S. Liu*, S. Yang+, Y. Zhang, and G. Liu (2024). Multiply robust estimators in longitudinal studies with missing data under control-based imputation. Biometrics, 80, ujad036. [arxiv]
- Q. Guan* and S. Yang+ (2024). A unified framework for causal inference with multiple imputation using martingales. Statistica Sinica, 34, 1649-1673. [arxiv]
- S. Yang, C. Gao*, X. Wang, and D. Zeng (2023). Elastic integrative analysis of randomized trial and real-world data for treatment heterogeneity estimation. Journal of the Royal Statistical Society: Series B, 85, 575-596. [arxiv]
- C. Gao*, S. Yang+, and J. K. Kim (2023). Soft calibration for correcting selection bias under mixed-effects models. Biometrika, 110, 897–911. [arxiv]
- C. Gao* and S. Yang (2023). Pretest estimation in combining probability and non-probability samples. Electronic Journal of Statistics, 17, 1492–1546. [arxiv]
- J. Chu*, S. Yang, and W. Lu (2023). Multiply robust off-policy evaluation and learning under truncation by death, Proceedings of the 40th International Conference on Machine Learning (ICML), PMLR, 202, 6195–6227.
- J. Chu*, W. Lu, and S. Yang+ (2023). Targeted optimal treatment regime learning using summary statistics. Biometrika, 110, 913–931. [arxiv]
- Y. Guan, G. L. Page, B. J. Reich, M. Ventrucci, and S. Yang (2023). A spectral adjustment for spatial confounding. Biometrika, 110, 699–719. [arxiv]
- M. Yu*, W. Lu, S. Yang, and P. Ghosh (2023). Multiplicative structural nested mean model for zero-inflated outcomes. Biometrika, 110, 519–536.
- Y. Cheng*, L. Wu, and S. Yang (2023). Enhancing treatment effect estimation: a model robust approach integrating randomized experiments and external controls using the double penalty integration estimator. UAI (39th Conference on Uncertainty in Artificial Intelligence) 2023. [arxiv]
- S. Yang, Y. Zhang, G. Liu, and Q. Guan (2023). SMIM: a unified framework of Survival sensitivity analysis using Multiple Imputation and Martingale. Biometrics, 79, 230–240. [arxiv]
- S. Liu*, Y. Zhang, G. T. Golm, G. Liu, and S. Yang+ (2023). Robust analyses for longitudinal clinical trials with dropouts and non-normal continuous outcomes. Statistical Theory and Related Fields, 8, 1-14. [arxiv]
- S. Liu*, S. Yang+, Y. Zhang, and G. Liu (2023). Sensitivity analysis in longitudinal clinical trials via distributional imputation. Statistical Methods in Medical Research, 32, 181–194. [link] [arxiv]
- S. Yang and Y. Zhang (2023). Multiply robust matching estimators of average and quantile treatment effects. Scandinavian Journal of Statistics, 50, 235–265. [arxiv]
- L. Wu* and S. Yang (2023). Transfer learning of individualized treatment rules from experimental to real-world data. Journal of Computational and Graphical Statistics, 32, 1036–1045. [arxiv]
- E. Cho* and S. Yang (2023). Variable selection for doubly robust causal inference. Statistics and Its Interface, accepted. [arxiv]
- D. Kong, S. Yang, and L. Wang (2022). Identifiability of causal effects with multiple causes and a binary outcome. Biometrika, 109, 265–272. doi:10.1093/biomet/asab016. [arxiv]
- Z. Jiang, S. Yang, and P. Ding (2022). Multiply robust estimation of causal effects under principal ignorability. Journal of the Royal Statistical Society: Series B, 84, 1423–1445. [arxiv] [code]
- D. Lee*, S. Yang+, L. Dong, X. Wang, D. Zeng, J.W. Cai (2023). Improving trial generalizability using observational studies, Biometrics, 79, 1213–1225. [arxiv] [code]
** Winner of the 2020 ENAR Distinguished Student Paper Competition Award
- D. Lee*, S. Yang+, and X. Wang (2022). Doubly robust estimators for generalizing treatment effects on survival outcomes from randomized controlled trials to a target population. Journal of Causal Inference, 10, 415-440. [arxiv] [code]
- S. Yang and X. Wang (2022). RWD-integrated randomized clinical trial analysis. 2022 ASA Biopharmaceutical Report Real World Evidence (Editors: Herbert Pang, Ling Wang, Kristi L. Griffiths), 29, 15–21.
- X. Mao, Z. Wang, and S. Yang (2022). Matrix completion under complex survey sampling. Annals of the Institute of Statistical Mathematics, doi:10.1007/s10463-022-00851-5. [arxiv]
- C. Gao*, K. J. Thompson, S. Yang and J. K. Kim (2022). Nearest neighbor ratio imputation with incomplete multinomial outcome in survey sampling. Journal of the Royal Statistical Society: Series A, 185, 1903-1930.
- J.Y. Wang, R. Wong, S. Yang, and G. Chan (2022). Estimation of Partially Conditional Average Treatment Effect by Hybrid Kernel-covariate Balancing. Electronic Journal of Statistics, doi.org/10.1214/22-EJS2000. [arxiv]
- A. B. Giffin*, B. J. Reich, S. Yang, and A. Rappold (2022). Generalized propensity score approach to causal inference with spatial interference. Biometrics, 79, 2220–2231. [arxiv]
** Winner of the 2021 ENAR Distinguished Student Paper Competition Award
- A. B. Giffin*, W. Gong, S. Majumder, A. Rappold, B. J. Reich, and S. Yang (2022). Estimating intervention effects on infectious disease control: the effect of community mobility reduction on Coronavirus spread. Spatial Statistics, 52, 100711. [arxiv]
- H. Zhao*, X. Zhang and S. Yang (2022). Double score matching in observational studies with multi-level treatments. Communications in Statistics – Simulation and Computation, doi.org/10.1080/03610918.2022.2118778.
- H. Zhao* and S. Yang (2022). Outcome-adjusted balance measure for generalized propensity score model selection. Journal of Statistical Planning and Inference, 221, 188–200. [arxiv]
** Winner of the 2021 DISS Best Poster Award
- D. Johnson*, K. Pieper, and S. Yang+ (2022). Treatment-specific Marginal Structural Cox Model for the Effect of Treatment Discontinuation. Pharmaceutical Statistics, 21, 988-1004.
- J. W. Yu, D. Bandyopadhyay, S. Yang, L. Kang, and G. Gupta (2022). Propensity score modeling in electronic health records with time-to-event endpoints: application to kidney transplantation. Journal of Data Science, 20, 188–208.
- M.Y. Huang and S. Yang+ (2022). Robust inference of conditional average treatment effects using dimension reduction. Statistica Sinica, 32, 547-567. [arxiv]
- A. Larsen*, S. Yang, A. Rappold, and B. Reich (2022). A spatial causal analysis of wildland fire-contributed PM2.5 using numerical model output. Annals of Applied Statistics, 16, 2714-2731. [arxiv]
- L. Wu* and S. Yang+ (2022). Integrative R-learner of heterogeneous treatment effects combining experimental and observational studies. Proceedings of the 1st Conference on Causal Learning and Reasoning, PMLR, 140, 1–S5.
- B. J. Reich, S. Yang, and Y. Guan (2022). Discussion on “Spatial+: a novel approach to spatial confounding” by Dupont, Wood and Augustin, Biometrics, doi.org/10.1111/biom.13651.
- N. Corder* and S. Yang+ (2022). Utilizing stratified generalized propensity score matching to approximate blocked trial designs with multiple treatment levels. Journal of Biopharmaceutical Statistics, doi:10.1080/10543406.2022.2065507.
- Y. Zhang*, S. Yang, W. Ye, Douglas E. Faries, I. Lipkovich, Z. Kadziola (2021). Best practices of double score matching for estimating causal effects, Statistics in Medicine, 42, 1421–1445.
- S. Yang (2022). Semiparametric efficient estimation of structural nested mean models with irregularly spaced observations. Biometrics, 78, 937–949. [arxiv]
- B. J. Reich, S. Yang, Y. Guan, A. B. Giffin, M. J. Miller and A. G. Rappold (2021). A review of spatial causal inference methods for environmental and epidemiological applications. International Statistical Review, 89, 605-634. [arxiv]
- S. Yang, J. K. Kim, and Youngdeok Hwang (2021). Integration of data from probability surveys and big found data for finite population inference using mass imputation. Survey Methodology, 47, 29–58.
- F. Cools, D. Johnson, A. J. Camm, J. P. Bassand, F. Verheugt, S. Yang, A. Tsiatis, D. A. Fitzmaurice, S. Z. Goldhaber, G. Kayani, S. Goto, S. Haas, F. Misselwitz, A. Turpie, K. Fox, K. Pieper, A. K. Kakkar (2021). Risks associated with discontinuation of oral anticoagulation in newly diagnosed patients with atrial fibrillation: results from the GARFIELD-AF Registry. Journal of Thrombosis and Haemostasis, doi:10.1111/jth.15415. (Collaboration work)
- S. Yang, J. K. Kim, and R. Song (2020). Doubly robust inference when combining probability and non-probability samples with high-dimensional data, Journal of the Royal Statistical Society: Series B, 82, 445–465.
- S. Yang, K. Pieper, and F. Cools (2020). Semiparametric estimation of structural failure time model in continuous-time processes, Biometrika, 107, 123-136.
- N. Corder* and S. Yang (2020). Estimating average treatment effects utilizing fractional imputation when confounders are subject to missingness, Journal of Causal Inference, 8, 249-271.
- L. Dong*, E. Laber, Y. Goldberg, R. Song, S. Yang (2020). Ascertaining properties of weighting in the estimation of optimal treatment regimes under monotone missingness, Statistics in Medicine, doi: 10.1002/sim.8678.
- S. Yang and P. Ding (2020). Combining multiple observational data sources to estimate causal effects, Journal of the American Statistical Association, 115, 1540–1554.
- S. Yang and J. K. Kim (2020). Statistical data integration in survey sampling: a review, Japanese Journal of Statistics and Data Science, 3, 625–650.
- W. Li*, S. Yang+, and P. Han (2020). Robust estimation for moment condition models with data missing not at random, Journal of Statistical Planning and Inference, 207, 246–254.
- S. Yang and J. K. Kim (2020). Asymptotic theory and inference of predictive mean matching imputation using a superpopulation model framework. Scandinavian Journal of Statistics, 47, 839–861.
- S. Chen, S. Yang, and J.K. Kim (2020). Nonparametric mass imputation for data integration. Journal of Survey Statistics and Methodology, doi.org/10.1093/jssam/smaa036.
- S. Yang (2019). Book reviews: Flexible imputation of missing data, 2nd ed. Journal of the American Statistical Association, 114, 1421–1421.
- S. Yang, L. Wang, and P. Ding (2019). Causal inference with confounders missing not at random, Biometrika, 106, 875–888.
- S. Yang and D. Zeng (2018). Discussion on penalized spline of propensity methods for treatment comparison by Zhou, Elliott and Little, Journal of the American Statistical Association, 114, 30–32.
- S. Yang and J. J. Lok (2018). Sensitivity analysis for unmeasured confounding in coarse structural nested mean models, Statistica Sinica, 28, 1703–1723.
- S. Yang (2018). Propensity score weighting for causal inference with clustered data, Journal of Causal Inference, doi.org/10.1515/jci-2017-0027.
- S. Yang and J. K. Kim (2018). Nearest neighbor imputation for general parameter estimation in survey sampling, Advances in Econometrics, 39, 211–236.
- S. Yang and P. Ding (2018). Asymptotic inference of causal effects with observational studies trimmed by the estimated propensity scores, Biometrika, 105, 487–493. [arxiv]
- Z. Wang, J. K. Kim, and S. Yang (2018). An approximate Bayesian inference under informative sampling, Biometrika, 105, 91–102.
- J. Lok, S. Yang, B. Sharkey, and M. Hughes (2018). Estimation of the cumulative incidence function under multiple dependent and independent censoring mechanisms, Lifetime Data Analysis, 24, 201–223.
- S. Yang, A. A. Tsiatis, and M. Blazing (2018). Modeling survival distribution as a function of time to treatment discontinuation: a dynamic treatment regime approach, Biometrics, 74, 900–909.
- S. Yang and J. K. Kim (2017). A semiparametric inference to regression analysis with missing covariates in survey data, Statistica Sinica, 27, 261–285.
- J. K. Kim and S. Yang (2017). A note on multiple imputation under complex sampling, Biometrika, 104, 221–228.
- S. Yang and J. K. Kim (2017). Discussion: dissecting multiple imputation from a multi-phase inference perspective: what happens when god’s, imputer’s and analyst’s models are uncongenial? by X. Xie and X. L. Meng, Statistica Sinica, 27, 1568–1573.
- S. Yang and J. J. Lok (2016). A goodness-of-fit test for structural nested mean models, Biometrika, 103, 734–741.
- S. Yang and J. K. Kim (2016). Fractional imputation in survey sampling: a comparative review, Statistical Science, 31, 415–432.
- S. Yang, G. Imbens, Z. Cui, D. Faries, and Z. Kadziola (2016). Propensity score matching and stratification in observational studies with multi-level treatments, Biometrics, 72, 1055–1065. R package available: “multilevelMatching”.
- S. Yang and J. K. Kim (2016). A note on multiple imputation for method of moments estimation, Biometrika, 103, 244–251.
- S. Yang and J. K. Kim (2015). Likelihood-based inference with missing data under missing-at-random, Scandinavian Journal of Statistics, 43, 436–454.
** Winner of the 2014 JSM Student Paper Competition Award
- L. Peyer, G. Welk, L. B. Davis, S. Yang, and J. K. Kim (2015). Factors associated with parent concern for child weight and parenting behaviors, Childhood Obesity, 11, 269–274. (Collaboration work)
- S. Yang and Z. Zhu (2015). Variance estimation and kriging prediction for a class of non-stationary spatial models, Statistica Sinica, 25, 135–149.
- J. K. Kim and S. Yang (2014). Fractional hot deck imputation for robust estimation under item nonresponse in survey sampling, Survey Methodology, 40, 211–230.
- J. K. Kim, Z. Zhu, and S. Yang (2013). Improved estimation for June Area Survey incorporating several information, Proceedings 59th ISI World Statistics Congress, Hong Kong, China, 199–204.
- S. Yang, J. K. Kim and D. W. Shin (2013). Imputation methods for quantile estimation under missing at random, Statistics and Its Interface, 6, 369–377.
- S. Yang, J. K. Kim and Z. Zhu (2013). Parametric fractional imputation for mixed models with nonignorable missing data, Statistics and Its Interface, 6, 339–347.
** Winner of the 2024 ICSA Student Paper Award
** Winner of the 2023 ASA BIOP RISW Student Travel Award
** Winner of the 2024 ENAR RAB Student Poster Award Competition
Submitted papers
- A. B. Giffin*, B. J. Reich, S. Yang, and A. Rappold. Instrumental variables, spatial confounding and interference. [arxiv]
- S. Yang and Z. Zhu. Semiparametric estimation of spectral density and variogram with irregular observations. [arxiv]
- S. Xu*, S. Yang, B. J. Reich. A Bayesian non-parametric method for estimating causal quantile effects.
- X. Tan*, S. Yang, W. Ye, D. E. Faries, I. Lipkovich, Z. Kadziola. When doubly robust methods meet machine learning for estimating treatment effects from real-world data: a comparative study. [arxiv]
- T. Hong, W. Lu, S. Yang, and P. Ghosh. Multivariate choice models with irregularly spaced longitudinal observations: application to the lockdown effect on consumer behaviors. [arxiv]
- Y. Zhang, D. Kong, and S. Yang+. Towards R-learner of conditional average treatment effects with a continuous treatment: T-identification, estimation, and inference. [arxiv]
- Y. Zhang and S. Yang+. Semiparametric localized principal stratification analysis with continuous strata. [arxiv]
- T. Xu*, Y. Zhang, and S. Yang. Augmented match weighted estimators for average treatment effects.
- C. Cui, S. Yang, B. J. Reich, and D. Gill. Matching estimators of causal effects in clustered observational studies with application to quantifying the impact of marine protected areas on biodiversity. [arxiv]
- P. Zhao, J. Josse, and S. Yang. Efficient and robust transfer learning of optimal individualized treatment regimes with right-censored survival data. [arxiv]
- D. Faries, C. Gao, X. Zhang, C. Hazlett, J. Stamey, S. Yang, et al. Real effect or bias? Best practices for evaluating the robustness of real-world evidence through quantitative sensitivity analysis for unmeasured confounding. [Authorea]
- P. Ding, Y. Fang, D. Faries, S. Gruber, W. He, H. Lee, J.Y. Lee, P. Mishra-Kalyani, M. Shan, M. van der Laan, S. Yang, and X. Zhang (authors listed in an alphabetical order). Sensitivity analysis for unmeasured confounding in medical product development and evaluation using real world evidence. [arxiv]
- Z. Wang, S. Yang and J.K. Kim. Multiple bias-calibration for adjusting selection bias of non-probability samples using data integration. [arxiv]
- S. Yang and P. Ding. Two-phase rejective sampling. [arxiv]
** Winner of the 2023 ASA Section on Nonparametric Statistics Student Paper Award
Thesis
- S. Yang (2014). Fractional imputation method of handling missing data and spatial statistics. Iowa State University. [Link]