Research


Data Combination and Integrative Analyses

This research aligns closely with the 21st Century Cures Act, passed in 2016, which placed additional focus on the use of big real-world data (RWD) to support decision making and precision medicine. The availability of multiple data sources, namely randomized clinical trials (RCTs) and RWD, presents novel opportunities for medical research, because the knowledge gained from integrative analyses cannot be obtained from any single-source analysis alone. Our research aims to bridge RCTs and the vast real-world databases and registries arising from clinical practice, in order to better understand how treatments work in real-world and under-studied patient populations and to provide accurate and reliable evidence for patient-centered care.

– In collaboration with Duke School of Medicine
– Supported by National Institute on Aging


Spatial and Temporal Causal Inference

Emerging smartphone applications provide unprecedented opportunities to study user behavior and health outcomes. They also present novel and pervasive challenges, such as time-varying confounding, heterogeneous treatment effects over a large, environmentally diverse domain, and informative non-response. We aim to develop a causal inference framework for studying the relationship between interventions and health outcomes using such mobile health data.

– In collaboration with US Environmental Protection Agency
– Supported by National Institute of Environmental Health Sciences


Causal Inference with Spatial Data

Research progress in causal inference with spatial data has been slow because of the complexities induced by spatial correlation and interference, i.e., the treatment at one location can affect the response at nearby locations. Our research aims to fill this critical gap by developing a suite of spatial causal inference methods that can be applied to wide-ranging public-health problems involving spatial data.

– In collaboration with US Environmental Protection Agency
– Supported by National Institute of Environmental Health Sciences


Transform Real-World Data to Evidence

Many important questions in chronic diseases and cancer concern the effects of treatments, e.g., approving drugs, implementing health policies, or identifying optimal personalized treatment strategies. The answers to these questions often rely on complex real-world data that suffer from confounding, non-compliance, drop-outs, missing values, irregular visit patterns, and other complications. My research develops innovative statistical methods for making accurate inferences about treatment effects from complex observational and clinical studies, including marginal structural models, structural nested models, inverse probability weighting, matching methods, and semiparametric methods. We apply these methods in cardiovascular disease, HIV infection, and cancer research to identify effective treatment strategies.

– In collaboration with the AIEDRP study group, the international GARFIELD registry study group, and real-world analytics at Eli Lilly and Company
– Partly supported by NCSU, ORAU, and National Science Foundation


Theory and Methods for Handling Missing Data

Missing data are frequently encountered in various disciplines. It is important, and often critical, to handle missing data properly to avoid introducing bias into the data analysis. My research aims to provide a principled statistical framework for handling missing data, including identification, matching theory and algorithms, multiple imputation, fractional imputation, and weighting.



We gratefully acknowledge grant support from