Research

See the Publications tab for a complete list of my work.

Collaborative work

Cystic fibrosis

Cystic fibrosis (CF) is a progressive genetic disease caused by mutations in the CF transmembrane conductance regulator (CFTR) gene, which lead to dysfunction of the CFTR protein. When this protein malfunctions, it cannot help move chloride to the cell surface. Without chloride to attract water to the cell surface, the mucus in various organs becomes thick and sticky. In the lungs, the mucus clogs the airways and traps germs, leading to infections, inflammation, respiratory failure, and other complications. In the pancreas, the buildup of mucus prevents the release of digestive enzymes that help the body absorb food and key nutrients, resulting in malnutrition and poor growth. In the liver, the thick mucus can block the bile duct, causing liver disease. In men, CF can lead to infertility. Historically, CF has been extremely deadly, and the median life expectancy has exceeded 30 years only in the last few decades, thanks to advances in clinical care and treatment. Although therapies have progressed remarkably, much work remains. The ultimate goals for CF researchers are to keep CF from preventing people from living fulfilling lives and, eventually, to find a cure.

Erythropoietin and infant neurodevelopment

The Preterm Erythropoietin Neuroprotection (PENUT) Trial (NCT01378273) is a randomized, multi-center, placebo-controlled phase III clinical trial that enrolled 940 infants born between 24 and 28 weeks of gestation at 19 centers across the US. The primary goal of the PENUT Trial is to assess whether high doses of erythropoietin (Epo) improve survival without neurodevelopmental impairment at 2 years of corrected age (Juul et al., 2020). We also conducted post hoc analyses of the relationships among Epo, packed red blood cell transfusions, iron intake, and neurodevelopmental outcomes in surviving infants (Juul et al., 2020; Vu et al., 2021). The results from the PENUT Trial have provided insights into current clinical practices and have informed practical decision making by clinical care teams at neonatal intensive care units (NICUs) across the US.

Iron deficiency in preterm infants

Iron sufficiency plays an important role in neonatal brain development. Accurate assessment of iron status is critical to supplementing infants in a timely and appropriate manner. Ferritin and the zinc protoporphyrin-to-heme ratio (ZnPP/H) are two common measures of iron status in neonates. However, it is not currently known how these two biomarkers are affected by clinical events, especially in extremely premature neonates. We conducted a retrospective study of infants admitted to the University of Washington NICU to establish the association between the two biomarkers and to evaluate how they are affected by clinical inflammation or erythropoiesis-stimulating agents. The findings contribute to the understanding of iron deficiency in premature infants and suggest that current iron supplementation guidelines can be improved (German et al., 2017). We extended these findings by examining the associations of these markers with neurodevelopmental outcomes and by assessing the potential of reticulocyte hemoglobin as another early marker of iron sufficiency (German et al., 2019).

Methodological work

Dimension reduction for spatially-misaligned multivariate air pollution data with missing observations

Outdoor air pollution presents a major health risk for a large share of the population. These exposures can vary greatly across space and time, but they are typically measured at only a small subset of times and locations. To analyze the health impacts of exposures at the locations where people live, statistical methods are needed to characterize and predict exposures where there is no monitor to collect real-time information. Although pollutants have historically been modeled separately, ambient air is a complex mixture of many different components that can interact with one another. As a result, modeling air pollution exposures is a challenging problem: statistical models need to accommodate spatially irregular data, handle multi-dimensional data, and be computationally practical for large datasets. We developed a novel principal component analysis algorithm that reduces the dimension of multi-pollutant exposures into representative components while simultaneously improving prediction accuracy and handling missing observations (Vu et al., 2020). We also tackled these challenges from a different analytic angle, using advanced convex optimization techniques from statistical machine learning. The proposed method offers a simple yet elegant algorithm that is easy to implement, computationally efficient, and potentially scalable to large air pollution datasets (Vu et al., 2021). We plan to extend these methods to the setting of misaligned and high-dimensional spatiotemporal data.