Biobank-scale human genetics

Population Genetics

Research on scalable whole-genome sequencing analysis, rare-variant association testing, functional annotation, and genetic discovery in diverse populations.

Overview

Large whole-genome sequencing studies now measure rare coding and noncoding variation at a scale that requires statistical, computational, and data-infrastructure advances. My work develops and applies annotation-informed methods that increase power, preserve rigorous control of population structure and relatedness, and make variant interpretation practical for large consortia.

  • STAAR and STAARpipeline for annotation-informed rare-variant association testing.
  • metaSTAAR for powerful and resource-efficient meta-analysis across large WGS and WES studies.
  • MultiSTAAR and cellSTAAR for multi-trait and single-cell-informed rare-variant analysis.
  • Cross-consortium applications in TOPMed and GSP, including studies of lung cancer, sleep, and lipids, and in multi-ancestry resources.
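
As a rough illustration of what "annotation-informed" testing can involve, one common building block is to run the same rare-variant test under several functional-annotation weightings and then merge the resulting p-values into one. The sketch below uses the Cauchy combination rule for that merge. It is a minimal example under stated assumptions, not the STAAR implementation: the function name `cauchy_combine` and the input p-values are hypothetical.

```python
import math


def cauchy_combine(p_values, weights=None):
    """Combine p-values with the Cauchy combination rule.

    Each p-value is mapped to a standard Cauchy variate, the variates are
    averaged with the given weights, and the average is mapped back to a
    p-value. Hypothetical helper written for illustration only.
    """
    if not p_values:
        raise ValueError("need at least one p-value")
    if weights is None:
        weights = [1.0] * len(p_values)  # equal weights by default
    if len(weights) != len(p_values):
        raise ValueError("weights and p_values must have the same length")
    if any(not (0.0 < p < 1.0) for p in p_values):
        raise ValueError("p-values must lie strictly between 0 and 1")
    if any(w < 0.0 for w in weights) or sum(weights) <= 0.0:
        raise ValueError("weights must be non-negative with a positive sum")

    total = sum(weights)
    # Weighted average of the Cauchy-transformed p-values.
    stat = sum(
        (w / total) * math.tan((0.5 - p) * math.pi)
        for w, p in zip(weights, p_values)
    )
    # Upper tail of the standard Cauchy distribution at `stat`.
    return 0.5 - math.atan(stat) / math.pi


# Made-up p-values for one region, one per annotation weighting.
annotation_p_values = [0.01, 0.20, 0.65]
combined_p = cauchy_combine(annotation_p_values)
```

The rule needs only the individual p-values, not their joint distribution, which keeps the combination step cheap when many regions and many weightings are tested.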

Publications

Related work

Representative publications connected to this project.

2026

Scalable and accurate rare-variant association tests for whole genome sequencing time-to-event analysis in large biobanks.

Song S, Li X, Zhou H, Li Z, Lin X.

Proc Natl Acad Sci U S A. 2026;123(9):e2525288123.

2025

A statistical framework for multi-trait rare variant analysis in large-scale whole-genome sequencing studies.

Li X, Chen H, Selvaraj MS, Van Buren E, Zhou H, Wang Y, Sun R, McCaw ZR, Yu Z, Jiang MZ, DiCorpo D, Gaynor SM, Dey R, Arnett DK, Benjamin EJ, Bis JC, Blangero J, Boerwinkle E, Bowden DW, Brody JA, Cade BE, Carson AP, Carlson JC, Chami N, Chen YI, Curran JE, de Vries PS, Fornage M, Franceschini N, Freedman BI, Gu C, Heard-Costa NL, He J, Hou L, Hung YJ, Irvin MR, Kaplan RC, Kardia SLR, Kelly TN, Konigsberg I, Kooperberg C, Kral BG, Li C, Li Y, Lin H, Liu CT, Loos RJF, Mahaney MC, Martin LW, Mathias RA, Mitchell BD, Montasser ME, Morrison AC, Naseri T, North KE, Palmer ND, Peyser PA, Psaty BM, Redline S, Reiner AP, Rich SS, Sitlani CM, Smith JA, Taylor KD, Tiwari HK, Vasan RS, Viali S, Wang Z, Wessel J, Yanek LR, Yu B, NHLBI TOPMed Consortium, Dupuis J, Meigs JB, Auer PL, Raffield LM, Manning AK, Rice KM, Rotter JI, Peloso GM, Natarajan P, Li Z, Liu Z, Lin X.

Nat Comput Sci. 2025;5(2):125-143.

2026

cellSTAAR: incorporating single-cell-sequencing-based functional data to boost power in rare variant association testing of noncoding regions.

Van Buren E, Zhang Y, Li X, Selvaraj MS, Li Z, Zhou H, Palmer ND, Arnett DK, Blangero J, Boerwinkle E, Cade BE, Carlson JC, Carson AP, Chen YI, Curran J, Duggirala R, Fornage M, Franceschini N, Graff M, Gu C, Guo X, He J, Heard-Costa N, Hou L, Hung YJ, Kalyani RR, Kardia SLR, Kenny E, Kooperberg C, Kral BG, Lange L, Levy D, Li C, Liu S, Lloyd-Jones D, Loos RJF, Manichaikul AW, Martin LW, Mathias R, Minster RL, Mitchell BD, Mychaleckyj JC, Naseri T, North K, O'Connell J, Perry JA, Peyser PA, Psaty BM, Raffield LM, Vasan RS, Redline S, Reiner AP, Rich SS, Smith JA, Spitzer B, Tang H, Taylor KD, Tracy R, Viali S, Yanek L, Zhao W, NHLBI TOPMed Consortium, Rotter JI, Peloso GM, Natarajan P, Lin X.

Nat Methods. 2026;23(2):338-349.