Academic job market candidate

Hufeng Zhou, PhD

I build statistical methods, functional annotation resources, and production-quality software for understanding human genetic variation and disease mechanisms at whole-genome scale.

45+ peer-reviewed publications
15+ years in computational biology
3 editorial service roles
CV and publications updated 2026

Profile

Computational biology for genomic medicine

My work connects statistical genetics, functional genomics, and robust software systems so large-scale sequencing studies can move from raw variants to biological insight.

Research Scientist, Harvard T.H. Chan School of Public Health; former Instructor, Harvard Medical School and Brigham and Women's Hospital.

I focus on annotation-informed rare-variant association methods, functional annotation resources such as FAVOR, EBV-associated cancer epigenomics, biomedical AI agents, AI-driven genomic and pathology data science, host-pathogen computational biology, and software infrastructure that helps research consortia analyze large sequencing datasets.

Statistical genetics, Whole-genome sequencing, Functional annotation, Biomedical AI agents, AI and digital pathology, EBV epigenomics, Scientific software

Biomedical AI agents

Local, auditable agents for genomic discovery

I am creating AI agents for biomedical research applications, including IGVFagent, a local and auditable agent for discovering, retrieving, and analyzing data from the IGVF ecosystem and related public resources.

IGVFagent is designed to work across the IGVF Portal, Catalog, Knowledge Graph, ENCODE, FAVOR, publications, assay data, and analysis tools while keeping the reasoning path inspectable through a Plan → Action → Results → Evaluation loop.

[Diagram: IGVFagent (local, auditable, evidence-aware) connects the IGVF Portal, IGVF Catalog, Knowledge Graph, ENCODE, and FAVOR through a Plan → Action → Results → Evaluation loop.]

Variant scoring and interpretation
Multi-omic integration and enhancer-gene mapping
Fine-mapping, trajectory inference, and cross-tissue analysis
Evidence cross-checking, provenance, and human feedback

Research

A coherent program from methods to mechanisms

These areas connect statistical genetics, functional annotation, AI, epigenomics, software infrastructure, and host-pathogen systems biology.

Population genetics and rare variants

Scalable methods for large whole-genome sequencing studies, with emphasis on rare-variant association testing, noncoding regions, multi-trait analysis, time-to-event outcomes, and biobank-scale inference.

Variant annotation infrastructure

FAVOR, FAVOR 2.0, FAVORannotator, and FAVOR-GPT translate genome-wide functional annotation into searchable resources and analysis-ready formats for human genetics.

EBV epigenomics and gene regulation

Integrative genomic studies of Epstein-Barr virus transcriptional regulation, super-enhancers, enhancer RNAs, viral oncoproteins, and host chromatin architecture.

AI agents, genomics, and digital pathology

Local, auditable AI agents plus machine-learning and deep-learning systems for variant interpretation, multi-omic integration, WGS quality control, and H&E pathology risk modeling.

Host-pathogen computational biology

Computational approaches for protein-protein interaction prediction, pathway data integration, microbial systems biology, and molecular diagnostic collaborations.

Scientific software and reproducible pipelines

Open software, AI agents, data resources, and analysis pipelines that turn statistical methods into practical tools for large consortia and biomedical collaborators.

Li X, Chen H, Selvaraj MS, Van Buren E, Zhou H, Wang Y, Sun R, McCaw ZR, Yu Z, Jiang MZ, DiCorpo D, Gaynor SM, Dey R, Arnett DK, Benjamin EJ, Bis JC, Blangero J, Boerwinkle E, Bowden DW, Brody JA, Cade BE, Carson AP, Carlson JC, Chami N, Chen YI, Curran JE, de Vries PS, Fornage M, Franceschini N, Freedman BI, Gu C, Heard-Costa NL, He J, Hou L, Hung YJ, Irvin MR, Kaplan RC, Kardia SLR, Kelly TN, Konigsberg I, Kooperberg C, Kral BG, Li C, Li Y, Lin H, Liu CT, Loos RJF, Mahaney MC, Martin LW, Mathias RA, Mitchell BD, Montasser ME, Morrison AC, Naseri T, North KE, Palmer ND, Peyser PA, Psaty BM, Redline S, Reiner AP, Rich SS, Sitlani CM, Smith JA, Taylor KD, Tiwari HK, Vasan RS, Viali S, Wang Z, Wessel J, Yanek LR, Yu B, NHLBI TOPMed Consortium, Dupuis J, Meigs JB, Auer PL, Raffield LM, Manning AK, Rice KM, Rotter JI, Peloso GM, Natarajan P, Li Z, Liu Z, Lin X.

Nat Comput Sci. 2025;5(2):125-143.

Software and resources

Tools that make genome-scale studies usable

FAVOR, FAVORannotator, STAAR, STAARpipeline, WGSagent, and cellSTAAR share a consistent thread: methods that are not only statistically rigorous but also usable by large collaborations.
