Research

My current research interests are in three major areas:

  • Developing novel deep learning algorithms to better annotate and classify pathogenic non-coding variants by integrating functional and epigenomic data, and applying these methods to studies of multiple important human diseases (e.g., small cell lung cancer, nasopharyngeal carcinoma, EBV-associated lymphoproliferative disorders).
  • Developing a comprehensive functional annotation database for genetic variants (including protein function, conservation, epigenetics, microRNAs, integrative scores, etc.) to better understand and study genetic variants in WGS/WES/GWAS association studies; it is especially useful for improving set-based tests.
  • Developing QC methods for large-scale WGS data (including variant filters based on HWE, call rate, LCR, etc., and sample exclusion based on call rate, coverage, Ti/Tv ratio, het/hom ratio, etc.) to normalize the data and remove batch effects from different sequencing centers as well as potential sequencing errors, so as to better facilitate studies of large-scale WGS data.
  • Exploring the fundamental epigenetic mechanisms of EBV-associated cancers using next-generation sequencing technologies, including whole genome/exome sequencing (WGS/WES), ChIP-seq, RNA-seq, ATAC-seq, GRO-seq, Hi-C, ChIA-PET, 4C-seq, etc.
  • Investigating the regulatory effects of EBV super-enhancers through non-coding RNAs (ncRNAs), especially enhancer RNAs (eRNAs), and long-range interactions (ChIA-PET and 4C-seq), and building a comprehensive eRNA regulatory network that plays an important role in the formation of EBV-associated lymphomas. I am also working on how enhancers and super-enhancers regulate microRNAs through long-range interactions.
  • Using 3D genomics (Hi-C and ChIA-PET) and functional information (Gene Ontology and pathways) to identify disease-related variants (SNPs, INDELs, CNVs, etc.) among common variants, and to illuminate how they influence gene expression and phenotype through integrative genomics and machine learning.

In addition, I have years of research experience in protein-protein interaction (PPI) studies; pathway data integration and analysis; human microbiome metagenomic data analysis; data mining and machine learning applications on clinical data; and software and database development based on these research areas. My experience with PPI and pathway networks serves as an important source of creativity in my current genomic/epigenomic research.

 

Email: hzhou@hsph.harvard.edu, hufengzhou@g.harvard.edu, hzhou@broadinstitute.org

Phone Number: 617-903-8599