Harvard Chan Bioinformatics Core: 2017 Year-End Update

Happy holidays from the Harvard Chan Bioinformatics Core! This year we welcomed two new members. Dr. Michael Steinbaugh joined us from the Joslin Diabetes Center as a Research Associate with expertise in the biology of aging, transcription and metabolism. After a productive summer internship with us, Kayleigh Rutherford is now a full-time Research Assistant on our team. We were also delighted to celebrate the promotions of Dr. Lorena Pantano to Research Scientist and Dr. Brad Chapman to Senior Research Scientist.

We continued to provide bioinformatics consulting services to Harvard-affiliated investigators through individual collaborations as well as through our ongoing relationships with Harvard Catalyst, the Harvard Stem Cell Institute (HSCI), Harvard Medical School Tools and Technology (TnT), the Harvard NeuroDiscovery Center (HNDC), the National Institute of Environmental Health Sciences (NIEHS), and the Center for AIDS Research (CFAR). This year, our consultants collaborated on 14 published papers and supported 43 grant applications and 149 consult requests.

It was also a busy year for our training team, who continued to offer our popular short workshops and in-depth courses focused on empowering researchers to perform their own NGS analyses. These included two long workshops on best practices for RNA-seq, ChIP-seq and variant calling analysis; four introductory R workshops; three workshops introducing RNA-seq and high-performance computing; and an introduction to Differential Gene Expression Analysis. We were excited to launch a monthly workshop series entitled “Current topics in Bioinformatics”. We also took part in three Catalyst workshops, participated in the development of the Catalyst Online “Introduction to ‘Omics” course, and provided assistance to HSPH researchers at the FAS RC Office Hours. It was our pleasure to host two interns from the Human Heredity and Health in Africa (H3Africa) Initiative: Amel Ghouila (Tunisia) and David Adjei (Ghana) spent three months working with Dr. Marc Lipsitch and Dr. Peter Kraft, respectively.

The core continued development of the bcbio platform with Harvard and industry partners. bcbio is a Python toolkit that provides best-practice pipelines for fully automated high-throughput sequence analysis. Highlights from this year include significant improvements to our single-cell analysis pipeline, enabling us to handle millions of single cells, and expansion of bcbio to handle digital gene expression (DGE) data. We improved our variant detection methods for difficult somatic variant cases like FFPE and ctDNA, incorporated new germline-based calling methods (including the freely available GATK4 due in January), and expanded where bcbio runs using community workflow representations (bcbio now runs on DNAnexus, Arvados and Seven Bridges). Our researchers also developed three new R packages that provide quality control plotting tools and analysis for bulk RNA-seq (bcbioRNASeq, published in F1000), single-cell RNA-seq (bcbioSingleCell) and small RNA-seq (bcbioSmallRna).

Lastly, we continued to work with the community to develop best practices, benchmarks and interoperable workflow tools, both through the GA4GH working groups and Open Bioinformatics Codefest events. Thank you to all our fellow researchers for making this an exciting and productive year, and all the best for 2018!