HBC’s Current Topics in Bioinformatics – 4/27

Harvard Bioinformatics Core’s Current Topics in Bioinformatics Workshop

Accessing public experimental NGS sequencing data
and associated genomic reference data

Friday, April 27th | 1-4pm | HSPH’s FXB Building, Room G12

For many types of next-generation sequencing (NGS) analyses, we need access to data stored in various public databases and repositories. This workshop will explore how to find and download publicly available sequencing data, such as data from published papers (FASTQ files and count matrices), from the Gene Expression Omnibus (GEO) and Sequence Read Archive (SRA) repositories. The workshop will also cover public databases such as Ensembl, NCBI, and UCSC, the types of genomic reference data available from them, and how to download these data.

During this workshop we will access data using both a web browser and the command-line interface (shell/Linux/UNIX); beginner-level knowledge of the command-line interface is therefore a prerequisite. Experience using a high-performance computing cluster is a bonus but not required.