Bioinformatics Core

Event Archive

Past Bioinformatics Core Forums:

Watch archived presentations

11/16/11 - Fiona Brinkman, "Metagenomics Analysis of Watershed Microbiomes - Toward Improved Water Quality Monitoring"

5/5/11 - Gregory A. Petsko, "The Coming Epidemic of Age-Related Neurological Disease and What We're Trying to Do About It"

3/24/11 - Gary Bader, "Pathway Analysis of Genomics Data"

2/10/11 - Mark Boguski, "Health Communication at the Nexus of Social Media and Popular Culture"

Part 1.  12/09/10 - Manolis Kellis "Evolutionary, Chromatin and Activity Signatures for Interpreting the Human Genome and Disease Variants"

Part 2.  12/09/10 - Manolis Kellis "Evolutionary, Chromatin and Activity Signatures for Interpreting the Human Genome and Disease Variants" (part 2)

5/11/10 - Atul Butte "Personal Genomics: Exploring Systems Medicine Through Translational Bioinformatics"

2/16/10 - Owen White "Genome Bioinformatics and its potential application to epidemiology and medicine"

11/10/09 - Fritz Roth, "A Systems Genetics Sampler"

10/23/09 - Susanna-Assunta Sansone, "Standards and Infrastructure for Managing Experimental Metadata"

9/22/09 - Curtis Huttenhower, "Data mining for functional genomics and metagenomics"

 

Wednesday, November 16 2011
FXB G13, 12:30-2:00pm

Fiona Brinkman
Professor and Michael Smith Foundation for Health Research Senior Scholar,
Department of Molecular Biology and Biochemistry,
Simon Fraser University, Greater Vancouver, BC, Canada

"Metagenomics Analysis of Watershed Microbiomes - Toward Improved Water Quality Monitoring"


Access to clean drinking water is critical for maintaining public health. Currently, water quality is primarily assessed at the tap using approaches such as coliform counts that fail to detect the complete spectrum of water pathogens, or are too slow to be used as tools for real-time decision-making. What is required is water quality assessment that is more accurate, monitors upstream of the tap to identify problems sooner, and provides a method for identifying and remediating microbial pollution events. We are using a metagenomic approach to measure the health of pollution-impacted vs protected watersheds, plus potential pollution sources, over space and time. This first in-depth study of watershed microbiomes is comparing both shotgun and amplicon sequencing of protist, bacterial and viral sequences. Our overall aim is to develop novel molecular tools for the detection of a wider range of microbes which better reflect watershed health and facilitate microbial pollution source tracking. Initial data will be presented, along with new bioinformatics tools being developed to characterize further the gene content of the microbiomes and improve statistical analysis of molecular pathways (the latter relevant for analysis of any organism's pathways, including human).

Thursday, May 5 2011
FXB G12, 12:00-1:30pm

Gregory A. Petsko
Gyula and Katica Tauber Professor of Biochemistry and Chemistry, and Chair,
Department of Biochemistry, Brandeis University;
Adjunct Professor, Department of Neurology and Center for Neurologic Diseases,
Brigham & Women's Hospital and Harvard Medical School;
Associate Member, Tufts-NEMC Cancer Center

"The Coming Epidemic of Age-Related Neurological Disease and What We're Trying to Do About It"


A “perfect storm” of neurodegenerative disease is brewing across the globe, with hot spots in the U.S., Western Europe, China and Japan. Its threat is gradually rising, fueled by aging societies, a glacial and hugely expensive drug development process, and the absence of any real remedies in sight.  The forecast goes like this: in less than 50 years, more than a quarter of the world's population is going to be at least 65 years old – a staggering first in human history. Since age is the biggest risk factor for Alzheimer’s (AD) and Parkinson’s (PD), people will become increasingly vulnerable to these diseases (and others like stroke) over time. Today, about 4.5 million people in the US have Alzheimer’s; in two decades that number will triple. About 60,000 people are diagnosed with PD each year; currently, 1.5 million Americans have it.  Five million will have it by 2050.  As a new approach to solving the growing problem of this next epidemic, we have established a Structural Neurology initiative to identify, characterize and develop drugs for proteins related to neurodegenerative diseases.

A number of observations make such an approach attractive and potentially effective.  The first is that different neurodegenerative diseases have some common features, such as the presence in the brain of protein aggregates that are thought to be at the root of neuronal death.  Second, the proteins found in aggregates are involved because of an inherent instability, causing their unfolding or inadequate folding.  The approach to prevention that we are taking is to target such proteins and develop pharmacological chaperones - sort of molecular Velcro - to stabilize them.  The approach is best demonstrated by our success, in collaboration with Amicus Therapeutics, Inc., on the enzyme glucocerebrosidase, mutations in which lead to Gaucher's disease.  We and others have shown that most of the mutations destabilize the protein, leading to premature degradation or improper trafficking in the cell.  Although enzyme replacement therapy has improved the health of some affected individuals, oral treatment with pharmacological chaperones is as effective and reaches far more organ systems.  The design of small molecules that stabilize unstable proteins may be a general therapeutic strategy for a number of diseases caused by protein misfolding and mistrafficking, such as the age-related neurologic disorders.

Thursday, March 24 2011
FXB G12, 12:00-1:30pm

 Gary Bader, Ph.D.
Assistant Professor,
The Donnelly Centre, University of Toronto
 

"Pathway Analysis of Genomics Data"


The 'active cell map' is the set of all interactions, complexes and pathways involving molecules in the cell and their activity under normal and diseased regulatory circumstances. We are developing novel computational methods to combine molecular networks and profiles to uncover active cell map regions. We are interested in discovering which processes are active in a given tissue and which are differentially active between disease and non-disease tissues. I will focus on a novel statistical analysis and ‘active pathway’ visualization method we developed for application to the study of rare copy number variation associated with autism spectrum disorder. I will also describe software resources we are developing to help with pathway and network analysis of genomics data: Cytoscape, Pathway Commons and GeneMANIA.

 

Thursday, February 10 2011
Kresge G3, 12:00-1:30pm

 Mark Boguski M.D., Ph.D.
Associate Professor
Center for Biomedical Informatics
Harvard Medical School

"Health Communication at the Nexus of Social Media and Popular Culture"


In its report Healthy People 2020, DHHS states that one of its major objectives is to use communication strategically to improve health and that one of the ways in which this can be done is through images of health in the media and popular culture. Health information campaigns have traditionally relied on mass communication (such as public service announcements on billboards, radio and television) and educational messages in printed materials. However, fueled by social networking technologies and the emergence of “participatory medicine,” the ways in which consumers find and use health information are undergoing dramatic change.

Based on new insights into the theory and operational characteristics of “teachable moments”, and novel adaptations of theoretical models of health behavior change, we have created a multi-channel platform to systematically create and distribute Teachable Moments in Medicine® using blogs, Facebook and Twitter. This system has the potential to educate and inform millions of consumers in a cost-effective manner since three-fourths of all Americans are online and virtually all take some interest in popular culture. The system has also proven popular among professional healthcare providers as a new mode of communication and understanding with their patients.

 

Thursday, December 9 2010
FXB G12, 12:00-1:30pm

Manolis Kellis, Ph.D.
Associate Professor of Computer Science, MIT
Computer Science and Artificial Intelligence Laboratory
 Broad Institute of MIT and Harvard

 "Evolutionary, Chromatin and Activity Signatures for Interpreting the Human Genome and Disease Variants"

 
Our group aims to further our understanding of the human genome by computational integration of large-scale functional and comparative genomics datasets. (1) Using multiple closely related species, we defined ‘evolutionary signatures’ for the systematic discovery of protein-coding genes, RNA structures, microRNAs, developmental enhancers, regulatory motifs, and biological networks. (2) Using epigenomics datasets of multiple chromatin marks across the complete genome, we defined ‘chromatin signatures’ that reveal numerous classes of promoter, enhancer, transcribed, and repressed regions, each with distinct functional properties. (3) Using diverse functional datasets across many cell types, we defined multi-cell ‘activity signatures’ for chromatin states, regulator expression, motif enrichment, and target gene expression, and used their correlations to link enhancers to target genes, infer activators and repressors, and predict and validate functional regulator binding. Together, these three signatures help elucidate regulatory circuits in the human and fly genomes, reveal new insights on animal gene regulation and development, and revisit previously uncharacterized disease-associated variants from genome-wide association studies, providing mechanistic insights into their likely molecular roles.

 

 

Thursday, October 21, 2010

FXB G13, 12:30-2:00pm

Franziska Michor

Associate Professor
Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute
Department of Biostatistics, Harvard School of Public Health

“Evolutionary Dynamics of Cancer”


Cancer emerges due to an evolutionary process in somatic tissue. The fundamental laws of evolution can best be formulated as exact mathematical equations. Therefore, the process of cancer initiation and progression is amenable to mathematical and computational investigations. I will discuss current interests in the lab pertaining to the dynamics of cancer initiation, progression, response to treatment, and evolution of resistance.

 

Thursday, September 23, 2010

FXB G13, 12-1:30pm

Lincoln Stein

Director, Informatics and Biological Computing Platform &
Senior Principal Investigator, Ontario Institute for Cancer Research 

“Interpreting Genome-Scale Data Using Reactome”


Reactome is a hand-curated knowledgebase of human biological processes and pathways that currently covers nearly 5000 genes. Using machine learning techniques, we have enhanced Reactome with high probability functional interactions to increase coverage to nearly half of the genome. This talk will describe how Reactome is built and maintained, and how it can be used to interpret genome-scale data from cancer and other diseases.

 

Tuesday May 11, 2010

FXB G12, 12-1:30pm

Atul Butte

Assistant Professor,
Stanford University School of Medicine

Director,
Center for Pediatric Bioinformatics,
Lucile Packard Children’s Hospital

 

"Personal Genomics: Exploring Systems Medicine Through Translational Bioinformatics"

 

Dr. Butte builds and applies tools that convert the billions of points of molecular, clinical, and epidemiological data measured by biomedical investigators and clinicians over the past decade into insights into diagnostic and therapeutic potential.  Dr. Butte will highlight how using publicly-available molecular data enables the discovery of new gene variants and biomarkers for diseases like diabetes, suggests novel roles for drugs in the treatment of disease, and for the first time allows us to probe the inner commonality across disease.  Dr. Butte will also discuss his recent publication in Lancet on the clinical evaluation of a patient presenting with a whole human genome sequence.

 

Tuesday April 13, 2010

FXB G12, 12-1:30pm

Rami Kantor
Assistant Professor of Medicine
Brown University

 

"HIV global diversity and the evolution of drug resistance"

 

The HIV pandemic is still devastating worldwide. Although antiretroviral therapy can prevent morbidity and prolong life, and immense efforts are ongoing globally to increase access to it, many people still die from this disease and many more continue to become infected. A major obstacle to the success of antiretroviral treatment is the evolution of resistance to HIV medications, which can reduce their efficacy and lead to treatment failure. The majority of our knowledge on HIV drug resistance, as well as the design of all approved HIV medications, is based on HIV variants that circulate in resource-rich settings, such as North America and Europe. Globally, however, significantly different HIV variants predominate. Sequencing of HIV drug targets, which allows identification of drug resistance mutations, is part of HIV clinical care and an intriguing research topic, encompassing many disciplines such as virology, bioinformatics, phylogenetics, statistics and clinical research. How diverse is HIV? Can the knowledge acquired on HIV drug resistance in resource-rich settings be applied to resource-limited settings? Are there differences in HIV drug resistance evolution among diverse variants? How significant are those differences? This talk will address those and other multidisciplinary issues.

 

 

Tuesday February 16, 2010

Kresge G3, 12-1:30pm

Owen White
Director of Bioinformatics for the School of Medicine
University of Maryland Baltimore

 

"Genome bioinformatics and its potential application to epidemiology and medicine"

 

The combination of genomic science and bioinformatics offers an unprecedented opportunity for infectious disease epidemiology. As one example, we have developed a comprehensive database that integrates epidemiological data with the annotated genomic sequence of microbial pathogens.  This system is a valuable tool for identifying pathogens meeting highly-specific disease criteria and for investigating the complex relationships between disease characteristics and genomic sequence for an organism or group of organisms.  It will also support the development of nucleotide and protein signature-based assays for the identification of pathogens or sets of pathogens. I will present general bioinformatics approaches that are used in the context of genomic information, with particular emphasis on their application to epidemiological research. Faculty, graduate students and staff are welcome to attend.

 

 

Tuesday December 8, 2009

FXB G12, 12-1:30pm

Ben Voight - Research Scientist,
Medical Population Genetics
The Broad Institute of Harvard and MIT

"The Genetic Basis of the Human Condition"

 

The advent of high-throughput technologies and the resulting amassed genetic data have intensified interest in a fundamental question in human biology: What are the genetic basis and mechanisms underlying phenotypic diversity in human populations? My active research focuses on two facets of this central question. First, how can we identify specific causal genes (and associated causal alleles) that contribute to common traits, e.g., metabolic disorders, in humans? In addressing this question, I will describe efforts in genome-wide studies and meta-analysis for type-2 diabetes and discuss follow-up work on identified regions using next-generation sequencing technology. I will also comment on extensive replication efforts and fine-mapping using custom arrays and additional sequencing studies. The second facet deals with the identification of genetic risk factors across multiple medically relevant traits. Can this commonality be understood, thereby providing biological insight into mechanisms underlying the traits? I will describe a pilot study and methodological work which has identified a set of risk factors contributing to susceptibility to multiple auto-immune traits and describe how this knowledge has informed our holistic understanding of the underlying biology. Finally, all genetic variation responsible for phenotypic diversity in humans has a population demographic and natural selective history. Using existing and next-generation data sets, a precise definition of the population dynamics for these variants is entirely obtainable. I will close with thoughts on experiments and designs toward this aim.

 

 

Tuesday November 10, 2009

FXB G12, 12-1:30pm

Fritz Roth - Associate Professor,
Tutor in Biochemical Sciences
Department of Biological Chemistry and Molecular Pharmacology
Harvard Medical School

 

"A Systems Genetics Sampler"

 

The talk will survey several topics:

  1. A quick update on large-scale quantitative function annotation for human genes;
  2. How a computational analysis of human 5'UTR introns led us to find that mRNAs encoding mitochondrial proteins use a non-canonical mRNA export pathway;
  3. Identifying synergistic drug combinations by mining genetic interaction networks; and
  4. Barcode Fusion Genetics (BFG), a new technology that identifies genetic interactions via large-scale parallel sequencing.

 

 

Friday October 23, 2009

FXB G11, 12-1:30pm

Susanna-Assunta Sansone -  Coordinator, with
Philippe Rocca-Serra, Technical Coordinator &
Eamonn Maguire, Software Engineer
The European Bioinformatics Institute
Cambridge, UK

"Standards and Infrastructure for Managing Experimental Metadata"

 

The presentation has a two-fold objective: to highlight the role of community-defined synergistic standards and to introduce the development of the Investigation, Study and Assay (ISA) Infrastructure. This promotes and enables uptake of the standards through the provision of a set of freely available tools and a database, facilitating and assisting in the reporting and management of experimental metadata from a variety of multi-omics studies. The ISA infrastructure's components are based on the ISA-Tab format, are designed for local use, and can work independently or as a unified system:

- ISAcreatorConfig, for curators or power users to regulate the fields displayed in the ISAcreator; i.e., declaring certain fields mandatory (http://www.mibbi.org) or mandating the use of a specific set of ontology terms (http://www.obofoundry.org).

- ISAcreator, a 'user-friendly' editor with which experimentalists can construct reports, edit experimental metadata and ultimately validate it based on the configuration specified;

- The BioInvestigation Index, a relational database for storing and browsing the experimental metadata (an example is running as public prototype at: http://www.ebi.ac.uk/bioinvindex);

- ISAconverter, to transform ISA-Tab metadata into formats for submission to ArrayExpress (MAGE-Tab), PRIDE (Pride-xml) or the European Nucleotide Read Archive (SRAxml).

Dawn D*, Sansone SA*, Collis A*, ... Rocca-Serra P et al. ‘Omics data sharing. Science 9 Oct 2009. Vol. 326. no. 5950, pp. 234 - 236. http://biosharing.org 

ISA software and contact: http://isatab.sourceforge.net 

This work is supported by funds from the EU (CarcinoGenomics, NuGO), EMBL-EBI, UK's NERC-NEBC and the BBSRC.

 

 

September 22, 2009

FXB G12, 12-1:30pm

Curtis Huttenhower, Ph.D. -  Assistant Professor
Computational Biology and Bioinformatics
Department of Biostatistics
Harvard School of Public Health

"Data mining for functional genomics and metagenomics"

 

Bioinformatics in the context of public health is needed at a wide range of biological scales: molecular data describing cellular function, population studies incorporating genomic data, and the systems biology tying together these extremes.  At all of these levels, the scale of available data is large; public repositories of genomic data currently contain billions of experimental results from a variety of assays. While modern search engines have organized the size and heterogeneity of other complex systems such as the Internet, it remains an open question how machine learning can be used to mine large genomic data collections for answers to specific biological questions.

Curtis will discuss two algorithmic approaches to large scale human genomic data integration, both of which leverage tens of thousands of datasets to predict interaction networks, disease linkages, and regulatory modules. He will also present preliminary results applying this methodology to study genetic and epigenetic variation in a  ~1,000-subject colorectal cancer cohort.  Finally, he will briefly discuss data integration in the context of metagenomics, the study of uncultured microorganisms from environmental samples.  This emerging data-rich field presents a unique opportunity to bring large scale data integration to bear, particularly in the context of human microflora and their impact on health within hosts and across populations.

 

 

May 19, 2009

Kornelia Polyak, M.D., Ph.D. - Associate Professor of Medicine
Department of Oncology
Harvard Medical School
Dana-Farber Cancer Institute

"Breast Tumor Evolution"

 

Breast tumors are heterogeneous and composed of a variety of cell types with distinct genetic, epigenetic, and phenotypic profiles. The molecular basis underlying this intra-tumoral heterogeneity is poorly defined. Models that attempt to explain this include genetic and epigenetic diversity and stem cell-like characteristics combined with environmental selection for the most favorable phenotypes. These ideas have been investigated for a long time both in human tumors and in various model systems, leading to the accumulation of numerous findings that are used to support one or the other. Increasing data suggest that the cancer stem cell phenotype may just be a consequence of genetic and epigenetic events that occur in tumor cells and that it may change as tumors evolve. This high degree of intra-tumoral heterogeneity poses a challenge for efficient cancer therapy and prevention of disease progression. Identifying the dependency of distinct tumor cell subpopulations on specific signaling pathways and developing combinations of agents selectively targeting each of these is likely to lead to improved clinical management of cancer patients.

  - Kornelia Polyak, Michail Shipitsin, Noga Qimron, Lauren L. Campbell

 

 

MARCH 17, 2009

Sir Richard J. Roberts, Ph.D. - 1993 Nobel Prize Laureate, Medicine
Chief Scientific Officer, New England Biolabs

"The Genomics of Restriction and Modification"


With more than 900 bacterial and archaeal genomes completely sequenced and the total sequence content of GenBank still growing exponentially, we can now gain some impression of the distribution of RM systems in the real world.  This has been accomplished by using computational analysis of these sequences to find genes or remnants of genes that show clear similarity to known restriction systems in REBASE.  This approach works well in identifying Type I and III systems, which show good conservation of sequence similarity.  For the Type II systems the V and C genes that accompany these systems are easily identified as are the methyltransferase genes.  However, the R genes are only detectable when they match known R genes of the identical or closely related specificity.  New R genes show up only as genes lying close to an M gene and themselves having no similarity to any other genes in GenBank. 

Surprisingly, these RM systems, or the relics of them, are much more abundant than might have been guessed from the classical biochemical screening of strains in the laboratory. In particular, Type I systems are widely distributed in Nature and many instances of solitary specificity subunits are found.  More than 400 potential Type III and 700 Type IV systems are found and on average about 4 DNA methyltransferase genes are found per genome.  Apparently solitary M genes, in which the R gene is either missing or non-functional, seem quite common.  However, our ability to identify M genes accurately is made difficult by the presence of conserved motifs in genes that methylate molecules other than DNA.  Analysis of the many environmental samples now appearing in GenBank suggests that the rate of evolution of both M and R genes is quite high and confirms previous findings that the direct cloning of intact RM systems into E. coli is quite difficult with current technology. Importantly, there is little reason to think that our current collection of more than 270 Type II specificities is more than a small sample of the specificities present in Nature.

New methods for predicting active restriction enzymes will be discussed as well as some new experimental approaches to testing the computational predictions.

 

 

February 24, 2009

Gabor T. Marth, D.Sc.

Assistant Professor
Department of Biology
Boston College

"Informatics Tools for Next-generation Sequence Analysis"

Next-generation sequencing technologies are now capable of producing tens of gigabases of useful data per machine run. This vast throughput led to the sequencing of several complete individual human genomes, and the 1000 Genomes Project is sequencing thousands more individuals. The primary utility of these datasets is to discover single-nucleotide polymorphisms (SNPs) and short insertion-deletions (INDELs) at single base pair resolution, and to map out structural variations (e.g. translocations, inversions) and copy number changes (e.g. deletions, duplications).

Current throughput is high enough, and cost low enough, to sequence smaller genomes with a fraction of a machine run’s worth of data. This enables whole-genome mutational profiling of model organisms and of pathogenic eukaryotes that are inaccessible with traditional genetics. Whole-genome mammalian resequencing at high (>25X) read coverage is still too costly for routine sequencing of thousands of samples, e.g. in a case-control association study. Various DNA capture methods now offer an alternative solution for mass-scale resequencing of targeted gene regions.

Because of the swift evolution of sequencing technologies and the rapid scale-up in data throughput, software tools for next-generation sequence analysis are still in a state of flux. As next-generation human resequencing becomes more routine there is a growing need for efficient software and well-defined analysis pipelines. We developed a complete suite of software tools for mammalian-scale variation discovery. (1) Our read mapper/aligner program, MOSAIK, works with either single-end or paired fragment-end reads from 454, Illumina, AB, and Helicos machines. (2) Our Bayesian SNP / short-INDEL discovery program, GIGABAYES, now has algorithms for accurate individual genotype calling based on the aligned reads. (3) We developed a new program, SPANNER, for detecting structural variation events from paired-end read map positions, and quantifying copy number from the depth of read coverage. (4) Our alignment viewer program, EAGLEVIEW, aids visual data validation and hypothesis generation.

This pipeline was applied for SNP allele calling and SV/CNV detection in the multi-individual human genome resequencing data generated by the 1000 Genomes Project, including exon capture data collected for ~1,000 human genes. Our current work focuses on developing methods for highly accurate mutational profiling applications, and on tailoring our tools for the analysis of expressed sequences in transcriptome sequencing projects.

 

 

December 16, 2008

Pierre R. Bushel
Yerby Visiting Assistant Professor of Bioinformatics
Department of Biostatistics
Department of Environmental Health

 

“Delineation of Perturbed Biological Systems that Govern Hepatotoxic Potential”


Exposure to hepatotoxicants, either from the environment, idiosyncratic drug responses or toxic doses of a chemical agent, is a major concern to human health and puts the public at risk.  Genomics has recently been used in an attempt to evaluate how environmental stressors affect cellular/tissue function and how changes in gene expression may relate to adverse effects.  We used a compendium of microarray gene expression data, derived from exposure of rats to hepatotoxicants, to identify a subset of genes that are perturbed preferentially from the toxic insult.  Using a variety of bioinformatics tools, computational algorithms and statistical methodologies we were able to discern key biological processes and molecular pathways that predict necrosis, and presumably govern the toxic responses in rat livers.  In addition, we were able to glean a more informative biological interpretation of surrogate (blood) genomic indicators in rats that conferred hepatotoxicity and permitted extrapolation to humans presented with acetaminophen intoxication or who exhibited an adverse response to supratherapeutic amounts of the analgesic/antipyretic medication.

 

 

OCTOBER 21, 2008

 

"Using Functional Information in Published Text to Interpret and Predict Genome-Wide Association Results"

 

 

                                                    

SEPTEMBER 15, 2008 - First Forum!

 

"Bioinformatics and Research at HSPH" 

 

Win Hide - Visiting Professor of Bioinformatics,
Department of Biostatistics, HSPH
"When is Bioinformatics more than just useful?"

John Quackenbush - Professor of Computational Biology and
Bioinformatics, HSPH and DFCI
"Integrative Approaches to Understanding Human Disease"

X. Shirley Liu - Associate Professor, Department of Biostatistical Science,
HSPH and DFCI
"Integrative modeling of transcription and epigenetic regulation"

Guocheng Yuan - Assistant Professor of Computational Biology and
Bioinformatics, Department of Biostatistics, HSPH
"A Genomic View of Epigenetic Regulation"

Special Bioinformatics Seminar

Friday April 30th 2010

 

Andrew Su
Associate Director, Bioinformatics
Genomics Institute of the Novartis Research Foundation

 

 

The Gene Wiki and BioGPS: Crowdsourcing for human gene annotation

 

The identification of all human genes through sequencing and assembly of the genome is largely complete.  Annotating the function of those genes is the next formidable challenge, and this process has only just begun.  Current gene annotation efforts largely rely on expert curation by centralized authorities.  Here, I will present two efforts to engage larger communities of scientists to collaboratively describe gene function.  The Gene Wiki applies this crowdsourcing model directly to the gene annotation process, and BioGPS uses the same principle to develop an extensible gene annotation database.