February 3, 2006
Mini-Courses in Bioinformatics to Begin This Month

HSPH Associate Professor Dieter Wolf studies the mechanisms and functions of proteins, seeking to understand patterns of expression. Part of his work is tedious and time-consuming: information about a protein gets plugged into an online database to see if it falls into any established functional groups. But he recently attended a discussion organized by the Bioinformatics Core Group at HSPH that may ease that task. Jill Mesirov of the Broad Institute of MIT and Harvard described a free software application suite for the analysis of gene expression. The software may be able to cluster proteins by function, as it does genes. Wolf expects to meet with Mesirov again to explore the possibilities.

"The opportunity seems interesting," said Wolf. "We are struggling with the issue of making sense of entire protein expression signatures-not just single proteins-without having sophisticated enough tools and databases to link the signatures to actual phenotypes."

Wolf's situation exemplifies one goal of the Bioinformatics Core Group: to give HSPH members ample opportunities to discuss quantitative ideas, to learn about using bioinformatics in their research, and to form collaborations within the School and with outside groups.

The core was launched in 2004 in response to the need for improved analysis of the massive quantities of data produced by endeavors such as the Human Genome Project. It offers monthly forums, tutorials, workshops, a consulting service, and guidance on methodologies in the field.

When the Core launched, L.J. Wei, professor of biostatistics and a member of the Core's advisory group, explained to HPH NOW: "One of the purposes of bioinformatics is to reduce the number of experiments that need to be done to achieve reliable information. However, an issue right now is that there are huge data sets that can be run through different kinds of software programs, ending up with many data points. Unless we understand and use bioinformatics well, we may not even know which of those data points are important."

Recently, eight small grants were provided from Dean's Initiative Funds to promote bioinformatics projects and collaborations at the School. See box below for examples of recipients.

This spring, the Bioinformatics Core Group will also present free mini-courses, each designed to explore a specific topic far more intensively than a single presentation can. A mini-course may run for a few hours spread over several days. See box below.

The first one will start on February 10 and will focus on machine learning methods in high-throughput biological data analysis. Machine learning refers to methods that let a system acquire and integrate knowledge autonomously. This capacity to learn from experience, analytical observation, and other means results in a system that can continuously improve itself and thereby become more efficient and effective. The algorithms are designed to "learn" from experience and to dig knowledge out of noisy data, such as the data produced by DNA microarray experiments.
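
To make the idea of learning from noisy data concrete, here is a minimal sketch in Python of k-nearest-neighbor classification, one of the methods on the mini-course agenda. The expression values, the class names "normal" and "tumor," and the function names are all hypothetical, invented for illustration; they are not taken from the course materials.

```python
import random
from collections import Counter


def sq_dist(a, b):
    """Squared Euclidean distance between two expression profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def knn_predict(train_x, train_y, profile, k=3):
    """Label a profile by majority vote among its k closest training profiles."""
    ranked = sorted(range(len(train_x)), key=lambda i: sq_dist(train_x[i], profile))
    votes = Counter(train_y[i] for i in ranked[:k])
    return votes.most_common(1)[0][0]


def make_toy_data(n_per_class, rng):
    """Hypothetical data: 4 'genes'; genes 0 and 1 run higher in 'tumor' samples.

    Gaussian noise stands in for measurement noise in a microarray experiment.
    """
    base = {"normal": [1.0, 1.0, 1.0, 1.0], "tumor": [3.0, 3.0, 1.0, 1.0]}
    samples, labels = [], []
    for label, means in base.items():
        for _ in range(n_per_class):
            samples.append([m + rng.gauss(0.0, 0.5) for m in means])
            labels.append(label)
    return samples, labels


rng = random.Random(0)  # fixed seed so the run is repeatable
train_x, train_y = make_toy_data(20, rng)  # the "experience" the method learns from
test_x, test_y = make_toy_data(20, rng)    # new samples it has not seen

predictions = [knn_predict(train_x, train_y, p) for p in test_x]
accuracy = sum(p == t for p, t in zip(predictions, test_y)) / len(test_y)
print(f"held-out accuracy: {accuracy:.2f}")
```

The script prints the fraction of unseen samples that were labeled correctly. Nothing about the two classes is hard-coded into `knn_predict`; what it "knows" comes entirely from the labeled training samples, which is the sense in which such algorithms learn from experience.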

The Bioinformatics Core Group will present three other mini-courses in March, April, and May, on SNP analysis, protein-protein interaction and gene networks, and proteomic mass spectrometry. The core is open to suggestions for other mini-course topics.

"The field of bioinformatics is advancing rapidly and impacts a broad range of research," said Xin Lu, a research scientist in the Department of Biostatistics who coordinates the core. "We want members of the HSPH community to come to us, learn about the amazing opportunities that may exist for them to collaborate with others, and brainstorm about ways we can help them take advantage of this cutting-edge field."

To learn more, visit http://www.hsph.harvard.edu/bioinfocore/. To subscribe to the mailing list, email xinlu@hsph.harvard.edu or go to http://hsphsun3.harvard.edu/mailman/listinfo/binfcore.

Attend Mini-Courses on Bioinformatics Starting This Month

The Bioinformatics Core is pleased to present the first in a series of spring mini-courses. Professor Xuegong Zhang from Tsinghua University, China, will present a series on "Machine Learning Methods in High-Throughput Biological Data Analysis" as follows:

  • Friday, February 10, 1:30-3:30pm (Kresge G3): Overview of machine learning and pattern recognition methods and their applications, including hierarchical clustering, Fisher's LDA, nearest neighbor, K-means clustering, and the perceptron
  • Friday, February 17, 1:30-3:30pm (Kresge G3): Detailed discussion of unsupervised and supervised learning and of feature selection, including artificial neural networks (MLP and SOM), support vector machines and statistical learning theory, and recursive feature selection and classification
  • Friday, February 24, 1:30-3:30pm (Kresge LL-6): Hands-on computer lab training on "R" and machine learning algorithms. Basic introduction to the "R" environment and Bioconductor packages, and the use of hierarchical clustering, SOM, LDA, KNN, SVM, and R-SVM for microarray data analysis
  • RSVP to xinlu@hsph.harvard.edu for these sessions. There is limited space available in the computer lab.

    The tentative schedule of mini-courses for the rest of the spring is:

    March: SNPs
    April: Protein-Protein Interaction
    May: Mass Spectrometry

    A Sampling of HSPH Bioinformatics Projects

    "Using EM Algorithm and Bayesian Approach to Infer Copy Numbers from SNP Arrays"

    "Proteomics and Bioinformatics Approaches for Identification and Verification of Serum Biomarkers to Arsenic and Lead Mixture Exposure"

    "Bioinformatic Database for Radiation Genomics"