Core Competencies

The program is designed to give students the essential skills and competencies they need to be key contributors to research projects involving the large, complex genomic datasets that are increasingly common across biomedical, biological, and public health research.

Biological Background

  1. Working knowledge of molecular genetics, including Mendelian inheritance, complex trait inheritance, and the essentials of DNA and its role in heredity
  2. Working knowledge of the Central Dogma of Molecular Biology (DNA->RNA->protein), including an understanding of transcription, splicing, and translation
  3. Working knowledge of feature organization within the genome (including genes and regulatory regions) as well as the challenges of gene finding
  4. Familiarity with processes that regulate gene expression and protein translation, such as transcription factors and miRNAs
  5. Familiarity with epigenetic regulation, including DNA methylation and histone modification
  6. Familiarity with gene functional descriptions (such as the Gene Ontology) and basic signal transduction and metabolic pathways
  7. Familiarity with modern genomic technologies, such as genotyping and gene expression arrays, genome-seq, exome-seq, RNA-seq, and ChIP-seq, and with their applications
  8. Understanding of metagenomics and its importance

Bioinformatics Background

  1. Familiarity with, and ability to use, the major genomics data resources, including those at NCBI, EBI, and UC Santa Cruz
  2. Understanding of sequence alignment algorithms
  3. Basic knowledge of gene finding methods and challenges
  4. Familiarity with gene functional annotation, including the Gene Ontology and pathway databases
  5. Ability to write simple scripts to download and transform data into useful formats
  6. Working knowledge of basic data analysis and data mining techniques, such as hierarchical clustering, k-means clustering, PCA, and SVD
  7. Working knowledge of basic statistical tests, including t-tests, ANOVA, linear modeling, Fisher’s exact test, Kolmogorov-Smirnov tests, and chi-squared tests, and of their applications
  8. Familiarity with Bayesian statistical approaches, MCMC, Gibbs sampling, and HMMs
  9. Familiarity with machine learning approaches, such as Bayesian networks, artificial neural networks, classification and regression trees, and genetic algorithms
  10. Familiarity with bootstrapping, jackknifing, sensitivity/recall and ROC curve analysis, and other empirical methods
  11. Familiarity with modern network theories, including scale-free network models and their measures

Computational Skills

  1. Working knowledge of UNIX
  2. Working knowledge of a scripting language such as Perl or Python
  3. Working knowledge of an advanced programming language such as C, C++, or Java
  4. Working knowledge of R/Bioconductor
  5. Familiarity with database programming and modern web technologies

Biostatistics Skills

  1. Fundamentals of experimental design
  2. Rates and proportions
  3. Parametric and non-parametric statistical methods
  4. Basic inference
  5. Regression and ANOVA
  6. Applied survival analysis
  7. Applied longitudinal analysis
  8. Bayesian statistical analysis

Epidemiology Skills

  1. Ability to critique the existing evidence on a particular research topic and to review and summarize information from many studies
  2. Ability to develop a research question, formulate study objectives, define a set of related specific aims, and write a research protocol for a given study question
  3. Ability to identify relevant ethical issues in a given study
  4. Ability to develop sampling procedures and to calculate sample size and power requirements
  5. Ability to identify methods of data collection appropriate to the study design and population
  6. Ability to design efficient data collection and data management procedures
  7. Ability to choose and use the techniques appropriate for estimation and hypothesis testing in selected situations
  8. Ability to perform data cleaning and data management operations to prepare for data analysis
  9. Familiarity with data cleaning and management techniques used to prepare unconventional data sources (such as large pharmacoepidemiology datasets, vital records, electronic medical records (EMR), and cohort data) for causal inference
  10. Familiarity with a comprehensive set of statistical methods suitable for a wide range of epidemiological situations; ability to summarize and present data in graphs and tables
  11. Familiarity with methods to assess and, where possible, correct for measurement error
  12. Familiarity with methods for handling missing data
  13. Ability to interpret the results of statistical procedures and draw appropriate conclusions