Ph.D. Track in Bioinformatics

Program Goals:

(1) To give the graduates of this program an adequate knowledge of statistical, computational and experimental methods, techniques and tools for the design and analysis of biological ‘omics’ (genomics, proteomics, metabolomics, phenomics) experiments, so that they may have successful academic careers in research institutions and universities or successful technical careers in industry.

(2) To train academic and industrial research statisticians who can manage and interpret biological ‘omics’ data and who can work with scientists from various disciplines related to Genetics, Bioinformatics and Biology.

Program Requirements:

The program starts with the current Statistics M.S. curriculum, which is designed to be finished in eighteen months. In addition, five courses are currently required in the core curriculum of the Bioinformatics track in Statistics: STAT 5504 Multivariate Statistical Methods, STAT 5444 Bayesian Statistics, STAT 5564 Statistical Genetics, GBCB 5314 Paradigms for Bioinformatics, and CSES 5844 Plant Genomics. If not all of these courses can be offered, the Director of Graduate Programs in Statistics can grant dispensation for substitute courses. In the first year, students in this track are required to take all the M.S. core courses in Statistics plus two bioinformatics core courses (students entering the program with an M.S. degree obtained elsewhere may take these later). All students in this track must pass the Statistics Qualifying Examinations at the Ph.D. level after the first year.

In addition to the Statistics M.S. core curriculum and the core curriculum of the Bioinformatics track in Statistics, the University requires all graduate students in the Bioinformatics track in Statistics to take two core curriculum courses plus the seminar in the University-wide Ph.D. program in Genetics, Bioinformatics and Computational Biology (GBCB):

(1) Two courses, one from each of two of the three secondary tracks*, for a Statistics student (2×3c)
(2) GBCB 5004 Seminar in Genetics, Bioinformatics and Computational Biology (1c)

* The secondary tracks for a Statistics student are Computer Science (CS), Mathematics (Math), and Life Sciences (LS). A student who has fulfilled the Bioinformatics track in Statistics core curriculum has also fulfilled the GBCB LS course requirement but still needs to fulfill the GBCB CS (or Math) requirement. This can be accomplished by choosing from the following courses with consent from the advisor:

  • CS 5114: Theory of Algorithms
  • CS 5124: Algorithms in Bioinformatics
  • CS/Math 5485: Numerical Analysis and Software
  • CS/Math 5486: Numerical Analysis and Software
  • CS 5614: Database Management Systems
  • CS 5804: Introduction to Artificial Intelligence
  • CS 6104: Algorithms in Structural Bioinformatics
  • CS 6104: Systems Biology and Drug Discovery
  • CS 6604: Data Mining
  • Math 5515/16: Continuous / Discrete Mathematical Models

A different, suitable course can be chosen with consent from the advisor.

Lastly, students in the Bioinformatics track in Statistics are required to take four 6000-level courses in Statistics. STAT 6114 (Advanced Inference) is required. The remaining three courses can currently be chosen, with consent from the advisor, from the following courses: STAT 6424 (Multivariate Statistical Analysis), STAT 6514 (Advanced Topics in Regression), STAT 6494 (Advanced Bayesian Statistics), STAT 6504 (Experimental Design and Analysis), STAT 6404 (Advanced Topics in Nonparametric Statistics), STAT 6414 (Time Series Analysis II), STAT 6105 (Measure and Probability).

Students in the Bioinformatics track are required to take only one section of STAT 5984 (Special Topics in Statistics) instead of two.

The proposal defense, based on the dissertation topic, is required. The final examination toward the doctorate is the oral defense of the dissertation. The topic(s) of the dissertation must be related to the Bioinformatics track and must be approved by the dissertation committee members. The committee should consist of five faculty members including at least one member from outside of the Statistics Department with expertise in Genetics, Bioinformatics and Computational Biology.

Duration of the program: Students should finish all of the coursework in four years, and they should expect to complete their research project and dissertation in the fifth year.

Example of a Bioinformatics Statistics Ph.D. Curriculum

First Year

Fall
  • STAT 5034 Inference Fundamentals (3c)
  • STAT 5044 Reg. and ANOVA (3c)
  • STAT 5104 Prob. and Dist. Theory (3c)
  • STAT 5014 Intro. to Stat. Prog. (1c)
  • GBCB 5314 Paradigms for Bioinformatics (3c)

Spring
  • STAT 5024 Stat. Consulting (2c)
  • STAT 5114 Stat. Inference (3c)
  • STAT 5124 Lin. Model Theory (3c)
  • STAT 5204 Exp. Design & Analysis (3c)

Qualifying Exams are to be taken after the Spring semester.

Second Year

Fall
  • STAT 5504 Multivariate Statistics (3c)
  • STAT 5444 Bayesian Statistics (3c)
  • STAT 5514 Regression Analysis (3c)
  • CSES/GBCB 5844 Plant Genomics (3c)

Spring
  • STAT 6114 Adv. Inference (3c)
  • STAT 5564 Statistical Genetics (3c)
  • STAT 6424 Advanced Multivariate (3c)
  • STAT 6494 Advanced Bayesian or CS/Math/LS core (3c)

Summer
  • Dissertation research proposal preparation

Third Year

Fall
  • CS/Math/LS core or Stat/CS/Math/GBCB elective (3c)
  • STAT 5984 Special Topics (1c)
  • GBCB 5004 Seminar (1c)
  • STAT 7994 Research & Dissertation

Spring
  • STAT 6514 Adv. Topics in Reg. (3c)
  • STAT 6494 Advanced Bayesian, or CS/Math/LS core, or Stat/CS/Math/GBCB elective (3c)
  • STAT 7994 Research & Dissertation

Summer
  • STAT 7994 Research & Dissertation

Fourth Year

Fall
  • STAT 7994 Research & Dissertation

Spring
  • STAT 7994 Research & Dissertation

Summer
  • STAT 7994 Research & Dissertation