Course Description

Introduces the fundamentals of data science with specific application to biology. Through a practical, problem-based approach, students will examine the theory and practice underlying widely used computational methods in biology. They will develop mastery in the analysis and visualization of large data sets using Python, with applications to genomics, ecology, and other areas of biology. Students will test hypotheses, infer dataset parameters, and make predictions via broadly applicable data science tools.

Instructor

Timothy Warren
tim.warren AT oregonstate.edu

Course Assistants

Nathaniel Davidson (Head) davidson AT oregonstate.edu

Vini Karumuru karumurv AT oregonstate.edu

Divyansh (Divy) divyans AT oregonstate.edu

Syllabus

Weekly Calendar

Date, Topic, Relevant Reading, Assignment
Week 1
01/09, 01/11
Course Goals and Philosophy
Unix Shell Scripts
Introduction to Pandas     
Jupyter Notebook           
Unix Shell
Python Examples    
Pandas 10 min Reference
Pandas tutorial
HW 0 (for students who did not take BDS 310)
Due Fri 01/12
HW 1
Due Mon 01/22    
       
Week 2
01/16, 01/18
Analyzing Tabular Data with Pandas
Time Series and Visualization
HW 2
Due Mon 01/26
       
Week 3
01/23, 01/25
Pandas synthesis; Using version control to organize your work
matplotlib tutorial
Edward Tufte
Data Visualization textbook
HW 3
Due Mon 02/04
       
Week 4
01/30, 02/01
Version Control; Introduction to Random Processes
Inferential Thinking: Chapter 9, Randomness
Inferential Thinking: Chapter 10, Sampling
HW 4
Due Mon 02/11
       
Week 5
02/06, 02/08
Application of Random Processes: Permutation Testing
Inferential Thinking: Chapter 11, Testing Hypotheses
Chapter 12: Comparing Two Samples
Illustrated permutation test
HW 5
Due Mon 02/19
       
Week 6
02/13, 02/15
Resampling for hypothesis testing and estimation: the bootstrap
Chapter 12: Comparing Two Samples
Illustrated permutation test
Chapter 13: Estimation
Bootstrap schematic
News article on origin of bootstrap
HW 6
Due Mon 02/26
       
Week 7
02/20, 02/22
Testing and predicting relationships in data:
Regression and Correlation
In-class quiz - Date TBD
HW 7
Due Mon 03/04
       
Week 8
02/27, 02/29
Regression; Error-minimization for Model Fitting
HW 8
Due Mon 03/11
       
Week 9
03/05, 03/07
Introduction to Optimization and Machine Learning
HW 9
Due Tues 03/19
       
Week 10
03/12, 03/14
Putting it all together    


Link to hw04 directions: hw04_directions

Link to GitHub Pages website creation instructions: GH_pages

Link to GitHub Classroom assignment creation and cloning: GH_classroom