SOCR News MIDAS Biomedical Bootcamp 2021
SOCR News & Events: 2021 MIDAS Data Science for Biomedical Scientists Bootcamp
The Michigan Institute for Data Science (MIDAS) is organizing a week-long Data Science for Biomedical Scientists Bootcamp. This workshop introduces data science from a biomedical perspective. Participants will learn about practical data science applications through biomedical and health case studies. The training curriculum integrates modern data science, machine learning, artificial intelligence, and biostatistical methods.
Instructors
- Kayvan Najarian
- Nambi Nallasamy
- Ivo Dinov, University of Michigan, SOCR, MIDAS.
- Michael Mathis
- Ryan Stidham
- Jonathan Gryak
- Michael Sjoding
Workshop Logistics
- Dates/Times: Monday through Friday, July 26-30, 2021, 7:00-16:00 US-EDT (daily).
- Registration: Registration Link.
- URL: MIDAS Bootcamp Website.
- Session Format: Two daily sessions (4 hours each).
- Session URL.
Overview
- Target Audience: This workshop is open to all biomedical scientists. The curriculum is geared towards junior faculty members who plan to incorporate data science in their scholarly work.
- Prerequisite: College-level math and statistics.
- Main components:
  - Math and algorithmic foundations for data science
  - Key concepts of data science
  - Introduction to Python programming
  - Machine learning, support vector machines, artificial neural networks, deep learning
  - Examples of biomedical research projects with data science
  - Incorporating data science in biomedical grant proposals
Program Schedule
Day | Time | Instructor | Session Topic | Content |
---|---|---|---|---|
Monday | 7:00 - 8:30 AM | Kayvan Najarian | Session 1: Welcome and introduction to the program | A review of the program and logistics; Why data science, artificial intelligence, and machine learning? |
Monday | 8:30 - 8:45 AM | | Break | |
Monday | 8:45 - 10:15 AM | Ivo Dinov | Session 2: Math foundations I – Brief introduction to mathematical foundations of machine learning | Math notation and fundamentals; Linear Algebra and Matrix Computing; Optimization theory; Differential Equations; Calculus of Differentiation & Integration |
Monday | 10:15 - 10:30 AM | | Break | |
Monday | 10:30 AM - 12:00 PM | Ivo Dinov | Session 3: Math foundations II – Brief introduction to mathematical foundations of machine learning | Dimensionality; Principal Component Analysis (PCA); High-dimensional Visualization (hands-on demos) |
Monday | 12:00 - 1:00 PM | | Lunch Break | |
Monday | 1:00 - 2:30 PM | Kayvan Najarian | Session 4: Clustering vs Classification; k-means; k-Nearest Neighbors | Supervised & Unsupervised methods; k-means/Spectral/Hierarchical clustering (unsupervised); k-NN (supervised), Naïve Bayes classification |
Monday | 2:30 - 2:45 PM | | Break | |
Monday | 2:45 - 4:15 PM | TBA | Session 5: Introduction to Python programming | Basics of Python programming |
Tuesday | 7:00 - 8:30 AM | Ivo Dinov | Session 6: Linear regression, logistic regression | Simple linear regression, logit modeling; Ordinary least squares estimation; Example scenarios; Controlled feature selection (knockoff) |
Tuesday | 8:30 - 8:45 AM | | Break | |
Tuesday | 8:45 - 10:15 AM | Kayvan Najarian | Session 7: Simple classification methods and feature analysis | Naïve Bayes classification, Feature selection and reduction |
Tuesday | 10:15 - 10:30 AM | | Break | |
Tuesday | 10:30 AM - 12:00 PM | Kayvan Najarian | Session 8: Model validation and assessment | Metrics for assessment of model performance, n-fold cross validation |
Tuesday | 12:00 - 1:00 PM | | Lunch Break | |
Tuesday | 1:00 - 2:30 PM | Michael Mathis | Session 9: Using machine learning for clinical and health applications I | |
Tuesday | 2:30 - 2:45 PM | | Break | |
Tuesday | 2:45 - 4:15 PM | TBA | Session 10: Python programming for linear regression, logistic regression; ridge regression and Naïve Bayes | Python for applying simple machine learning methods to a clinical decision-making problem |
Wednesday | 7:00 - 8:30 AM | Kayvan Najarian | Session 11: Artificial neural networks I | Fundamentals of artificial neural networks and their advantages/limitations |
Wednesday | 8:30 - 8:45 AM | | Break | |
Wednesday | 8:45 - 10:15 AM | Kayvan Najarian | Session 12: Regression trees | Classification and regression tree (CART) |
Wednesday | 10:15 - 10:30 AM | | Break | |
Wednesday | 10:30 AM - 12:00 PM | Kayvan Najarian | Session 13: Random Forest | Ensemble use of regression trees in random forests and boosting methods |
Wednesday | 12:00 - 1:00 PM | | Lunch Break | |
Wednesday | 1:00 - 2:30 PM | Ryan Stidham | Session 14: Using machine learning for clinical and health applications II | |
Wednesday | 2:30 - 2:45 PM | | Break | |
Wednesday | 2:45 - 4:15 PM | TBA | Session 15: Python programming for neural networks, regression trees and random forest | Python for applying CART, random forest, and neural networks to a clinical decision-making problem |
Thursday | 7:00 - 8:30 AM | Kayvan Najarian | Session 16: Support vector machines | Using kernel methods for support vector machines (SVM) |
Thursday | 8:30 - 8:45 AM | | Break | |
Thursday | 8:45 - 10:15 AM | Jonathan Gryak | Session 17: Deep Learning I | Deep Learning overview, appropriate uses of deep learning, convolutional neural networks, U-Net |
Thursday | 10:15 - 10:30 AM | | Break | |
Thursday | 10:45 AM - 12:00 PM | Jonathan Gryak | Session 18: Deep Learning II | LSTM, Autoencoders |
Thursday | 12:00 - 1:00 PM | | Lunch Break | |
Thursday | 1:00 - 2:30 PM | TBA | Session 19: Python programming for support vector machine | Python for applying SVM to a clinical decision-making problem |
Thursday | 2:30 - 2:45 PM | | Break | |
Thursday | 2:45 - 4:15 PM | TBA | Session 20: Python programming for deep learning | Python for applying deep learning models to a clinical decision-making problem |
Friday | 7:00 - 8:30 AM | Kayvan Najarian | Session 21: Strategies to add a data science flavor to health-related projects and grant proposals | General tips on integrating data science ideas into primarily clinical/biomedical grant proposals |
Friday | 8:30 - 8:45 AM | | Break | |
Friday | 8:45 - 10:15 AM | Michael Sjoding | Session 22: Using machine learning for clinical and health applications III | |
Friday | 10:15 - 10:30 AM | | Break | |
Friday | 10:30 AM - 12:00 PM | Nambi Nallasamy | Session 23: Using machine learning for clinical and health applications IV | |
Friday | 12:00 - 1:00 PM | | Lunch Break | |
Friday | 1:00 - 2:30 PM | Michael Mathis | Session 24: Guidelines on using machine learning for clinical applications | |
Friday | 2:30 - 2:45 PM | | Break | |
Friday | 2:45 - 4:15 PM | Ivo Dinov, Jonathan Gryak, Michael Mathis, Kayvan Najarian, Nambi Nallasamy, and Michael Sjoding | Session 25: Wrap-up | Q&A; plans for follow-up sessions during the coming year |
Capstone Project
Interactive-learning (open-ended) project using a large Autism data tensor (n=1,098; k=2,145). Use the RMD source, the example HTML output, and the provided data to experiment with some of the DSPA techniques. Think of ways to augment these data (e.g., expand the time range and increase the feature richness).
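As a starting point for this kind of experimentation, the sketch below combines two techniques from the schedule: a k-NN classifier (Session 4) evaluated with n-fold cross validation (Session 8). It is an illustration only. The capstone materials themselves are provided as RMD (R Markdown), the data here are synthetic rather than the Autism dataset, and the function names (`make_synthetic`, `knn_predict`, `cross_validate`) and all parameter values are hypothetical choices made for this example.

```python
import random
from collections import Counter


def make_synthetic(n=200, k=5, seed=0):
    """Generate two Gaussian classes in k dimensions (a stand-in for real data)."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        # Class 1 is shifted by +2 in every feature relative to class 0.
        X.append([rng.gauss(2.0 * label, 1.0) for _ in range(k)])
        y.append(label)
    return X, y


def knn_predict(X_train, y_train, x, k_neighbors=5):
    """Majority vote among the nearest training points (squared Euclidean distance)."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(row, x)), label)
        for row, label in zip(X_train, y_train)
    )
    votes = Counter(label for _, label in dists[:k_neighbors])
    return votes.most_common(1)[0][0]


def cross_validate(X, y, n_folds=5, k_neighbors=5, seed=0):
    """Return the k-NN accuracy on each held-out fold."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    # Every n_folds-th shuffled index goes into the same fold.
    folds = [idx[f::n_folds] for f in range(n_folds)]
    accuracies = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        X_tr = [X[i] for i in train]
        y_tr = [y[i] for i in train]
        correct = sum(
            knn_predict(X_tr, y_tr, X[i], k_neighbors) == y[i] for i in held_out
        )
        accuracies.append(correct / len(held_out))
    return accuracies


X, y = make_synthetic()
scores = cross_validate(X, y)
print("fold accuracies:", [round(s, 3) for s in scores])
print("mean accuracy:", round(sum(scores) / len(scores), 3))
```

To adapt the sketch, replace `make_synthetic` with code that loads the provided data into a list of feature rows and a list of labels. With thousands of features, a dimensionality-reduction step such as PCA (Session 3) before the k-NN step would likely be worthwhile.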
Additional Resources
- DSPA Wikipedia.
- DSPA Appendices: Bayesian Simulation, Modeling and Inference » Information-Theoretic Foundation of Statistical Learning » Surface, Shape, and Manifold Representation and Visualization » Power Analysis in Experimental Design » Database SQL/NoSQL Queries & Google BigQuery » Image Convolution, Filtering, & Fourier Transform » Causality, Transfer Entropy, & Mechanistic Effects » Agent-based Reinforcement Learning.
- DSPA Springer Page & SpringerLink (PDF Download).
- dspa.predictive.space & DSPA MOOC Canvas Site.