SOCR News ISI WSC DSPA Training 2021
SOCR News & Events: 2021 ISI/WSC Training and Education Bootcamp on Data Science and Predictive Analytics (DSPA)
- Date/Time: Wednesday & Thursday, June 16-17, 2021, 14:00-17:00 Central European Summer Time (CEST, UTC+2), 8:00-11:00 AM US-EDT.
- Registration: Registration Link (moderate registration fees apply).
- GoToMeeting: Webinar link.
- URL: Official ISI/WSC Course Website.
- Conference: 2021 ISI World Statistical Congress and WSC 2021 short courses.
- Session Format: Two sessions, one per day (3 hours each).
- Session URL: https://myumi.ch/erXm2.
This course will be based on a Data Science and Predictive Analytics (DSPA) course I teach at the University of Michigan. The training will provide intermediate to advanced learners with a solid data science foundation to address challenges related to collecting, managing, processing, interrogating, analyzing and interpreting complex health and biomedical datasets using R. Participants will gain skills and acquire a tool-chest of methods, software tools, and protocols that can be applied to a broad spectrum of Big Data problems.
|Algorithms and Applications||Tools||Working knowledge of basic software tools (command-line, GUI based, or web-services)|
|Data Management||Data validation & visualization||Curation, Exploratory Data Analysis (EDA) and visualization|
| ||Data wrangling||Skills for data normalization, data cleaning, data aggregation, and data harmonization/registration|
| ||Data infrastructure||Handling databases, web-services, Hadoop, multi-source data||Data structures, SOAP protocols, ontologies, XML, JSON, streaming|
|Analysis Methods||Statistical inference|
| ||Study design and diagnostics|
- Foundations of R
- Managing Data in R
- Data Visualization
- Linear Algebra & Matrix Computing
- Dimensionality Reduction
- Lazy Learning: Classification Using Nearest Neighbors
- Probabilistic Learning: Classification Using Naive Bayes
- Decision Tree Divide and Conquer Classification
- Forecasting Numeric Data Using Regression Models
- Black Box Machine-Learning Methods: Neural Networks and Support Vector Machines
- Apriori Association Rules Learning
- k-Means Clustering
- Model Performance Assessment
- Improving Model Performance
- Specialized Machine Learning Topics
- Variable/Feature Selection
- Regularized Linear Modeling and Controlled Variable Selection
- Big Longitudinal Data Analysis
- Natural Language Processing/Text Mining
- Prediction and Internal Statistical Cross Validation
- Function Optimization
- Deep Learning, Neural Networks
- Welcome and introductions
- Data manipulation and visualization
- Non-linear dimensionality reduction (UMAP & t-SNE)
- Reticulation (Interoperability between R, Python, C/C++ and other languages)
- Role of optimization in AI/ML
- Activities and HTML5 demos.
|Wednesday, June 16, 2021, 8:00-11:00 AM US-EDT||Thursday, June 17, 2021, 8:00-11:00 AM US-EDT|
|Welcome||Review of Day 1|
|DSPA Summer Course Overview (ISI/WSC, prereqs, vision, objectives, outcomes, Website)||Questions, comments, issues?|
|Introductions (Instructor: Ivo Dinov; Attendees: please post in Chat/Discussion-Forum: participant's name, affiliation, title, interests, and "one fun fact about you")||Supervised AI|
|Expectations and optional capstone project (below)||Baseball players physique modeling|
|k-NN prediction of galaxy spin|
|Open Science: It’s online, therefore it exists!||Model-free|
|Download DSPA Textbook (free)||Estimate the square root function using NN|
|Resource Search & Navigation, Language Translations|
|NN Google Trends and the Stock Market|
|Motivation and the 7D of Big Data||Unsupervised AI|
|Digitalization of all human experiences||Classification and clustering (k-Means, spectral, hierarchical)|
|Responsible Data Science/Ethical Predictive Analytics||Hot-dogs example|
|R vs. Python vs. SAS vs. SPSS vs. other SW||Silhouette plots|
|Confirm local installations of R & RStudio||Pediatric trauma clustering study|
|Example Demo (requires knitr package)|
|Chapter 4 RMD Source, HTML output, SOCR_Header|
|5-min Break||5-min Break|
|Reticulation (interoperability between R, Python, C/C++ and other languages)|
|Data manipulation: import/export, EM imputation, webpage scraping, sample statistics (moments)||Text modeling & NLP (sentiment analysis example)|
|Probability Distributions: Distributome, TVN Webapp||Longitudinal data analysis (Google trends analytics)|
|Linear PCA: 2D --> 1D example, PPMI (Parkinson's disease) example|
|5-min Break||5-min Break|
|Non-linear: MNIST data OCR: UMAP OCR, t-SNE OCR||Role of optimization in AI/ML (Healthcare manufacturer product optimization example)|
|SOCR/Tensorboard/Projector UKBB Brain Study||Deep neural networks (image-classification example)|
|Demonstrations of interesting Capstone project results|
|Open discussion||Open discussion|
- Course Flyer.
- 1-page Course Coverage with dynamic links to content.
- DSPA Wikipedia.
- DSPA Springer Page & SpringerLink (PDF Download).
- dspa.predictive.space & DSPA MOOC Canvas Site.