Difference between revisions of "SMHS LinearModeling MachineLearning"
Revision as of 10:36, 4 March 2016
SMHS Linear Modeling - Machine Learning Algorithms
Scientific inference based on fixed and random effect models, assumptions, and mixed effects logistic regression.
Questions:
- How can we combine human intuition with computer-generated results to obtain a reliable, effective, and efficient decision-support system that facilitates forecasting?
- Niels Bohr – “It is difficult to make predictions, especially about the future” …
- Can we classify the data in an unsupervised manner?
Prediction
For most of the machine learning algorithms (including first-order linear regression), we:
- first generate the model using training data, and then
- predict values for test/new data.
Predictions are made using the R predict function (type ?predict.name, where name is the function name corresponding to the algorithm, e.g., ?predict.lm). The first argument of predict is typically the fitted model object, and the second argument is a matrix or data frame of test data to which the model is applied. predict can be called in two ways: as the generic predict, which dispatches on the class of the model, or directly as predict.name.
Example:
# Load the data (uncomment one of the following):
# mydata <- read.table('https://umich.instructure.com/files/330381/download?download_frd=1&verifier=HpfmjfMFaMsk7rIpfPx0tmz960oTW7JA8ZonGvVC', as.is=TRUE, header=TRUE) # 01a_data.txt
# mydata <- read.table('data.txt', as.is=TRUE, header=TRUE)

# (1) First, there are different approaches to split (partition) the data into
# training and testing sets.
## TRAINING: 75% of the sample size
sample_size <- floor(0.75 * nrow(mydata))
## set the seed to make your partition reproducible
set.seed(1234)
train_ind <- sample(seq_len(nrow(mydata)), size = sample_size)
train <- mydata[train_ind, ]
# TESTING DATA
test <- mydata[-train_ind, ]

# (2) Fit the model on the training set and predict on the testing set
lin.mod <- lm(Weight ~ Height*Team, data=train)
predicted.values <- predict(lin.mod, newdata=test)
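The split-fit-predict steps above can be sketched end to end on simulated data. This is only an illustration: the data frame players and its coefficients are made up for the sketch (an assumption, not the course data set), and the root-mean-square error on the held-out rows is one possible way to summarize prediction quality.

```r
# Sketch: 75/25 train/test split, linear model, and out-of-sample error.
# 'players' is a synthetic stand-in for mydata (assumption, not the course data).
set.seed(1234)
n <- 200
players <- data.frame(
  Height = rnorm(n, mean = 73, sd = 2.5),
  Team   = factor(sample(c("T1", "T2", "T3"), n, replace = TRUE))
)
players$Weight <- -150 + 4.8 * players$Height + rnorm(n, sd = 15)

# 75% of the rows go to training, the rest to testing
sample_size <- floor(0.75 * nrow(players))
train_ind   <- sample(seq_len(nrow(players)), size = sample_size)
train <- players[train_ind, ]
test  <- players[-train_ind, ]

lin.mod <- lm(Weight ~ Height * Team, data = train)

# predict() dispatches on the class of the model, so for an lm object
# predict(lin.mod, ...) and predict.lm(lin.mod, ...) give the same result
predicted.values <- predict(lin.mod, newdata = test)
same <- all.equal(predicted.values, predict.lm(lin.mod, newdata = test))

# root-mean-square error on the held-out rows
rmse <- sqrt(mean((test$Weight - predicted.values)^2))
print(c(n_train = nrow(train), n_test = nrow(test), RMSE = rmse))
```

Comparing the RMSE on test with the residual standard error reported by summary(lin.mod) gives a rough indication of whether the model overfits the training rows.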
Data Modeling/Training
Logistic Regression:
glm_model <-glm(ifelse(Weight > 200,1,0) ~ Height*Team, family=binomial(link="logit"), data=train)
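A fitted logistic model like the one above returns probabilities when predict is called with type="response"; a cutoff then turns them into class labels. The sketch below uses simulated data (the data frame players, the indicator Heavy, and the 0.5 cutoff are assumptions for illustration) and, to stay short, uses Height alone as the predictor.

```r
# Sketch: logistic regression, predicted probabilities, and a confusion table.
# All data are simulated; 'Heavy' plays the role of ifelse(Weight > 200, 1, 0).
set.seed(1)
n <- 300
players <- data.frame(Height = rnorm(n, mean = 73, sd = 2.5))
players$Weight <- -150 + 4.8 * players$Height + rnorm(n, sd = 15)
players$Heavy  <- ifelse(players$Weight > 200, 1, 0)

train_ind <- sample(seq_len(n), size = floor(0.75 * n))
train <- players[train_ind, ]
test  <- players[-train_ind, ]

glm_model <- glm(Heavy ~ Height, family = binomial(link = "logit"), data = train)

# type = "response" returns P(Heavy = 1); the default type returns log-odds
prob <- predict(glm_model, newdata = test, type = "response")
pred <- ifelse(prob > 0.5, 1, 0)   # 0.5 is an arbitrary illustrative cutoff

conf_mat <- table(Predicted = factor(pred, levels = 0:1),
                  Observed  = factor(test$Heavy, levels = 0:1))
accuracy <- mean(pred == test$Heavy)
print(conf_mat)
print(accuracy)
```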
K-Means Clustering
train.1 <- cbind(train$Height, train$Weight, train$Age)
test.1 <- cbind(test$Height, test$Weight, test$Age)
Weight.1 <- ifelse(train$Weight > 200,1,0)
head(train.1)
kmeans_model <- kmeans(train.1, 3)
plot(train.1, col = kmeans_model$cluster)
# one color per cluster center (3 centers)
points(kmeans_model$centers, col = 1:3, pch = 8, cex = 2)
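Because k-means relies on Euclidean distances and random starting centers, it is often helpful to standardize the features and to request several random starts. The sketch below illustrates this on simulated two-column data; the matrix feat, the use of scale, and nstart = 20 are assumptions of this sketch rather than part of the example above.

```r
# Sketch: k-means with standardized features and multiple random starts.
# 'feat' is simulated (three loosely separated groups), not the course data.
set.seed(42)
feat <- rbind(
  cbind(rnorm(50, 70, 1), rnorm(50, 180, 5)),
  cbind(rnorm(50, 74, 1), rnorm(50, 205, 5)),
  cbind(rnorm(50, 78, 1), rnorm(50, 230, 5))
)
colnames(feat) <- c("Height", "Weight")

# put both columns on the same scale so neither dominates the distance
feat.scaled <- scale(feat)

# nstart = 20 keeps the best of 20 random initializations
kmeans_model <- kmeans(feat.scaled, centers = 3, nstart = 20)

print(kmeans_model$size)      # number of points assigned to each cluster
print(kmeans_model$centers)   # cluster centers in standardized units
print(kmeans_model$tot.withinss / kmeans_model$totss)  # share of variation left within clusters
```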
K-Nearest Neighbor Classification
# install.packages("class")
library("class")
knn_model <- knn(train=train.1, test=test.1, cl=as.factor(Weight.1), k=5)
plot(knn_model)
summary(knn_model)
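The labels returned by knn can be compared with the known classes of the test rows in a confusion table. The sketch below uses simulated data (all variable names and values are assumptions for illustration). Unlike the call above, it leaves Weight out of the feature matrix, because the class label is derived from Weight and including it would make the task trivial.

```r
# Sketch: k-nearest-neighbor classification with a held-out confusion table.
# Data are simulated; the label is 1 when Weight > 200, as in the text.
library("class")

set.seed(7)
n <- 200
Height <- rnorm(n, mean = 73, sd = 2.5)
Age    <- runif(n, min = 20, max = 40)
Weight <- -150 + 4.8 * Height + rnorm(n, sd = 10)
label  <- factor(ifelse(Weight > 200, 1, 0), levels = c(0, 1))

# standardized features; Weight is excluded since it defines the label
X <- scale(cbind(Height, Age))

train_ind <- sample(seq_len(n), size = floor(0.75 * n))
knn_pred <- knn(train = X[train_ind, ], test = X[-train_ind, ],
                cl = label[train_ind], k = 5)

conf_mat <- table(Predicted = knn_pred, Observed = label[-train_ind])
accuracy <- sum(diag(conf_mat)) / sum(conf_mat)
print(conf_mat)
print(accuracy)
```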
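Naïve Bayes Classifier
# install.packages("e1071")
library("e1071")
# naiveBayes does not accept interaction terms and needs a categorical response,
# so the predictors are passed as a data frame and the class as a factor
nbc_model <- naiveBayes(x=data.frame(Height=train$Height, Age=train$Age), y=as.factor(Weight.1))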
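To make explicit what a naïve Bayes classifier computes for numeric predictors, the sketch below writes out a Gaussian version in base R: class priors, per-class means and standard deviations, and a log-posterior score per class. It runs on simulated data and is a hand-written illustration under those assumptions, not the e1071 implementation.

```r
# Sketch: Gaussian naive Bayes written out in base R (illustration only).
set.seed(3)
n <- 200
Height <- rnorm(n, mean = 73, sd = 2.5)
Age    <- runif(n, min = 20, max = 40)
Weight <- -150 + 4.8 * Height + rnorm(n, sd = 10)
dat <- data.frame(Height, Age,
                  Heavy = factor(ifelse(Weight > 200, "yes", "no")))

train_ind <- sample(seq_len(n), size = floor(0.75 * n))
train <- dat[train_ind, ]
test  <- dat[-train_ind, ]
features <- c("Height", "Age")
classes  <- levels(train$Heavy)

# Fit: class priors, plus mean and sd of every feature within each class
prior <- sapply(classes, function(k) mean(train$Heavy == k))
mu    <- sapply(features, function(f) tapply(train[[f]], train$Heavy, mean))
sigma <- sapply(features, function(f) tapply(train[[f]], train$Heavy, sd))

# Predict: log prior + sum of log normal densities, treating the
# features as independent within each class (the "naive" assumption)
log_post <- sapply(classes, function(k) {
  ll <- rowSums(sapply(features, function(f)
    dnorm(test[[f]], mean = mu[k, f], sd = sigma[k, f], log = TRUE)))
  log(prior[[k]]) + ll
})

# pick the class with the largest score for every test row
nb_pred  <- factor(classes[apply(log_post, 1, which.max)], levels = classes)
accuracy <- mean(nb_pred == test$Heavy)
print(table(Predicted = nb_pred, Observed = test$Heavy))
print(accuracy)
```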
Decision Trees (CART)
# install.packages("rpart")
library("rpart")
# classify the binary indicator Weight.1 (Weight > 200); Height and Age are taken from train
cart_model <- rpart(Weight.1 ~ Height+Age, data=train, method="class")
plot(cart_model)
text(cart_model)
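Once a classification tree is fitted, predict with type="class" returns the predicted label for new rows, and printcp summarizes the complexity table of the tree. The sketch below shows both on simulated data; the data frame dat and the factor Heavy are assumptions made for this illustration.

```r
# Sketch: classification tree with rpart, evaluated on held-out rows.
# Data are simulated; 'Heavy' stands in for the Weight > 200 indicator.
library("rpart")

set.seed(11)
n <- 300
Height <- rnorm(n, mean = 73, sd = 2.5)
Age    <- runif(n, min = 20, max = 40)
Weight <- -150 + 4.8 * Height + rnorm(n, sd = 10)
dat <- data.frame(Height, Age,
                  Heavy = factor(ifelse(Weight > 200, "yes", "no")))

train_ind <- sample(seq_len(n), size = floor(0.75 * n))
train <- dat[train_ind, ]
test  <- dat[-train_ind, ]

cart_model <- rpart(Heavy ~ Height + Age, data = train, method = "class")

# type = "class" returns a factor of predicted labels
cart_pred <- predict(cart_model, newdata = test, type = "class")
conf_mat  <- table(Predicted = cart_pred, Observed = test$Heavy)
accuracy  <- mean(cart_pred == test$Heavy)

printcp(cart_model)   # complexity parameter table for the fitted tree
print(conf_mat)
print(accuracy)
```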
AdaBoost
# install.packages("ada")
# x is the matrix of features, and y is a vector of 0-1 class labels.
library("ada")
boost_model <- ada(x= cbind(train$Height, train$Weight, train$Age), y= Weight.1)
plot(boost_model)
boost_model
Support Vector Machines (SVM)
# install.packages("e1071")
library("e1071")
svm_model <- svm(x= cbind(train$Height, train$Weight, train$Age), y=as.factor(Weight.1),
kernel ="radial")
summary(svm_model)
Appendix
Example 1: Simulation (subject, day, treatment, observation)
Obs ~ Treatment + Day + Subject(Treatment)+ Day*Subject(Treatment)+ ε.
This model accounts for:
Response = Obs
Fixed effects:
Treatment (fixed)
Day (fixed)
Treatment*Day interaction
Random Effects:
Subject nested within Treatment (random)
Day crossed with "Subject within Treatment" (random)
mydata <- data.frame(
Subject = c(13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 29, 30, 31, 32, 33,
34, 35, 36, 37, 38, 39, 40, 62, 63, 64, 65, 13, 14, 15, 16, 17, 18,
19, 20, 21, 22, 23, 24, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39,
40, 62, 63, 64, 65, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24,
29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 62, 63, 64, 65),
Day = c(rep(c("Day1", "Day3", "Day6"), each=28)),
Treatment = c(rep(c("B", "A", "C", "B", "C", "A", "A", "B", "A", "C", "B", "C",
"A", "A", "B", "A", "C", "B", "C", "A", "A"), each = 4)),
Obs = c(6.472687, 7.017110, 6.200715, 6.613928, 6.829968, 7.387583, 7.367293,
8.018853, 7.527408, 6.746739, 7.296910, 6.983360, 6.816621, 6.571689,
5.911261, 6.954988, 7.624122, 7.669865, 7.676225, 7.263593, 7.704737,
7.328716, 7.295610, 5.964180, 6.880814, 6.926342, 6.926342, 7.562293,
6.677607, 7.023526, 6.441864, 7.020875, 7.478931, 7.495336, 7.427709,
7.633020, 7.382091, 7.359731, 7.285889, 7.496863, 6.632403, 6.171196,
6.306012, 7.253833, 7.594852, 6.915225, 7.220147, 7.298227, 7.573612,
7.366550, 7.560513, 7.289078, 7.287802, 7.155336, 7.394452, 7.465383,
6.976048, 7.222966, 6.584153, 7.013223, 7.569905, 7.459185, 7.504068,
7.801867, 7.598728, 7.475841, 7.511873, 7.518384, 6.618589, 5.854754,
6.125749, 6.962720, 7.540600, 7.379861, 7.344189, 7.362815, 7.805802,
7.764172, 7.789844, 7.616437, NA, NA, NA, NA))
# install.packages("lme4")
library("lme4")
m1 <- lmer(Obs ~ Treatment * Day + (1 | Subject), mydata)
m1
Linear mixed model fit by REML ['lmerMod']
Formula: Obs ~ Treatment * Day + (1 | Subject)
Data: mydata
REML criterion at convergence: 56.8669
Random effects:
| Groups | Name | Std. Dev. |
| Subject | (Intercept) | 0.2163 |
| Residual | | 0.2602 |
Number of obs: 80, groups: Subject, 28
Fixed Effects:
| (Intercept) | TreatmentB | TreatmentC |
| 7.1827 | -0.6129 | 0.1658 |
| DayDay3 | DayDay6 | TreatmentB:DayDay3 |
| 0.2446 | 0.4507 | -0.1235 |
| TreatmentC:DayDay3 | TreatmentB:DayDay6 | TreatmentC:DayDay6 |
| Index | Subject | Day | Treatment | Obs |
| 1 | 13 | Day1 | B | 6.472687 |
| 2 | 14 | Day1 | B | 7.01711 |
| 3 | 15 | Day1 | B | 6.200715 |
| 4 | 16 | Day1 | B | 6.613928 |
| 5 | 17 | Day1 | A | 6.829968 |
| 6 | 18 | Day1 | A | 7.387583 |
| 7 | 19 | Day1 | A | 7.367293 |
| 8 | 20 | Day1 | A | 8.018853 |
| 9 | 21 | Day1 | C | 7.527408 |
| 10 | 22 | Day1 | C | 6.746739 |
| 11 | 23 | Day1 | C | 7.29691 |
| 12 | 24 | Day1 | C | 6.98336 |
| 13 | 29 | Day1 | B | 6.816621 |
| 14 | 30 | Day1 | B | 6.571689 |
| 15 | 31 | Day1 | B | 5.911261 |
| 16 | 32 | Day1 | B | 6.954988 |
| 17 | 33 | Day1 | C | 7.624122 |
| 18 | 34 | Day1 | C | 7.669865 |
| 19 | 35 | Day1 | C | 7.676225 |
| 20 | 36 | Day1 | C | 7.263593 |
| 21 | 37 | Day1 | A | 7.704737 |
| 22 | 38 | Day1 | A | 7.328716 |
| 23 | 39 | Day1 | A | 7.29561 |
| 24 | 40 | Day1 | A | 5.96418 |
| 25 | 62 | Day1 | A | 6.880814 |
| 26 | 63 | Day1 | A | 6.926342 |
| 27 | 64 | Day1 | A | 6.926342 |
| 28 | 65 | Day1 | A | 7.562293 |
| 29 | 13 | Day3 | B | 6.677607 |
| 30 | 14 | Day3 | B | 7.023526 |
| 31 | 15 | Day3 | B | 6.441864 |
| 32 | 16 | Day3 | B | 7.020875 |
| 33 | 17 | Day3 | A | 7.478931 |
| 34 | 18 | Day3 | A | 7.495336 |
| 35 | 19 | Day3 | A | 7.427709 |
| 36 | 20 | Day3 | A | 7.63302 |
| 37 | 21 | Day3 | C | 7.382091 |
| 38 | 22 | Day3 | C | 7.359731 |
| 39 | 23 | Day3 | C | 7.285889 |
| 40 | 24 | Day3 | C | 7.496863 |
| 41 | 29 | Day3 | B | 6.632403 |
| 42 | 30 | Day3 | B | 6.171196 |
| 43 | 31 | Day3 | B | 6.306012 |
| 44 | 32 | Day3 | B | 7.253833 |
| 45 | 33 | Day3 | C | 7.594852 |
| 46 | 34 | Day3 | C | 6.915225 |
| 47 | 35 | Day3 | C | 7.220147 |
| 48 | 36 | Day3 | C | 7.298227 |
| 49 | 37 | Day3 | A | 7.573612 |
| 50 | 38 | Day3 | A | 7.36655 |
| 51 | 39 | Day3 | A | 7.560513 |
| 52 | 40 | Day3 | A | 7.289078 |
| 53 | 62 | Day3 | A | 7.287802 |
| 54 | 63 | Day3 | A | 7.155336 |
| 55 | 64 | Day3 | A | 7.394452 |
| 56 | 65 | Day3 | A | 7.465383 |
| 57 | 13 | Day6 | B | 6.976048 |
| 58 | 14 | Day6 | B | 7.222966 |
| 59 | 15 | Day6 | B | 6.584153 |
| 60 | 16 | Day6 | B | 7.013223 |
| 61 | 17 | Day6 | A | 7.569905 |
| 62 | 18 | Day6 | A | 7.459185 |
| 63 | 19 | Day6 | A | 7.504068 |
| 64 | 20 | Day6 | A | 7.801867 |
| 65 | 21 | Day6 | C | 7.598728 |
| 66 | 22 | Day6 | C | 7.475841 |
| 67 | 23 | Day6 | C | 7.511873 |
| 68 | 24 | Day6 | C | 7.518384 |
| 69 | 29 | Day6 | B | 6.618589 |
| 70 | 30 | Day6 | B | 5.854754 |
| 71 | 31 | Day6 | B | 6.125749 |
| 72 | 32 | Day6 | B | 6.96272 |
| 73 | 33 | Day6 | C | 7.5406 |
| 74 | 34 | Day6 | C | 7.379861 |
| 75 | 35 | Day6 | C | 7.344189 |
| 76 | 36 | Day6 | C | 7.362815 |
| 77 | 37 | Day6 | A | 7.805802 |
| 78 | 38 | Day6 | A | 7.764172 |
| 79 | 39 | Day6 | A | 7.789844 |
| 80 | 40 | Day6 | A | 7.616437 |
| 81 | 62 | Day6 | A | NA |
| 82 | 63 | Day6 | A | NA |
| 83 | 64 | Day6 | A | NA |
| 84 | 65 | Day6 | A | NA |
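The layout of the table above can be checked before fitting a mixed model. The sketch below rebuilds only the design columns (Subject, Day, Treatment) and the missingness pattern of Obs; the Obs values themselves are replaced by a placeholder, which is an assumption of this sketch. It confirms the 80 complete observations and 28 subjects, and that every subject belongs to exactly one treatment (Subject nested within Treatment).

```r
# Sketch: verify the study design from the table (placeholder Obs values).
subjects  <- c(13:24, 29:40, 62:65)          # the 28 subject IDs in the table
treatment <- rep(c("B", "A", "C", "B", "C", "A", "A"), each = 4)

design <- data.frame(
  Subject   = rep(subjects, times = 3),
  Day       = rep(c("Day1", "Day3", "Day6"), each = 28),
  Treatment = rep(treatment, times = 3)
)

# Placeholder response: only the NA pattern (last four rows) mirrors the table
design$Obs <- c(rep(0, 80), rep(NA, 4))

# rows with a missing response are dropped when the model is fitted
complete <- design[!is.na(design$Obs), ]
n_obs  <- nrow(complete)
n_subj <- length(unique(complete$Subject))

# number of complete observations in each Treatment-by-Day cell
cell_counts <- table(complete$Treatment, complete$Day)

# nesting check: each subject appears under a single treatment
nested <- all(tapply(design$Treatment, design$Subject,
                     function(x) length(unique(x))) == 1)

print(c(observations = n_obs, subjects = n_subj))
print(cell_counts)
print(nested)
```

The unbalanced Day6 cell for Treatment A reflects the four NA rows for subjects 62 to 65.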
....
- SOCR Home page: http://www.socr.umich.edu