### Code for Lecture 2
### how to build linear models

### start with simple regression
datum=read.csv(file.choose())                 ### first import some data
head(datum)                                   ### see what we imported
plot(datum$Elevation,datum$Proportion)        ### plot the relationship
results=lm(Proportion~Elevation,data=datum)   ### runs a simple linear regression
summary(results)                              ### spits out the analysis
results                                       ### just shows you the object 'results'
### note individual coefficient estimates and constant, as well as t-tests of estimates
### note F-statistic of full model - equivalent to the t-test of the slope in single variable regression (F = t^2)
anova(results)                                ### generate an anova table of regression - not very useful here, as it adds little beyond the p-value already shown by summary()
coef(results)                                 ### shows you the coefficient estimates from 'results'
residuals(results)                            ### shows you the residuals from the regression fit of 'results'
resid(results)                                ### also shows you the residuals (resid is an alias for residuals)
plot(resid(results))                          ### plots the residuals in case order (order in which encountered)
plot(datum$Elevation,resid(results))          ### plots the residuals against the predictor (Elevation)

### run a simple anova
datum$PreyDens=factor(datum$PreyDens)         ### treat PreyDens as a grouping factor (read.csv in R >= 4.0 leaves text columns as character)
plot(datum$PreyDens,datum$Proportion)
results=lm(Proportion~PreyDens,data=datum)
summary(results)
anova(results)
### Note that output just tells you difference between each group and the reference group
### Doesn't tell you difference between two non-reference groups listed in output
### to test for differences between groups, must combine them
### in order to combine must use dummy coded variables
results=lm(Proportion~Low+Medium+High,data=datum)
summary(results)
anova(results)                                ### note that anova now gives significance of each group rather than group as a whole
results2=lm(Proportion~Low+I(Medium+High),data=datum)   ### combine groups in dummy coding
### NOTE THAT PLUS IN A FORMULA DOESN'T REALLY MEAN ADD, JUST MEANS INCLUDE BOTH IN MODEL
### inside I() plus does mean arithmetic addition, so I(Medium+High) is a single dummy for 'Medium or High'
summary(results2)
anova(results2,results)                       ### compares simpler model (combined groups) to more complex model
### if p-value is significant, then complex model is better, groups are different
### Also can remove any groups that are not different from the reference
results3=lm(Proportion~I(Medium+High),data=datum)
summary(results3)
anova(results3,results)

### general linear modeling is a multi-variable procedure, analyzes all variables at once
### reasons to do 'multiple regression' rather than univariate analyses:
###   faster, more efficient
###   accounts for collinearity among variables
###   allows one to test for interactions
###   allows one to account for pseudoreplication, random effects, and nested designs
results4=lm(Proportion~Elevation+Low+Medium+High,data=datum)   ### NOTE THAT PLUS DOESN'T REALLY MEAN ADD, JUST MEANS INCLUDE BOTH IN MODEL
summary(results4)
anova(results4)
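
### A minimal sketch of how one might test for an interaction, one of the reasons
### for multiple regression listed above. This is an illustration, not part of the lecture analysis.
### ASSUMPTIONS: the data frame 'sim' is simulated (hypothetical), it only borrows the
### column names Proportion, Elevation and PreyDens; the group labels, slopes and
### noise level are made up so that the example runs without the lecture data file.
set.seed(42)                                  ### make the simulation reproducible
n=120
sim=data.frame(
  Elevation=runif(n,0,1000),
  PreyDens=factor(rep(c("Low","Medium","High"),each=n/3),
                  levels=c("Low","Medium","High"))   ### "Low" is the reference group
)
### hypothetical truth: the Elevation slope is steeper in the High group
slope=ifelse(sim$PreyDens=="High",0.0006,0.0002)
sim$Proportion=0.2+slope*sim$Elevation+rnorm(n,sd=0.05)

fit_add=lm(Proportion~Elevation+PreyDens,data=sim)   ### additive model: one common slope
fit_int=lm(Proportion~Elevation*PreyDens,data=sim)   ### '*' includes main effects plus the interaction
comparison=anova(fit_add,fit_int)             ### same nested-model comparison as anova(results2,results)
print(comparison)
### if the p-value is significant, the slope on Elevation differs among PreyDens groups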