# In a multilevel linear regression, how does the reference level affect other levels/factors and which reference level ought to be selected?

by Matthew   Last Updated October 10, 2019 00:19 AM

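In the `summary()` output below, Heavy smoker is the reference level, since it is the only level of `Smoke` that does not appear among the coefficients. How does this choice affect the estimates for the other levels, and which level ought to be used as the reference instead? Why?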

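``````
library(MASS)        # provides the survey data set

survey               # print the raw data

# Keep only the rows with no missing values
surveyNA <- survey[complete.cases(survey), ]
surveyNA

# Height by smoking status, with the group means marked as crosses
boxplot(Height ~ Smoke, data = surveyNA)
points(1:4, tapply(surveyNA$Height, surveyNA$Smoke, mean, na.rm = TRUE), pch = 4)

# Smoke is a factor, so lm() uses its first level (Heavy) as the reference
survfit3 <- lm(Height ~ Smoke, data = surveyNA)
summary(survfit3)
``````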
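To make the question concrete, here is a minimal sketch of how the reference level could be changed with `relevel()` and what that does to the fit. The choice of `Never` as the alternative reference, and the names `SmokeNever`, `fit_heavy` and `fit_never`, are assumptions made purely for illustration; they are not part of the original code.

```r
library(MASS)

surveyNA <- survey[complete.cases(survey), ]

# Default treatment coding: the first factor level (Heavy) is the reference
fit_heavy <- lm(Height ~ Smoke, data = surveyNA)

# Hypothetical alternative: a copy of Smoke with "Never" moved to the front
surveyNA$SmokeNever <- relevel(surveyNA$Smoke, ref = "Never")
fit_never <- lm(Height ~ SmokeNever, data = surveyNA)

# Intercept = mean Height of the reference group;
# every other coefficient = that group's mean minus the reference mean
coef(fit_heavy)
coef(fit_never)

# Same model, different parameterisation: the fitted values should agree
all.equal(fitted(fit_heavy), fitted(fit_never))
```

Under treatment coding, changing the reference only changes which pairwise differences the coefficients (and their t-tests) report; the fitted group means, R-squared and overall F-test stay the same. A common rule of thumb is to pick a natural baseline or a well-populated group as the reference, but that is a judgment call rather than a requirement.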
