Project Analysis

December 19, 2015

In your final project you will analyze a dataset. The dataset contains a number of explanatory variables (some of them qualitative) and a response variable (you may need to identify which variable is the response). Your goal is to find an appropriate model, which means identifying the variables that are significant and can be used for future predictions. There are a number of model selection methods, and different methods may result in different “best” models. You need to apply those methods and come up with a model that is parsimonious but still has good predictive power. You also need to do a residual analysis to check whether the model assumptions hold and whether there are outliers or influential observations. Finally, you need to check for multicollinearity: a good model should not suffer from serious multicollinearity.
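
One common way to carry out the multicollinearity check is the variance inflation factor (VIF): each predictor is regressed on the remaining predictors, and VIF = 1 / (1 - R^2). The Python sketch below is only an illustration of that idea under stated assumptions. The project itself is done in SAS, the function name `vif` is hypothetical, and the data are synthetic numbers generated for the example, not the project dataset.

```python
import numpy as np

def vif(X):
    """Variance inflation factor for each column of X (n rows, p predictors)."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    factors = []
    for j in range(p):
        y = X[:, j]
        others = np.delete(X, j, axis=1)
        # Regress predictor j on an intercept plus all other predictors.
        A = np.column_stack([np.ones(n), others])
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ coef
        r2 = 1.0 - (resid @ resid) / ((y - y.mean()) ** 2).sum()
        factors.append(1.0 / (1.0 - r2))
    return np.array(factors)

# Synthetic illustration data (assumption: NOT the project dataset).
rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = rng.normal(size=200)
x3 = x1 + 0.05 * rng.normal(size=200)  # nearly a copy of x1

vifs = vif(np.column_stack([x1, x2, x3]))
print(vifs)  # x1 and x3 get large values, x2 stays close to 1
```

A frequently quoted rule of thumb treats a VIF above about 10 as a sign of serious multicollinearity, although the exact cutoff is a matter of judgment.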

You need to submit a report. In your report you should describe the dataset, your goal, your findings, and your assessment of the “best” model you selected (how good it is, what problems you see with it, etc.). Include relevant plots, but the report itself should not contain any SAS output. Submit your SAS code and the relevant SAS output as an appendix or as supplemental material.

In the Rut Depth project, the goal is to determine whether viscosity, surface, base, run, fines and voids are significant predictors of rut depth. Viscosity, surface, base, run, fines and voids are the explanatory variables, and the response variable is rut depth.

In this project there are six explanatory variables. One of them, run, is qualitative and is coded with two indicator (dummy) variables: if run equals 1, the first group indicator equals 1, otherwise it equals 0; if run equals 0, the second group indicator equals 1, otherwise it equals 0. The other five variables (viscosity, surface, base, fines and voids) are quantitative.
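
The indicator coding of run described above can be sketched in a few lines of Python. This is an illustration only: the `run` values are made up for the example, and the names `group1` and `group2` are hypothetical labels for the two group indicators.

```python
import numpy as np

# Made-up run values for illustration (assumption: not the project data).
run = np.array([1, 1, 0, 0, 1, 0])

group1 = (run == 1).astype(int)  # 1 when run equals 1, otherwise 0
group2 = (run == 0).astype(int)  # 1 when run equals 0, otherwise 0

ones = np.ones_like(run)

# With an intercept, both indicators together are redundant,
# because group1 + group2 equals the intercept column.
design_both = np.column_stack([ones, group1, group2])
design_one = np.column_stack([ones, group1])

rank_both = np.linalg.matrix_rank(design_both)
rank_one = np.linalg.matrix_rank(design_one)
print(rank_both, design_both.shape[1])  # rank is lower than the column count
print(rank_one, design_one.shape[1])    # rank equals the column count
```

Because the two indicators always sum to 1, a regression model that includes an intercept can use only one of them. Including both would create perfect multicollinearity.
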

Because the relationship between Y and the individual X’s was not clear to us, we applied a transformation such as y = log(rutdepth) to obtain clearer scatterplots that show the relationship between the explanatory variables and the response variable.
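
To illustrate why a log transformation of the response can make a scatterplot easier to read, the Python sketch below generates synthetic data in which the response grows exponentially with one predictor. Both the exponential form and the numbers are assumptions chosen for the example. They say nothing about the actual rut depth data.

```python
import numpy as np

# Synthetic data (assumption: exponential relationship, not the project data).
rng = np.random.default_rng(1)
x = rng.uniform(0.0, 5.0, size=300)
y = np.exp(0.5 + 1.0 * x + rng.normal(scale=0.2, size=300))

# Linear correlation of x with the raw response and with the log response.
corr_raw = np.corrcoef(x, y)[0, 1]
corr_log = np.corrcoef(x, np.log(y))[0, 1]

print(corr_raw, corr_log)  # the log-scale relationship is closer to linear
```

When the log-scale correlation is clearly stronger, a straight-line model for log(y) is more plausible than one for y itself. The scatterplots and the residual analysis should still be inspected before settling on the transformation.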
