Project instructions:
The statistical analysis can be carried out in SPSS or Excel (alternatively, you can mix and match using both packages if you wish). Your answers, including all relevant statistical analysis and output, should be provided in a single MS Word document. The notional word limit is 1,500 words, but your submission is likely to be considerably shorter in practice.
The deadline for submission is: 3pm Wednesday 2 April 2014
(The work is to be submitted electronically to the LTS Hub)
The bhps17 dataset (which is available in both SPSS and EXCEL file formats) provides information about UK individuals and households who were surveyed in 2007 and 2008 (this dataset was previously considered in Seminars 4 & 5). The datasets and the list of variables, together with how they are defined and coded, can be found in the assessment folder on the Blackboard site for the module. Use the bhps17 dataset to answer the questions that follow.
Some General Points:
• Please read through the questions and follow the instructions carefully. Plan what you intend to do BEFORE sitting down at the computer. Your time at the computer will be more productive if you know what you plan to do beforehand.
• Make sure all missing values for the variables you are considering have been dealt with before doing the statistical analysis.
• Make sure the statistical analysis you undertake is appropriate for the variable(s) you are considering (e.g. think about the measurement level of the variable(s) you are analysing).
• The statistical analysis can be done in either Excel or SPSS (or you can use both), but the output and discussion should be contained in a single file (preferably a Word document; PDF is also acceptable).
• Presentation matters. Make sure graphs and tables are clearly labelled. Use a font size no smaller than 11 point in Times New Roman, Arial or Calibri.
• The coursework draws upon the skills you will have learnt in the computer-based seminars and the material presented in the lectures.
• This is an individual assignment. Be aware of UEA policies on plagiarism and collusion.
• You are not required to include references and a bibliography in this assignment.
• You are welcome to use Blikbook to post questions and queries relating to the coursework.
Question 1
This question involves conducting statistical analysis associated with happiness (also referred to as life satisfaction or well-being).
(i) Identify the variable in the bhps17 dataset that could be used as a measure of happiness (life satisfaction).
(ii) With the aid of appropriate graphical and numerical methods, describe the distribution of the happiness variable. (Make sure all missing values have been removed before conducting the statistical analysis.)
(iii) Is there evidence to suggest the mean level of happiness is significantly different from 5?
(iv) It has been claimed that commuting affects happiness, see for example:
Use the bhps17 dataset to establish the extent to which happiness is influenced by the way people travel to work.
(v) How do the results of your analysis compare to results found in the above report? Provide one or two possible explanations for any differences observed.
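As a purely illustrative sketch of the kind of calculation behind part (iii), the Python snippet below computes a one-sample t statistic for the null hypothesis that a mean equals 5. The scores are made up and are not taken from bhps17, the function name one_sample_t is hypothetical, and the coursework itself should still be carried out in SPSS or Excel.

```python
import math
import statistics

def one_sample_t(values, mu0):
    """Return (mean, t statistic, degrees of freedom) for H0: mean == mu0."""
    n = len(values)
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)      # sample standard deviation (divides by n - 1)
    se = sd / math.sqrt(n)             # standard error of the mean
    t = (mean - mu0) / se
    return mean, t, n - 1

# Made-up scores for illustration only, NOT taken from bhps17
scores = [5, 6, 4, 7, 5, 6, 5, 3, 6, 5, 7, 4]
mean, t_stat, df = one_sample_t(scores, mu0=5)
print(f"mean = {mean:.2f}, t = {t_stat:.3f}, df = {df}")
```

For a two-sided test, the absolute value of the t statistic would then be compared with the critical value of the t distribution on the stated degrees of freedom, or converted to a p-value by the statistics package.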
Question 2
This question involves a statistical analysis of savings behaviour.
(i) In the bhps17 dataset there are two variables that describe the savings behaviour of individuals: whether the individual saves and the amount saved.
Why is it important to know how the "whether the individual saves" variable is coded when it comes to addressing the missing values associated with the "amount saved" variable?
(ii) Restrict the sample to those individuals who save a positive amount. What is the sample size? Is there evidence to suggest mean savings is significantly greater than £150 per month?
(iii) Using a suitable measure of income (again making sure missing values have been dealt with), construct a scatterplot and then estimate a regression equation to assess whether there is evidence that income is positively related to savings. How strong is the relationship?
Interpret the regression coefficients and the coefficient of determination, and undertake appropriate hypothesis tests.
(iv) Is there any evidence to suggest there are outliers in the scatterplot constructed in part (iii)? Undertake appropriate statistical analysis to identify and assess the impact of outliers.
(v) Is there evidence to suggest there are lurking variables in the scatterplot? Undertake appropriate statistical analysis to establish the impact of one potential lurking variable on the relationship identified in part (iii).
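As a rough sketch of the ideas in parts (iii) and (iv), the Python snippet below fits a least-squares line to made-up income and savings figures, then refits after dropping the observation with the largest absolute residual to show how a single point can change the slope and the coefficient of determination. The data are invented and not taken from bhps17, the helper fit_line is hypothetical, and dropping the largest residual is only one simple way of looking at outliers, not a prescribed method.

```python
import statistics

def fit_line(x, y):
    """Ordinary least squares for y = a + b*x; returns (a, b, r_squared)."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx                      # slope
    a = my - b * mx                    # intercept
    return a, b, sxy ** 2 / (sxx * syy)

# Made-up monthly figures for illustration only, NOT taken from bhps17
income = [1000, 1500, 2000, 2500, 3000, 3500]
savings = [50, 80, 90, 130, 150, 900]  # last point is a deliberate outlier

a_all, b_all, r2_all = fit_line(income, savings)

# Drop the observation with the largest absolute residual and refit
residuals = [s - (a_all + b_all * i) for i, s in zip(income, savings)]
worst = max(range(len(residuals)), key=lambda k: abs(residuals[k]))
income_trim = income[:worst] + income[worst + 1:]
savings_trim = savings[:worst] + savings[worst + 1:]
a_trim, b_trim, r2_trim = fit_line(income_trim, savings_trim)

print(f"all points:   slope = {b_all:.4f}, R^2 = {r2_all:.3f}")
print(f"without #{worst}:  slope = {b_trim:.4f}, R^2 = {r2_trim:.3f}")
```

Comparing the two fits side by side is the point of the exercise: a large change in the slope or in R-squared after removing one observation suggests that observation is influential.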