QBM117 Business Statistics
Assignment 1
Due date: 3 August 2014
Value: 10%
Rationale
Assignment 1 is designed to assess the following learning outcomes:
be able to explain the standard uses of Statistics in the media and in business
environments, and judge whether the statistical methodology and conclusions
drawn are appropriate.
be able to summarise and interpret data graphically and numerically.
be able to use a statistical package to analyse data appropriately, and then interpret
the output.
Presentation
The assignment must be neatly handwritten, with any Excel output inserted at the
appropriate place in the assignment, not in an appendix at the back of the
assignment. Justification for requiring the assignment to be handwritten is provided in
the subject outline. Use the template provided in the assignment folder in resources when
preparing this assignment.
If using EASTS, you must submit ONE file per assignment and that file must be PDF or
DOC or DOCX. Excel files are not permitted. EASTS cannot print them.
Question 1 (24 marks)
a. A consumer group asked a random sample of 20 drivers to keep a record of the
number of discount petrol vouchers used by each driver over a six month period.
The data showing the number of vouchers used follows.
26 24 12 15 8 4 6 15 18 2 7 5 0 3 5 17 10 5 9 12
For these data, calculate
i. the mean number of vouchers used.
ii. the standard deviation of the number of vouchers used.
iii. the median number of vouchers used.
iv. the range of the number of vouchers used.
Use the statistics functions on your calculator for parts i. and ii. above. Do not use
the formulae provided in the text.
(4 marks)
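As a cross-check on calculator work, the four statistics in part a. can be sketched in Python. This is a minimal illustration only: the values below are hypothetical, not the voucher data, and the helper name `summarise` is an assumption. It uses the sample standard deviation (n - 1 divisor), on the assumption that this is what is intended for data described as a random sample.

```python
import statistics


def summarise(values):
    """Return the mean, sample standard deviation, median and range."""
    return {
        "mean": statistics.mean(values),
        "stdev": statistics.stdev(values),  # sample SD, n - 1 divisor
        "median": statistics.median(values),
        "range": max(values) - min(values),
    }


# Hypothetical values for illustration (not the assignment data)
sample = [2, 4, 4, 6, 9]
summary = summarise(sample)
print(summary)
```

The calculator's statistics mode should give the same figures when the sample standard deviation key is used rather than the population one.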
b. The opening daily share price for the Commonwealth Bank for the period 3/3/14
to 2/7/14 inclusive averaged $78.84 with a standard deviation of $2.45. The
opening daily share price in Origin Energy for the same period averaged $14.67
with a standard deviation of $0.35.
i. Calculate the variability of the daily share price for each company by
calculating the coefficient of variation.
ii. Based on the coefficient of variation calculated in part i., which is the
riskier company to invest in? Explain.
iii. Why is the coefficient of variation a better measure of risk than the
standard deviation in this case?
(6 marks)
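The coefficient of variation expresses the standard deviation as a percentage of the mean, which allows variability to be compared across series with very different price levels. A minimal sketch follows; the function name and the two sets of figures are hypothetical and are not the share price figures above.

```python
def coefficient_of_variation(mean, sd):
    """Return the coefficient of variation as a percentage of the mean."""
    if mean == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return sd / mean * 100


# Hypothetical figures for two series (not the assignment values)
cv_a = coefficient_of_variation(mean=50.0, sd=5.0)
cv_b = coefficient_of_variation(mean=8.0, sd=1.0)
print(f"Series A: {cv_a:.2f}%  Series B: {cv_b:.2f}%")
```

In this made-up example Series A has the larger standard deviation but the smaller coefficient of variation, because its mean is much higher.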
c. The boxplot which follows shows the daily closing prices for Commonwealth
Bank shares for the period 10/4/14 to 2/7/14 inclusive.
[Figure: Boxplot showing closing prices for Commonwealth Bank shares 10/4/14 to
2/7/14. Horizontal axis: price per share ($), scale from 76 to 83.]
Use the boxplot above to answer the following questions.
i. Would the mean closing price over this period be higher or lower than the
median? Justify your answer. Calculations are not necessary.
ii. Estimate the lowest closing price to the nearest 50 cents. Is this an outlier?
Justify your answer.
iii. Complete this sentence. Fifty percent of the closing prices are between
_________ and _________. Express each amount to the nearest 50 cents.
(7 marks)
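One common convention for flagging outliers on a boxplot is the 1.5 x IQR rule: a value is an outlier if it lies more than 1.5 interquartile ranges below the first quartile or above the third quartile. The sketch below assumes that convention (check the text for the rule used in this subject); the function names and the quartile values are hypothetical and are not read from the boxplot above.

```python
def outlier_fences(q1, q3, k=1.5):
    """Return the (lower, upper) fences under the k x IQR rule."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def is_outlier(value, q1, q3, k=1.5):
    """True if value lies outside the fences."""
    lower, upper = outlier_fences(q1, q3, k)
    return value < lower or value > upper


# Hypothetical quartiles (not taken from the boxplot above)
fences = outlier_fences(q1=20, q3=30)
print(fences)
print(is_outlier(3, q1=20, q3=30), is_outlier(10, q1=20, q3=30))
```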
d. The weekly income for a sample of seventy randomly selected petrol stations in
NSW was recorded in a given week. The data were summarised and presented in
the table which follows.
Weekly income ($’000s)               Frequency
> 20 up to and including 40           6
> 40 up to and including 60           9
> 60 up to and including 80          25
> 80 up to and including 100         20
> 100 up to and including 120         8
> 120 up to and including 140         2
Use the statistics functions on your calculator to determine the approximate
mean and standard deviation weekly income for petrol stations in NSW.
Do not use the formulae provided in the text.
(3 marks)
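For grouped data, the mean and standard deviation are approximated by treating every observation in a class as if it sat at the class midpoint. A minimal sketch of that approach follows, assuming a sample standard deviation (n - 1 divisor); the function name and the small frequency table are hypothetical and are not the petrol station data.

```python
import math


def grouped_mean_sd(classes):
    """Approximate mean and sample SD from (lower, upper, frequency) classes.

    Each class is represented by its midpoint, weighted by its frequency.
    """
    n = sum(freq for _, _, freq in classes)
    mids = [((lower + upper) / 2, freq) for lower, upper, freq in classes]
    mean = sum(mid * freq for mid, freq in mids) / n
    variance = sum(freq * (mid - mean) ** 2 for mid, freq in mids) / (n - 1)
    return mean, math.sqrt(variance)


# Hypothetical frequency table (not the assignment data)
table = [(0, 10, 2), (10, 20, 5), (20, 30, 3)]
approx_mean, approx_sd = grouped_mean_sd(table)
print(round(approx_mean, 2), round(approx_sd, 2))
```

On a calculator, the equivalent is to enter each midpoint as the data value with its class frequency.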
e. Identify whether the following proposed study is an example of descriptive
statistics or inferential statistics and justify your choice.
Estimating the true proportion of households in Canberra that have at least three
dependents, from a random sample of 100 Canberra households.
(2 marks)
f. Data were collected on the incomes of all employees in a large company. What is
the most likely shape of the distribution of these incomes? Justify your choice.
(2 marks)
Question 2 (62 marks)
Download the data set ‘auction data.xls’ from the Assignment folder in the resources
section of Interact. The data provided in auction data.xls show the Sydney auction results
for the week ending 22 June 2014. The variables in this data set are: Beds, Type, Price
and Result representing the number of bedrooms, the type of property (house or unit), the
selling price (if sold) and the result of the auction respectively, as well as the auction date
and the name of the selling agent.
a. For each of the variables in the list below, identify the type of data recorded. State
whether it is quantitative or qualitative and include the level of the data (nominal,
ordinal, interval or ratio).
i. Beds
ii. Type
iii. Price
(6 marks)
b. Most real data sets you encounter will contain errors. This one is no exception.
Read the document ‘working with real data sets.pdf’ which explains how to
identify errors in a data set and how to deal with them before going any
further.
List any four different types of errors you have found in this data set and explain
for each one why you have decided it was an error or possible error. For
example, do not list four entries which all had the price missing.
You may want to leave completing this question till later. As you work with the
data set you will encounter some of these errors so just make a note of them as
you find them.
Since we cannot contact the real estate agents to follow up on these errors, for the
purpose of this assignment we will work with the data set as best we can.
(4 marks)
c. Sharon and Mark are property owners in the Sydney region who are planning to
sell their two bedroom unit over the next few months. They are considering
putting it up for auction so are interested in using these data to gain an insight into
the current Sydney auction market.
Using the complete data set, generate a three way pivot table report of ‘beds’ by
‘type’ by ‘result’. Use ‘type’ and ‘beds’ as row labels.
Include the table as part of the submitted assignment.
Use the data in the pivot table to answer the following vendors’ questions about
the properties listed for auction in Sydney for the week ending 22 June 2014.
i. How many properties were originally listed for auction for the week in
question? How many of these were units?
ii. How many houses were withdrawn from sale? How many units were
withdrawn from sale?
iii. How many 2 bedroom units were passed in? Express this as a percentage
of all the units listed for auction that week excluding all units that were
withdrawn from sale.
iv. How many 2 bedroom units were sold at auction that week? How many 2
bedroom houses?
v. Of all the properties listed for auction, how many 3 bedroom units,
including those that were sold prior to and those that were sold after the auction, were
sold that week? Then, express this as a percentage of all the 3 bedroom
units listed for auction that week excluding those that were withdrawn
from sale.
(13 marks)
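The pivot table itself must be produced in Excel, but the counting it performs can be sketched in Python to clarify what each cell means and how the percentages exclude withdrawn properties. The sketch below is an illustration under assumptions: the records and the result labels ("sold", "passed in", "withdrawn") are hypothetical and may not match the spelling used in the data set, and the function name is invented.

```python
from collections import Counter


def pivot_counts(records):
    """Count records for each (Type, Beds, Result) combination."""
    return Counter((r["Type"], r["Beds"], r["Result"]) for r in records)


# Hypothetical records (not taken from auction data.xls)
records = [
    {"Type": "unit", "Beds": 2, "Result": "sold"},
    {"Type": "unit", "Beds": 2, "Result": "passed in"},
    {"Type": "unit", "Beds": 3, "Result": "sold"},
    {"Type": "house", "Beds": 3, "Result": "sold"},
    {"Type": "house", "Beds": 3, "Result": "withdrawn"},
    {"Type": "unit", "Beds": 2, "Result": "sold"},
]
counts = pivot_counts(records)

# Units listed, excluding any that were withdrawn from sale
units_not_withdrawn = sum(
    n for (kind, _, result), n in counts.items()
    if kind == "unit" and result != "withdrawn"
)
# 2 bedroom units passed in, as a percentage of that total
pct_passed_in = counts[("unit", 2, "passed in")] / units_not_withdrawn * 100
print(units_not_withdrawn, pct_passed_in)
```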
d. Separate the data into two data sets, one consisting of the units data only and one
containing the houses data only.
i. Use Excel to generate separate tables of descriptive statistics for houses
and units for the variable ‘Price’ and include them in your assignment
submission. Round both means to the nearest thousand dollars and both
standard deviations to the nearest ten dollars.
ii. What was the price of the cheapest property sold that week? Look further
afield to include information about what type of property it was, how
many bedrooms it had, whether it sold at auction or before or after, and
which real estate agent sold it.
iii. The sample variance may have been expressed in an unusual way in one or
both of these tables generated in part (i.). Explain this unusual notation
and the numerical value it represents (in one or both tables).
(8 marks)
e. Using the data set for units only, use Excel to prepare a frequency distribution
and histogram of the variable ‘Price’ for the unit data. Use $500 000 as the upper
limit of the first class and a class width of $500 000.
(5 marks)
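The class scheme described in part e. can be sketched as follows to show how values are allocated to classes. The sketch assumes each class includes its upper limit and excludes its lower limit, which is how Excel's Histogram tool treats bin values as far as I am aware. The first upper limit and class width are those given above; the prices and the function name are hypothetical.

```python
import math
from collections import Counter


def frequency_distribution(values, first_upper, width):
    """Count values into classes keyed by their upper limit.

    A value equal to a class limit is counted in the lower class.
    """
    counts = Counter()
    for value in values:
        # Class 0 ends at first_upper; later classes each add one width
        k = max(0, math.ceil((value - first_upper) / width))
        counts[first_upper + k * width] += 1
    return dict(sorted(counts.items()))


# Hypothetical prices (not taken from auction data.xls)
prices = [300_000, 500_000, 500_001, 750_000, 1_200_000]
freq = frequency_distribution(prices, first_upper=500_000, width=500_000)
print(freq)
```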
f. After preparing this histogram, discuss whether the choice of classes suggested
above is appropriate. Refer to important aspects such as the number of classes, the
width of the classes and whether all data are included in the classes chosen.
(3 marks)
g. Generate a boxplot for the variable ‘Price’ for the units data only. Include the
5-number summary generated by Excel and the boxplot with your
assignment submission.
(4 marks)
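A 5-number summary consists of the minimum, first quartile, median, third quartile and maximum. A minimal Python sketch is given below for reference. It uses the "inclusive" quartile method from the standard library, which is an assumption: the Excel tool used for this subject may compute quartiles by a different convention and give slightly different values. The data and function name are hypothetical.

```python
import statistics


def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum)."""
    q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return min(values), q1, q2, q3, max(values)


# Hypothetical values (not taken from auction data.xls)
data = [1, 2, 3, 4, 5, 6, 7, 8, 9]
summary5 = five_number_summary(data)
print(summary5)
```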
h. Answer the following questions regarding the data for the units only and indicate
the particular output in (d), (e) and (g) which provided the answer.
i. How many units had a sale price listed?
ii. How many outliers are there in the distribution of the selling prices of
units? What is the value of the largest outlier?
iii. 25% of units sold for $x or less. What is x?
iv. Comment on the shape of the distribution of the ‘Price’ variable for units
only (skewed, symmetric, direction of skewness if relevant, unimodal,
bimodal, etc.). Provide at least three items of supporting evidence from the
output generated in parts (d), (e) and (g).
(10 marks)
i. If a media outlet were to quote the average selling price (of units listed for auction
that week), would it be more appropriate to quote the mean or the median price?
Why?
(2 marks)
j. Sharon and Mark would like to know the selling success rate of all the
properties that were listed. For all the properties that were listed for auction,
generate a two way pivot table of ‘Type’ by ‘Result’. (Hint: include ‘Result’ in
both the column and in the body of the table.) From this pivot table, generate a
single horizontal 100% component bar chart with the variable ‘type’ plotted along
the vertical axis and the different types of ‘result’ making up the components of
each bar. Include both the pivot table and the bar chart with your assignment
submission.
Would it be correct for Sharon and Mark to conclude from this chart that more
units than houses were sold prior to the auction? Explain.
(7 marks)
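A 100% component bar chart plots each result as a percentage of its own row total, so every bar has the same length. The conversion from pivot table counts to those percentages can be sketched as below; the counts, the result labels and the function name are hypothetical and are not taken from the data set.

```python
def row_percentages(counts):
    """Convert {type: {result: count}} into percentages of each row total."""
    percentages = {}
    for kind, results in counts.items():
        total = sum(results.values())
        percentages[kind] = {
            result: count / total * 100 for result, count in results.items()
        }
    return percentages


# Hypothetical counts (not taken from auction data.xls)
table = {
    "house": {"sold": 30, "passed in": 10},
    "unit": {"sold": 6, "passed in": 2},
}
pct = row_percentages(table)
print(pct)
```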
NB Don’t forget to go back and complete Q2 part b.
