ANOVA: Analyzing Differences in Multiple Groups
Learning Objectives
After reading this chapter, you should be able to:
• Describe the similarities and differences between t-tests and ANOVA.
• Explain how ANOVA can help address some of the problems and limitations associated with t-tests.
• Use ANOVA to analyze multiple group differences.
• Use post hoc tests to pinpoint group differences.
• Determine the practical importance of statistically significant findings using effect sizes with eta-squared.

CHAPTER 5

Chapter Overview
5.1 From t-Test to ANOVA
  The ANOVA Advantage
  Repeated Testing and Type I Error
5.2 One-Way ANOVA
  Variance Between and Within
  The Statistical Hypotheses
  Measuring Data Variability in the ANOVA
  Calculating Sums of Squares
  Interpreting the Sums of Squares
  The F Ratio
  The ANOVA Table
  Interpreting the F Ratio
  Locating Significant Differences
  Determining Practical Importance
5.3 Requirements for the One-Way ANOVA
  Comparing ANOVA and the Independent t
  One-Way ANOVA on Excel
5.4 Another One-Way ANOVA
Chapter Summary

Introduction

During the early part of the 20th century, R. A. Fisher worked at an agricultural research station in rural southern England. In his work analyzing the effect of pesticides and fertilizers on results like crop yield, he was stymied by the limitations in Gosset's independent samples t-test, which allowed him to compare just two samples at a time. In the effort to develop a more comprehensive approach, Fisher created a statistical method he called analysis of variance, often referred to by its acronym, ANOVA, which allows for making multiple comparisons at the same time using relatively small samples.

5.1 From t-Test to ANOVA

The process for completing an independent samples t-test in Chapter 4 illustrated a number of things. The calculated t value, for example, is a score based on a ratio, one determined by dividing the variability between the two groups (M1 − M2) by the variability within the two groups, which is what the standard error of the difference (SEd) measures. So both the numerator and the denominator of the t-ratio are measures of data variability, albeit from different sources. The difference between the means is variability attributed primarily to the independent variable, which is the group to which individual subjects belong. The variability in the denominator is variability for reasons that are unexplained—error variance in the language of statistics.


Key Terms: Analysis of variance is a test of significant differences among two or more independent groups, when the IV is nominal and the DV is interval or ratio.

In his method, ANOVA, Fisher also embraced this pattern of comparing between-groups variance to within-groups variance. He calculated the variance statistics differently, as we shall see, but he followed Gosset's pattern of a ratio of between-groups variance compared to within.

The ANOVA Advantage
ANOVA and the t-test answer the same question: are differences between groups statistically significant? Why bother with another test that answers the same question the t-test answers? Suppose a utility company wants to compare the per-customer consumption of natural gas in three areas: a, b, and c. Why not answer the question by performing three t-tests as follows?

Test 1 compares area a to area b.
Test 2 compares area b to area c.
Test 3 compares area a to area c.

Although the three tests involve all possible comparisons, there are two problems. The first is that covering all possible comparisons quickly becomes unwieldy when there are many groups involved in the analysis. If there were five different areas in the natural gas consumption example and they were labeled a through e, note the number of comparisons needed to cover all possible combinations:

1. a to b
2. a to c
3. a to d
4. a to e
5. b to c
6. b to d
7. b to e
8. c to d
9. c to e
10. d to e

Even if the sheer number of comparisons were the only problem, conducting 10 tests in order to cover all possible combinations would strain the patience of most analysts.

There is another advantage to ANOVA over t-tests. The t-test accommodates just one independent variable. Area "a" can be compared to area "b," or women can be compared to men, or Republicans to Democrats, but there is no way to combine IVs and compare, for example, female Republicans to male Democrats, or gas consumption by single-family residences in area "a" to consumption by multifamily residences in area "b." Factorial ANOVA, which is covered in Chapter 6, besides handling any number of groups, will also accommodate any number of independent variables, as long as they are categorical variables.
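The growth in the number of pairwise comparisons follows the combinations formula k(k − 1)/2. As a quick illustration (this sketch is not part of the original text), the counts for the three-area and five-area examples can be confirmed in Python:

```python
from math import comb

# Number of distinct pairwise t-tests needed to compare every pair of k groups.
def pairwise_tests(k: int) -> int:
    return comb(k, 2)  # equivalent to k * (k - 1) / 2

print(pairwise_tests(3))  # three areas a, b, c -> 3 comparisons
print(pairwise_tests(5))  # five areas a through e -> 10 comparisons
```

With 10 groups the count climbs to 45, which makes the case for a single omnibus test even stronger.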


Repeated Testing and Type I Error
But there is another problem, and it is more insidious. Recall that the potential for type I error is indicated by the level at which the test is conducted. When testing at p = .05, any significant finding will actually be a type I error an average of 5% of the time. However, that level of error is based on the assumption that each test is completely independent; it assumes that every time a statistical test is completed, it is conducted with fresh data. If statistical testing is done repeatedly with the same data, the potential for type I error doesn't remain fixed at .05 (or whatever the level of the testing is), but rather grows. In fact, if 10 tests are conducted in succession with the same data, as with the groups labeled a, b, c, d, and e above, and each finding is significant, by the time the 10th test is completed, the potential for alpha error is .40! The potential for alpha error for any number of repeated tests can be calculated, although we will not bother to do it here. The point is that after 10 statistically significant findings, the probability of a type I error is 4 in 10, or 40%! The accuracy in determining significance is reduced nearly to the level of probability offered by a coin flip (the probability of obtaining heads or tails in a coin flip is 50%). The foregoing brings us to this: multiple t-tests with the same data are not an option.

Key Terms: Factorial ANOVA is ANOVA with more than one independent variable. One-way ANOVA involves just one IV.

Review Question A: When successive tests with the same data indicate significance, what happens to the probability of type I error?
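The inflation described above can be reproduced with the standard familywise error formula, 1 − (1 − α)^m, where m is the number of independent tests at level α. This short sketch (an illustration, not part of the original text) recovers the roughly .40 figure quoted for 10 tests:

```python
# Familywise type I error: probability of at least one false positive
# across m independent tests, each conducted at level alpha.
def familywise_alpha(alpha: float, m: int) -> float:
    return 1 - (1 - alpha) ** m

print(round(familywise_alpha(0.05, 1), 3))   # a single test: 0.05
print(round(familywise_alpha(0.05, 10), 3))  # ten tests: about 0.40
```

The exact value for 10 tests is 1 − .95^10 ≈ .401, the ".40" the chapter cites.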

5.2 One-Way ANOVA

Analysis of variance allows for one test to make comparisons between any number of groups so that there is just one probability for alpha error. In the example above, the five groups can be compared for statistically significant differences in the same analysis. The result will indicate whether there are significant differences in natural gas consumption anywhere among the several groups.

Here, the focus is on ANOVA in its simplest form, the procedure called one-way ANOVA, with the "one" indicating just one independent variable. In that regard this form of ANOVA is similar to the independent samples t-test. The difference is that the IV in ANOVA can have two or more categories. In theory, there is no upper limit to the number.

Variance Between and Within
Fisher’s test is based on the understanding that when several subjects are measured on some characteristic, the scores can vary for some combination of two reasons, either because of the impact of the independent variable (their group membership) or because of the error variance that stems from other, uncontrolled influences.


The F ratio that is calculated as a test statistic in ANOVA is a measure including the IV effect, divided by a measure that is entirely error variance. When F meets or exceeds a critical value, it indicates that the effect of the independent variable is great enough that the difference between at least two of the groups is not random. When the F ratio is smaller than a critical value, it indicates that differences between groups that can be attributed to the independent variable are not significant compared to the error variance. Three groups of the same size selected from one population might be represented by the three distributions in Figure 5.1. They do not have exactly the same means because of sampling error. Even randomly selected samples are rarely identical, but they were all drawn from a common population.

Figure 5.1: Three groups drawn from the same population

The range within each of the three groups reflects the fact that even people in the same group will often differ regarding whatever is measured. For example, someone interested in analyzing job satisfaction among workers in the same industry will find that job satisfaction varies, even among people of the same gender, the same age, and the same level of experience. The differences within the group indicate the error variance. The issue in analysis of variance is whether the different manifestations of the IV create enough variability between the groups that the ratio of between-groups variance to within-groups variance exceeds a critical value. In other words, do the multiple samples still represent populations with the same mean? Alternatively, the IV, also sometimes called the “grouping variable,” can be a particular intervention or treatment. For example, if three groups of workers are offered three different incentives, do these different incentives affect job satisfaction differently (Figure 5.2), so that the three groups no longer represent populations with the same means?


Figure 5.2: Three groups after a treatment or an intervention

The within-groups variability in these three distributions is the same as it was in the distributions in Figure 5.1. It is the between-groups variability that has changed. More particularly, it’s the difference between the group means that has changed. Although there was some between-groups variability before the treatment, it was the effect of sampling variability. After the treatment the differences between means are much greater. F will indicate whether the differences are great enough to be statistically significant.

The Statistical Hypotheses
For the t-test the null hypothesis was written H0: μ1 = μ2, indicating that the two samples involved were drawn from populations with the same means. For a one-way ANOVA with three groups, the null hypothesis indicates that three samples represent populations with the same means:

H0: μ1 = μ2 = μ3

For the alternate hypothesis, however, there is not just one possible alternative. Each of the following outcomes might occur:

a. HA: μ1 ≠ μ2 = μ3: Sample 1 represents a population with a mean value different from the means of the populations represented by samples 2 and 3.
b. HA: μ1 = μ2 ≠ μ3: Samples 1 and 2 represent populations with mean values different from the mean of the population represented by sample 3.
c. HA: μ1 = μ3 ≠ μ2: Samples 1 and 3 represent a population with a mean value different from the population represented by sample 2.
d. HA: μ1 ≠ μ2 ≠ μ3: All three samples represent populations with different means.

In the job satisfaction example, maybe two of the incentives, pay raises and end-of-year bonus, had similar effects on job satisfaction, while the third incentive, say additional vacation time, had little or no effect. Since the several possible alternative outcomes


multiply rapidly when the number of groups increases, a more general alternate hypothesis is given: either the groups involved come from populations with the same means, or at least one does not. The alternate to the null hypothesis is simply stated:

HA: not so

Measuring Data Variability in the ANOVA
There are several statistics that indicate data variability. So far in the book we have used each of the following:

• the standard deviation (s),
• the variance (s²),
• the standard error of the mean (SEM),
• the standard error of the difference (SEd),
• the range (R).

Review Question B: If there were four groups involved in a one-way ANOVA, how many possible pairs of groups are there?

Analysis of variance adds one more measure of data variability, the sum of squares (SS), which for the one-way ANOVA has three forms. There is the sum of squares total, SStot, which is all variability from all sources. The sum of squares between, SSbet, measures the effect of the IV, the "grouping variable" or the "treatment effect." The sum of squares within, SSwith, or the SSerror, is a measure entirely of error variance.

A. The sum of squares total, SStot, is the sum of the squared differences between each score in all groups and the mean of all data.

Formula 5.1    SStot = Σ(x − MG)²

Where
x = each score in all groups
MG = the mean of all data, the "grand" mean

To calculate SStot,
1. Sum all scores from all groups and divide by the number of scores to determine the grand mean, MG.
2. Subtract MG from each score (x) in each group, and then square the difference: (x − MG)².
3. Sum all of the squared differences: Σ(x − MG)².

B. The sum of squares between, SSbet, is the sum of the squared differences between the means of the groups and the mean of all the data, times the number in each group.

Formula 5.2    SSbet = (Ma − MG)²na + (Mb − MG)²nb + (Mc − MG)²nc


Where
Ma = the mean of the scores in the first group, "a"
MG = the same grand mean used in SStot
na = the number of scores in the first group, "a"

To calculate SSbet,
1. Determine the mean for each group, Ma, Mb, and so on.
2. Subtract MG from each sample mean and square the difference, (Ma − MG)².
3. Multiply the squared difference by the number in the group, (Ma − MG)²na.
4. Repeat for each group.
5. Sum the results across groups.

Key Terms: Data variability in ANOVA is measured by sum of squares. Total sum of squares indicates all data variability. Sum of squares between includes the variance related to the IV. Sum of squares within is error variance.

C. The sum of squares within, SSwith, is the sum of the squared differences between individuals in the groups and the particular group mean.

Formula 5.3    SSwith = Σ(xa − Ma)² + Σ(xb − Mb)² + Σ(xc − Mc)²

Where
SSwith = the sum of squares within
xa refers to each of the individual scores in group "a"
Ma = the score mean in group "a"

To calculate SSwith:
1. From each score in each group:
   a. Subtract the mean of the group.
   b. Square the difference.
   c. Sum the squared differences within each group.
2. Repeat this for each group.
3. Sum the results across the groups.

SStot consists of the SSbet and the SSwith, so it follows that:

Formula 5.4    SStot = SSbet + SSwith


This means that once two of the three have been determined, the third can be calculated by subtraction. For example, SStot − SSbet = SSwith. Although we will determine the SSwith this way in an example below, it is a good idea to be cautious in following this approach, because if there is an error in calculating either SStot or SSbet, it is perpetuated by using subtraction to determine SSwith.

Calculating Sums of Squares
Suppose that the service manager at a local auto dealership would like to find out the particular price for oil changes that will bring in the most customers. Coupons are offered in the local newspaper for $30 oil changes in March, $25 oil changes in April, and $20 oil changes in May. In this example, the monetary value of the coupon is the IV, and the number of oil changes bought is the DV. The question is whether price is related to the number of oil changes. The number of oil changes bought on four successive Fridays in each of the three months is as follows:

March: 3, 4, 4, 3
April: 6, 6, 7, 8
May: 6, 7, 7, 9

Recall that both SStot and SSbet require the mean values. For MG, verify that for all three groups, Σx = 70 and N = 12, so MG = 5.833.

For March, Σxa = 14, na = 4, so Ma = 3.50
For April, Σxb = 27, nb = 4, so Mb = 6.750
For May, Σxc = 29, nc = 4, so Mc = 7.250

The calculations for sum of squares total and sum of squares within are fairly extensive and are in Tables 5.1 and 5.2, respectively. Those for sum of squares between are briefer and are presented in text. Verify that

SStot = 41.668


Table 5.1: Calculating the sum of squares total, SStot

SStot = Σ(x − MG)²    MG = 5.833

For March ($30 coupon):
x − MG: 3 − 5.833 = −2.833; 4 − 5.833 = −1.833; 4 − 5.833 = −1.833; 3 − 5.833 = −2.833
(x − MG)²: 8.026, 3.360, 3.360, 8.026

For April ($25 coupon):
x − MG: 6 − 5.833 = .167; 6 − 5.833 = .167; 7 − 5.833 = 1.167; 8 − 5.833 = 2.167
(x − MG)²: .028, .028, 1.362, 4.696

For May ($20 coupon):
x − MG: 6 − 5.833 = .167; 7 − 5.833 = 1.167; 7 − 5.833 = 1.167; 9 − 5.833 = 3.167
(x − MG)²: .028, 1.362, 1.362, 10.030

SStot = 41.668

For SSbet,

SSbet = (Ma − MG)²na + (Mb − MG)²nb + (Mc − MG)²nc
      = (3.5 − 5.833)²(4) + (6.75 − 5.833)²(4) + (7.25 − 5.833)²(4)
      = 21.772 + 3.364 + 8.032
      = 33.168

For the error term, sum of squares within, verify that

SSwith = 8.504


Table 5.2: Calculating the sum of squares within, SSwith

SSwith = Σ(xa − Ma)² + Σ(xb − Mb)² + Σ(xc − Mc)²

March: 3, 4, 4, 3    Ma = 3.50
April: 6, 6, 7, 8    Mb = 6.750
May: 6, 7, 7, 9      Mc = 7.250

For March ($30 coupon):
x − Ma: 3 − 3.50 = −.50; 4 − 3.50 = .50; 4 − 3.50 = .50; 3 − 3.50 = −.50
(x − Ma)²: .250, .250, .250, .250

For April ($25 coupon):
x − Mb: 6 − 6.750 = −.750; 6 − 6.750 = −.750; 7 − 6.750 = .250; 8 − 6.750 = 1.250
(x − Mb)²: .563, .563, .063, 1.563

For May ($20 coupon):
x − Mc: 6 − 7.250 = −1.250; 7 − 7.250 = −.250; 7 − 7.250 = −.250; 9 − 7.250 = 1.750
(x − Mc)²: 1.563, .063, .063, 3.063

SSwith = 8.504

Since SStot = SSbet + SSwith, you can now check your results for accuracy. For the oil-change example we have,

8.504 + 33.168 = 41.672


In the initial calculation, SStot = 41.668. The difference of .004 is due to number rounding and is relatively unimportant. The sums of squares values can never be negative, which should make sense since there's no such thing as negative variability. Because they are the sums of squared differences, all SS values must be positive. The smallest value for a sum of squares is zero, which occurs when all the scores in the calculation have the same value. Squaring the differences between individual scores and group means is not a procedure unique to ANOVA. Recall when the standard deviation was calculated back in Chapter 1. At the heart of the standard deviation calculation are those repetitive x − M differences for each score in the sample, which were then squared and summed. In addition, the denominator in the standard deviation calculation was n − 1, which should look suspiciously like a degrees of freedom value.

Review Question C: In the independent samples t-test, the measure of within-group variability is the standard error of the difference. What's the equivalent for ANOVA?

Interpreting the Sums of Squares
The different sums of squares values are measures of data variability, and in that regard they are like the standard deviation and other measures of data variability from earlier chapters. But there is also an important difference between SS and the other statistics. Although they measure data variability, the SS values also reflect the number of scores involved in the calculation, n. Because sums of squares are in fact the sum of squared values, the more values there are, the larger the SS value becomes. With the standard deviation, often the opposite occurs. Because the majority of scores in most distributions are near the mean, adding values often shrinks the value of the standard deviation. This cannot happen with the sum of squares. An additional score, whatever its value, always increases SS values. This characteristic makes the sum of squares difficult to interpret. A large SS value can indicate that individual scores tend to be highly variable, or that there are many scores in the set, or both. Fisher's answer to this was to transform each sum of squares value into a mean measure of variability by dividing each SS by its own particular degrees of freedom; SS ÷ df creates the mean square (MS).

Key Terms: The mean square provides a mean measure of data variability. It's determined by dividing the respective sum of squares value by its degrees of freedom.

The df for the one-way ANOVA are as follows:

• dftot = N − 1, where N is the number of subjects in all groups
• dfbet = k − 1, where k is the number of groups
• dfwith, or dferror = N − k

Just as SSbet + SSwith = SStot, the sum of dfbet and dfwith will equal dftot. Although there is a MS value associated with both the SSbet and the SSwith (SSerror) in the one-way ANOVA, there is no mean square total calculated. A mean level of overall variability would be of no value when answering questions about the ratio of between-groups to within-groups variability.


The F Ratio
The mean squares for between and within make up the F ratio, the test statistic in ANOVA.

Formula 5.5    F = MSbet/MSwith

Dividing the MSbet by the MSwith to determine F makes it clear that the test statistic is based on how the IV, which is the grouping variable or the treatment effect (in the MSbet), compares to error (MSwith). This comparison is illustrated in Figure 5.3, where the between-groups variance is illustrated by comparing the distance from the mean of the first distribution to the mean of the second distribution, the "A" variance, to the variances within the groups, the "B" and "C" variances.

Figure 5.3: The F ratio: comparing variance between groups (A) to variance within (B + C)

To be statistically significant, the MSbet/MSwith ratio must be greater than 1.0: the between-groups variance must be greater than the within-groups variance. How much greater is indicated by a critical value discussed below.


The ANOVA Table
Using the oil-change example, the sums of squares and the degrees of freedom are as follows:

dftot = N − 1. Since N = 12, dftot = 11.
dfbet = k − 1. Since k = 3 groups (or treatments), dfbet = 2.
dfwith = N − k. With N = 12 and k = 3, dfwith = 9.

The MS values, which are the SS for between and within divided by their df, are as follows:

MSbet = SSbet/dfbet = 33.168/2 = 16.584
MSwith = SSwith/dfwith = 8.504/9 = .945

The F value, which is MSbet/MSwith, is 16.584/.945 = 17.549.

The ANOVA results are often presented in a table such as the one below:

Source     SS       df    MS       F
Total      41.672   11
Between    33.168    2    16.584   17.549
Within      8.504    9      .945
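Assembling the table is mechanical once the sums of squares are in hand. A short sketch (not part of the original text; it uses the unrounded SS values, so F comes out near 17.56 rather than the 17.549 produced by the rounded table entries):

```python
# Degrees of freedom, mean squares, and F for the one-way ANOVA
# on the oil-change data, using unrounded sums of squares.
N, k = 12, 3                      # total scores, number of groups
ss_bet, ss_with = 33.1666667, 8.5

df_bet = k - 1                    # 2
df_with = N - k                   # 9
ms_bet = ss_bet / df_bet          # about 16.583
ms_with = ss_with / df_with       # about 0.944
f_ratio = ms_bet / ms_with

print(round(f_ratio, 2))          # about 17.56
```

The small gap between 17.56 and the chapter's 17.549 is, again, purely an artifact of rounding the SS values to three decimals.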

Interpreting the F Ratio
The larger F is, the more likely it is to be statistically significant, but how large is large enough? Here F = 17.549, which means that MSbet is 17.549 times the size of MSwith. To determine statistical significance, the calculated F is compared to the values in the Critical Values of F table (Table 5.3), where the values are indexed to the degrees of freedom for the problem.


Table 5.3: The critical values of F

In each cell, the first value is the critical value for p = .05 and the second (boldface in the original) is the critical value for p = .01.

df          df numerator
denom.      1            2           3           4           5           6           7           8           9           10
 2     18.51/98.49  19.00/99.01  19.16/99.17  19.25/99.25  19.30/99.30  19.33/99.33  19.35/99.36  19.37/99.38  19.38/99.39  19.40/99.40
 3     10.13/34.12   9.55/30.82   9.28/29.46   9.12/28.71   9.01/28.24   8.94/27.91   8.89/27.67   8.85/27.49   8.81/27.34   8.79/27.23
 4      7.71/21.20   6.94/18.00   6.59/16.69   6.39/15.98   6.26/15.52   6.16/15.21   6.09/14.98   6.04/14.80   6.00/14.66   5.96/14.55
 5      6.61/16.26   5.79/13.27   5.41/12.06   5.19/11.39   5.05/10.97   4.95/10.67   4.88/10.46   4.82/10.29   4.77/10.16   4.74/10.05
 6      5.99/13.75   5.14/10.92   4.76/9.78    4.53/9.15    4.39/8.75    4.28/8.47    4.21/8.26    4.15/8.10    4.10/7.98    4.06/7.87
 7      5.59/12.25   4.74/9.55    4.35/8.45    4.12/7.85    3.97/7.46    3.87/7.19    3.79/6.99    3.73/6.84    3.68/6.72    3.64/6.62
 8      5.32/11.26   4.46/8.65    4.07/7.59    3.84/7.01    3.69/6.63    3.58/6.37    3.50/6.18    3.44/6.03    3.39/5.91    3.35/5.81
 9      5.12/10.56   4.26/8.02    3.86/6.99    3.63/6.42    3.48/6.06    3.37/5.80    3.29/5.61    3.23/5.47    3.18/5.35    3.14/5.26
10      4.96/10.04   4.10/7.56    3.71/6.55    3.48/5.99    3.33/5.64    3.22/5.39    3.14/5.20    3.07/5.06    3.02/4.94    2.98/4.85
11      4.84/9.65    3.98/7.21    3.59/6.22    3.36/5.67    3.20/5.32    3.09/5.07    3.01/4.89    2.95/4.74    2.90/4.63    2.85/4.54
12      4.75/9.33    3.89/6.93    3.49/5.95    3.26/5.41    3.11/5.06    3.00/4.82    2.91/4.64    2.85/4.50    2.80/4.39    2.75/4.30
13      4.67/9.07    3.81/6.70    3.41/5.74    3.18/5.21    3.03/4.86    2.92/4.62    2.83/4.44    2.77/4.30    2.71/4.19    2.67/4.10
14      4.60/8.86    3.74/6.51    3.34/5.56    3.11/5.04    2.96/4.69    2.85/4.46    2.76/4.28    2.70/4.14    2.65/4.03    2.60/3.94
15      4.54/8.68    3.68/6.36    3.29/5.42    3.06/4.89    2.90/4.56    2.79/4.32    2.71/4.14    2.64/4.00    2.59/3.89    2.54/3.80
16      4.49/8.53    3.63/6.23    3.24/5.29    3.01/4.77    2.85/4.44    2.74/4.20    2.66/4.03    2.59/3.89    2.54/3.78    2.49/3.69
17      4.45/8.40    3.59/6.11    3.20/5.19    2.96/4.67    2.81/4.34    2.70/4.10    2.61/3.93    2.55/3.79    2.49/3.68    2.45/3.59
18      4.41/8.29    3.55/6.01    3.16/5.09    2.93/4.58    2.77/4.25    2.66/4.01    2.58/3.84    2.51/3.71    2.46/3.60    2.41/3.51

(continued)


Table 5.3: The critical values of F (continued)

In each cell, the first value is the critical value for p = .05 and the second (boldface in the original) is the critical value for p = .01.

df          df numerator
denom.      1           2           3           4           5           6           7           8           9           10
19      4.38/8.18   3.52/5.93   3.13/5.01   2.90/4.50   2.74/4.17   2.63/3.94   2.54/3.77   2.48/3.63   2.42/3.52   2.38/3.43
20      4.35/8.10   3.49/5.85   3.10/4.94   2.87/4.43   2.71/4.10   2.60/3.87   2.51/3.70   2.45/3.56   2.39/3.46   2.35/3.37
21      4.32/8.02   3.47/5.78   3.07/4.87   2.84/4.37   2.68/4.04   2.57/3.81   2.49/3.64   2.42/3.51   2.37/3.40   2.32/3.31
22      4.30/7.95   3.44/5.72   3.05/4.82   2.82/4.31   2.66/3.99   2.55/3.76   2.46/3.59   2.40/3.45   2.34/3.35   2.30/3.26
23      4.28/7.88   3.42/5.66   3.03/4.76   2.80/4.26   2.64/3.94   2.53/3.71   2.44/3.54   2.37/3.41   2.32/3.30   2.27/3.21
24      4.26/7.82   3.40/5.61   3.01/4.72   2.78/4.22   2.62/3.90   2.51/3.67   2.42/3.50   2.36/3.36   2.30/3.26   2.25/3.17
25      4.24/7.77   3.39/5.57   2.99/4.68   2.76/4.18   2.60/3.85   2.49/3.63   2.40/3.46   2.34/3.32   2.28/3.22   2.24/3.13
26      4.23/7.72   3.37/5.53   2.98/4.64   2.74/4.14   2.59/3.82   2.47/3.59   2.39/3.42   2.32/3.29   2.27/3.18   2.22/3.09
27      4.21/7.68   3.35/5.49   2.96/4.60   2.73/4.11   2.57/3.78   2.46/3.56   2.37/3.39   2.31/3.26   2.25/3.15   2.20/3.06
28      4.20/7.64   3.34/5.45   2.95/4.57   2.71/4.07   2.56/3.75   2.45/3.53   2.36/3.36   2.29/3.23   2.24/3.12   2.19/3.03
29      4.18/7.60   3.33/5.42   2.93/4.54   2.70/4.04   2.55/3.73   2.43/3.50   2.35/3.33   2.28/3.20   2.22/3.09   2.18/3.00
30      4.17/7.56   3.32/5.39   2.92/4.51   2.69/4.02   2.53/3.70   2.42/3.47   2.33/3.30   2.27/3.17   2.21/3.07   2.16/2.98

Source: Critical Values of F. (2011). Retrieved from http://faculty.vassar.edu/lowry/apx_d.html

As with the t-test, as degrees of freedom increase, the critical values decline. Unlike t, F has two df values: one for the MSbet, the other for the MSwith. Using them together identifies the critical value from the table.

• In Table 5.3, the critical value is identified by first moving across the top of the table to the value of dfbet, since dfbet is the df for the numerator of the F ratio. In this example, dfbet = 2.


• The second step is moving down the left side of the table to the value of dfwith, the df for the denominator of the F ratio. In this example, dfwith = 9.
• The intersection of the 2 across the top and the 9 along the left side of the table leads to two critical values: one in plain print, which is for p = .05, and one in boldface, which is the value for testing at p = .01.
• The critical value when testing at p = .05 for 2 and 9 degrees of freedom is 4.26.

The fact that the calculated value for F is larger than the table value indicates that the difference in the number of oil changes is probably related to the price charged, or to say it the other way around, the differences are probably not just an artifact of sampling variability. The service manager can reject H0.
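The comparison to the critical value amounts to a one-line decision rule. In this sketch (not part of the original text), 4.26 is simply the tabled F.05(2, 9) value quoted above:

```python
f_calculated = 17.549   # from the ANOVA table
f_critical = 4.26       # Table 5.3, df = 2 and 9, p = .05

# Reject the null hypothesis when the calculated F meets or
# exceeds the critical value for the chosen alpha level.
reject_null = f_calculated >= f_critical
print(reject_null)      # True
```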

Locating Significant Differences
When there were only two groups involved in an independent samples t-test, it was relatively easy to interpret a significant t. It indicates that the two groups represent populations with different means. A significant F in an ANOVA with more than two groups is not so straightforward. It indicates that at least one group is significantly different from at least one other group in the study, but unless there are only two groups in the ANOVA, it is not clear which group is significantly different from which. If the null hypothesis is rejected, there are a number of possibilities, as we noted earlier.

To further pinpoint which pairs of groups are significantly different from each other, post hoc tests (a Latin expression that means "after this") are conducted following a significant F. There are many post hoc tests, each with particular strengths, but one of the more common, and one of the easier to calculate, is called Tukey's HSD (for "honestly significant difference"). Tukey's formula (5.6) produces a value that represents the smallest difference between the means of any two samples in a significant ANOVA that can be statistically significant:

Formula 5.6    HSD = x√(MSwith/n)

Where
x = a table value determined by the number of groups (k) in the problem and the degrees of freedom within (dfwith)
MSwith = the value from the ANOVA table
n = the number in one group when group sizes are equal. (When group sizes are unequal, a different HSD value is calculated for each pair of groups.)

Key Terms: The post hoc test provides a way to determine which group is significantly different from which when there are more than two groups in a test.


Table 5.4: Critical values for Tukey's HSD

In each cell, the first value is the critical value for p = .05 and the second (boldface in the original) is the critical value for p = .01.

df for      k = Number of Treatments
error       2          3          4          5          6          7          8          9          10
 5     3.64/5.70  4.60/6.98  5.22/7.80  5.67/8.42  6.03/8.91  6.33/9.32  6.58/9.67  6.80/9.97  6.99/10.24
 6     3.46/5.24  4.34/6.33  4.90/7.03  5.30/7.56  5.63/7.97  5.90/8.32  6.12/8.61  6.32/8.87  6.49/9.10
 7     3.34/4.95  4.16/5.92  4.68/6.54  5.06/7.01  5.36/7.37  5.61/7.68  5.82/7.94  6.00/8.17  6.16/8.37
 8     3.26/4.75  4.04/5.64  4.53/6.20  4.89/6.62  5.17/6.96  5.40/7.24  5.60/7.47  5.77/7.68  5.92/7.86
 9     3.20/4.60  3.95/5.43  4.41/5.96  4.76/6.35  5.02/6.66  5.24/6.91  5.43/7.13  5.59/7.33  5.74/7.49
10     3.15/4.48  3.88/5.27  4.33/5.77  4.65/6.14  4.91/6.43  5.12/6.67  5.30/6.87  5.46/7.05  5.60/7.21
11     3.11/4.39  3.82/5.15  4.26/5.62  4.57/5.97  4.82/6.25  5.03/6.48  5.20/6.67  5.35/6.84  5.49/6.99
12     3.08/4.32  3.77/5.05  4.20/5.50  4.51/5.84  4.75/6.10  4.95/6.32  5.12/6.51  5.27/6.67  5.39/6.81
13     3.06/4.26  3.73/4.96  4.15/5.40  4.45/5.73  4.69/5.98  4.88/6.19  5.05/6.37  5.19/6.53  5.32/6.67
14     3.03/4.21  3.70/4.89  4.11/5.32  4.41/5.63  4.64/5.88  4.83/6.08  4.99/6.26  5.13/6.41  5.25/6.54
15     3.01/4.17  3.67/4.84  4.08/5.25  4.37/5.56  4.59/5.80  4.78/5.99  4.94/6.16  5.08/6.31  5.20/6.44
16     3.00/4.13  3.65/4.79  4.05/5.19  4.33/5.49  4.56/5.72  4.74/5.92  4.90/6.08  5.03/6.22  5.15/6.35
17     2.98/4.10  3.63/4.74  4.02/5.14  4.30/5.43  4.52/5.66  4.70/5.85  4.86/6.01  4.99/6.15  5.11/6.27
18     2.97/4.07  3.61/4.70  4.00/5.09  4.28/5.38  4.49/5.60  4.67/5.79  4.82/5.94  4.96/6.08  5.07/6.20
19     2.96/4.05  3.59/4.67  3.98/5.05  4.25/5.33  4.47/5.55  4.65/5.73  4.79/5.89  4.92/6.02  5.04/6.14
20     2.95/4.02  3.58/4.64  3.96/5.02  4.23/5.29  4.45/5.51  4.62/5.69  4.77/5.84  4.90/5.97  5.01/6.09
24     2.92/3.96  3.53/4.55  3.90/4.91  4.17/5.17  4.37/5.37  4.54/5.54  4.68/5.69  4.81/5.81  4.92/5.92
30     2.89/3.89  3.49/4.45  3.85/4.80  4.10/5.05  4.30/5.24  4.46/5.40  4.60/5.54  4.72/5.65  4.82/5.76
40     2.86/3.82  3.44/4.37  3.79/4.70  4.04/4.93  4.23/5.11  4.39/5.26  4.52/5.39  4.63/5.50  4.73/5.60

Source: Tukey's HSD Critical Values. (2011). Retrieved from http://www.stat.duke.edu/courses/Spring98/sta110c/qtable.html


Computing HSD:
1. From Table 5.4, locate the value of x by moving across the top of the table to the number of groups (k = 3, not k − 1, which was the dfbet), and then down the left side to the within degrees of freedom (dfwith = 9). The intersecting values are 3.95 and 5.43. The smaller of the two is the value for p = .05.
2. The calculation is 3.95 times the square root of .945 (the MSwith) divided by 4, the number in any one group (n):

HSD = 3.95√(.945/4) = 1.920

This value indicates the minimum difference there can be between the means of any two groups for the groups to be significantly different. The sign of the difference does not matter; it is the absolute value we need. The mean numbers of oil changes in each of the three months were as follows:
• When oil changes cost $30, the mean number of changes was Ma = 3.50.
• When oil changes cost $25, the mean number of changes was Mb = 6.750.
• When oil changes cost $20, the mean number of changes was Mc = 7.250.
Now, comparing each pair of groups:
The first month minus the second month:
• Ma − Mb = 3.5 − 6.75 = −3.25. This difference exceeds the HSD value of 1.92 and is significant. Changing prices from $30 to $25 is associated with a significant increase in the number of oil changes.
The first month minus the third month:
• Ma − Mc = 3.5 − 7.25 = −3.75. This difference exceeds 1.92 and is significant. Changing prices from $30 to $20 is associated with a significant increase in the number of oil changes.
The second month minus the third month:
• Mb − Mc = 6.75 − 7.25 = −.50. This difference is less than 1.92 and is not significant. Changing prices from $25 to $20 is not associated with a significant increase in the number of oil changes.
When there are several groups involved, it is more helpful to create a table that indicates the differences between all possible pairs of means, as Table 5.5 does. Using a matrix such as the one below to summarize the difference between each pair of means makes it easier to interpret the HSD value. The shaded cells are blank because they either represent a difference between a group mean and itself (which would be zero) or a redundant difference that has already been presented on the opposite side of the matrix's diagonal. For example, the absolute difference between March and April is the same as the absolute difference between April and March.


Table 5.5: Presenting Tukey's HSD results in a table

HSD = x√(MSwith/n) = 3.95 × √(.945/4) = 1.920

Any difference between pairs of means 1.920 or greater is a statistically significant difference. Differences marked * are statistically significant (shown in red in the original table); the remaining cells are blank.

                          March ($30)     April ($25)     May ($20)
                          M = 3.50        M = 6.750       M = 7.250
March ($30), M = 3.50
April ($25), M = 6.750    Diff = 3.250*
May ($20), M = 7.250      Diff = 3.750*   Diff = .50
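The HSD value and the three pairwise comparisons above can be checked with a short script. This is a minimal Python sketch, not part of the chapter; the inputs (x = 3.95, MSwith = .945, n = 4, and the three group means) are all taken from the text.

```python
import math
from itertools import combinations

q = 3.95            # Table 5.4 value for k = 3 groups, dfwith = 9, p = .05
ms_within = 0.945   # MSwith from the oil-change ANOVA table
n = 4               # number of observations in any one group

# Minimum mean difference required for significance
hsd = q * math.sqrt(ms_within / n)   # about 1.920

means = {"March ($30)": 3.50, "April ($25)": 6.75, "May ($20)": 7.25}
for (g1, m1), (g2, m2) in combinations(means.items(), 2):
    verdict = "significant" if abs(m1 - m2) >= hsd else "not significant"
    print(f"{g1} vs. {g2}: diff = {m1 - m2:.3f} ({verdict})")
```

As in the text, the first two comparisons clear the 1.920 threshold and the third does not.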

Determining Practical Importance
When F is significant, there are two additional questions to address. One we just took up when we used the post hoc test to determine which groups are significantly different from which other groups. In the oil change example above, the results indicate to the service manager that lowering the price from $30 to $25 or to $20 made a significant difference, but lowering the price from $25 to $20 did not. In this case, $25 seems to be the optimal price point. Note that in the calculations above the mean differences were negative, indicating that the number of oil changes sold was indeed lower when the price was higher (Ma < Mb < Mc), which is consistent with what was expected. However, based on the results above, the mean difference of an additional half an oil change sold is not large enough to justify lowering the price from $25 to $20, at least from a statistical standpoint. From an economic standpoint, the manager may think that a $5 sacrifice is worthwhile if it will bring in at least an additional half an oil change. However, the statistically savvy manager would understand that this additional half an oil change could have occurred by chance, since the result was not statistically significant. An accounting-savvy manager would also take into consideration the impact of lowering the price to $20 on revenues, as well as the costs of and profit margin on oil changes, before making a final business decision. The other question in an analysis of variance problem is about the importance of the significant result. Recall that a statistically significant outcome is one that probably did not occur by chance, but because it is statistically significant does not mean that it is necessarily
important. For t-tests, Cohen's d was the statistic we used to determine the importance of an outcome. The effect size we will use for analysis of variance is called eta-squared (η²). Like Cohen's d, it addresses the issue of practical importance by answering the question: How much of the difference between the groups can be attributed to the independent variable?

Key Terms: Eta-squared indicates the proportion of the difference between groups' scores that can be explained by the independent variable.

In the oil change example, the analysis was of whether the number of oil changes sold is affected by the price of the oil change. The effect size statistic will indicate how much of that increase in sales is due to the price reduction and, by inference, how much is due to other factors. The formula for eta-squared is as follows:

Formula 5.7

η² = SSbet/SStot

Note that it involves values that are already available from the ANOVA table. The eta-squared statistic is a very straightforward ratio of between-groups variability (SSbet) to total variability (SStot). For the oil changes problem, SSbet = 33.168 and SStot = 41.672.

η² = SSbet/SStot = 33.168/41.672 = .796

The value indicates that about 80% of the difference between the number of oil changes at various price points can be attributed to the price reduction alone. The balance is due to other factors not accounted for in the analysis. They might include such variables as the weather (maybe the weather was adverse during March, and that affected sales), special sales on other products or services that might have brought in additional customers, and so on. Among measures of effect size, eta-squared is one of the more liberal. It is specific to the particular sample and should not be used to predict what the effect size might be for some new analysis. If there were no error variance, all the differences in the scores would be attributable to the independent variable, and the sums of squares for between and for total would have the same values. In that instance, the effect size would be 1.0. With human subjects there is always error variance; scores fluctuate for reasons other than the IV, but it is important to know that 1.0 is the "upper bound" for this effect size. The lower bound is 0, of course: none of the variance is explained. But an η² = 0 is also unlikely, since the only time the effect size is calculated is when F is significant, and that can only happen when the effect of the IV is great enough that the ratio of MSbet to MSwith exceeds the critical value.
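Because η² is just a ratio of two values already on the ANOVA table, the calculation is a one-liner. A quick Python check, using the SSbet and SStot values from the text:

```python
# Eta-squared for the oil-change ANOVA (values taken from the ANOVA table)
ss_between = 33.168
ss_total = 41.672

eta_squared = ss_between / ss_total
print(round(eta_squared, 3))   # about .796: roughly 80% of the variability
                               # is attributable to the price change
```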


5.3 Requirements for the One-Way ANOVA

Any statistical test is based on a number of assumptions related to the nature of the data. In the case of the one-way ANOVA, the "one" indicates an important condition.
• This particular test can accommodate just one independent variable.

That one variable can have any number of categories, but there can be just one IV. In the oil change example, the IV was the price of the oil change. The test can accommodate any number of prices over time and test their effects on the number of oil changes sold, but it cannot factor in a second IV such as the day of the week. In that regard, it is like the independent samples t-test, which also accommodates just one IV; in the case of the independent samples t-test, however, the IV is limited to precisely two categories.
• The categories of the IV must be independent.
Also like the independent samples t-test, the subjects in the groups must be separate from each other. The groups cannot include the same subjects measured multiple times, although there is a variation of ANOVA that will accommodate repeated measures; this will be discussed in Chapter 7.
• The IV must be measured on a nominal scale.
The IV in ANOVA must be treated as a categorical variable, since the analysis is whether there are significant differences in the value of the dependent variable (number of oil changes in our example) for the different categories. Strictly speaking, the categories of the independent variable can involve data of any scale. In the oil change example, the categories were defined by the price of the oil change, which is a ratio-scale, continuous variable. But the independent variable must be treated as though it were categorical. It would have been impractical for the service manager to run promotions at each possible price point to determine the exact value that would bring in the highest number of customers. Instead, three "treatments" were selected, which in this case were the three specific price points of $20, $25, and $30. For this reason, ANOVA was the right choice for this analysis. ANOVA is applicable any time the data can be classified and grouped based on the IV. Had the manager varied the price more incrementally over a long period of time and observed changes in sales, another type of analysis (e.g., correlation or regression, discussed in chapters to come) would have been a better choice for the data.
• The DV must be measured on an interval or ratio scale.
The DV in the example was the number of oil changes, a ratio-scale variable.
• The groups in the analysis must be similarly distributed.
The technical description for this is that there must be homogeneity of variance. It means, for example, that the groups all have reasonably similar standard deviations.
• Finally, using ANOVA assumes that the samples are drawn from a normally distributed population.


Key Terms: Homogeneity of variance is a condition for ANOVA. It indicates that measures for all groups in the analysis are distributed similarly.

Although the above are requirements, Fisher’s procedure can tolerate a certain amount of deviation from these requirements. In the cryptic language of statistics, the test is quite “robust,” particularly where minor variations from data normality and homogeneity of variance are concerned.

Comparing ANOVA and the Independent t
Checking the assumptions associated with the independent samples t-test in Chapter 4 indicates that Gosset's test and Fisher's ANOVA share several assumptions. Although they employ distinct statistics (the sum of squares within instead of the standard error of the difference, for example), the one-way ANOVA truly is an extension of the t-test. This can be illustrated by completing both the ANOVA and the independent t-test for the same data.

Review Question D: How are the mean square values calculated in an ANOVA?

Suppose that the human resources manager of an organization would like to assess the differences in work-life balance of the employees in the marketing department versus those in the production department. The dependent variable he selects is the amount of work completed after hours at home per week. In this example, the IV, or grouping variable, is department (marketing versus production). The data are as follows:
Marketing: 3, 4, 5, 7, 7, 9, 11, 12
Production: 0, 1, 3, 3, 4, 5, 7, 7
Calculating some of the basic statistics yields the following:

              M       s       SEM     SEd     MG
Marketing:    7.25    3.240   1.146   1.458   5.50
Production:   3.75    2.550   .901

First the t-test:

t = (M1 − M2)/SEd = (7.25 − 3.75)/1.458 = 2.401; t.05(14) = 2.145


The difference is significant. Those in marketing (M1) take significantly more work home than those in production (M2). The human resources manager can conclude that employees in the marketing department are more likely to experience work-life conflict. Now the ANOVA:

SStot = Σ(x − MG)² = 168
• Verify that the result of subtracting MG from each score in both groups, squaring the differences, and summing the squares = 168.
SSbet = (Ma − MG)²na + (Mb − MG)²nb
• This one isn't too lengthy to do here: (7.25 − 5.50)²(8) + (3.75 − 5.50)²(8) = 24.5 + 24.5 = 49
SSwith = Σ(xa − Ma)² + Σ(xb − Mb)²
• Verify that the result of subtracting the group means from each score in the particular group, squaring the differences, and summing the squares = 119.
• Check that SSwith + SSbet = SStot: 119 + 49 = 168.

Source     SS     df    MS     F
Total      168    15
Between    49     1     49     5.765; F.05(1,14) = 4.60
Within     119    14    8.5

Like the t-test, ANOVA indicates that the difference in the amount of work completed at home is significantly different for the two groups. Both tests drew the same conclusion about whether the result is significant, but the kinship between the two procedures involves more than coming to the same conclusion.
• Note that the calculated value of t = 2.401, and the calculated value of F = 5.765.
• If the value of t is squared, it equals the value of F: 2.401² = 5.765.
• The same is true for the critical values:
a. t.05(14) = 2.145
b. F.05(1,14) = 4.60
c. 2.145² = 4.60
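The t² = F relationship can be demonstrated directly. The following is a minimal Python sketch, not part of the chapter, that computes both statistics from the work-life balance data above and confirms they agree:

```python
import math

marketing  = [3, 4, 5, 7, 7, 9, 11, 12]
production = [0, 1, 3, 3, 4, 5, 7, 7]

def mean(xs):
    return sum(xs) / len(xs)

def var(xs):   # sample variance (n - 1 in the denominator)
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

m1, m2 = mean(marketing), mean(production)
grand = mean(marketing + production)

# Independent-samples t, using the standard error of the difference
se_d = math.sqrt(var(marketing) / len(marketing)
                 + var(production) / len(production))
t = (m1 - m2) / se_d                                   # about 2.401

# One-way ANOVA on the same data
ss_bet = (len(marketing) * (m1 - grand) ** 2
          + len(production) * (m2 - grand) ** 2)       # 49
ss_with = (sum((x - m1) ** 2 for x in marketing)
           + sum((x - m2) ** 2 for x in production))   # 119
F = (ss_bet / 1) / (ss_with / 14)                      # about 5.765

print(round(t ** 2, 3) == round(F, 3))                 # t squared equals F
```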


When there are two groups, comparing t-test results to the one-way ANOVA makes it clear that the two tests are equivalent. There is more calculation in the ANOVA, so the tendency is to use the t-test for two groups, but the point is that the two tests are consistent.

One-Way ANOVA on Excel
Excel's Analysis ToolPak includes an ANOVA procedure for PC users. To illustrate its use, suppose that a home builder is approached by a customer who wants to move in as soon as possible. The customer chooses three home designs that she likes and asks the home builder: "These three designs take approximately the same time to build, right?" To compare the three designs on speed of completion, the builder randomly selects 10 homes that he built in the past based on each design. The data for the number of days to build each home are as follows:
Design A: 33, 35, 38, 39, 42, 44, 44, 47, 50, 52
Design B: 27, 36, 37, 37, 39, 39, 41, 42, 45, 46
Design C: 22, 24, 25, 27, 28, 28, 29, 31, 33, 34
1. First create the data file in Excel. Enter the names of the designs in cells A1, B1, and C1.
2. In the columns below those labels, enter the number of days, beginning in cell A2 for Design A, B2 for Design B, and C2 for Design C. Once the data are entered and checked for accuracy, continue with the following steps.
3. Click the Data tab at the top of the page.
4. At the extreme right, choose Data Analysis.
5. In the Analysis Tools window, select ANOVA: Single Factor and click OK.
6. Indicate where the data are located in the Input Range. In the example here, the range is A2:C11.
7. Note that the default is Grouped by Columns. If the data are arrayed along rows instead of columns, this would be changed. Because we designated A2 instead of A1 as the point where the data begin, there is no need to indicate that labels are in the first row.
8. Select Output Range and enter a cell location where you wish the display of the output to begin, for example, A13.
9. Click OK.

Review Question E: For a two-group test of significant differences, how will the t value compare to the F value?


If column A is widened to make it easier to read the output, and the decimal values are set to 3, the result is the screenshot in Figure 5.4.

Figure 5.4: ANOVA on Excel

Below the data set, Excel produces two tables. The first provides descriptive statistics. The second table looks very much like the longhand table of results for the number of oil changes example, except that:
• The figures for total follow those for between and within instead of preceding them.
• The column titled "P-value" indicates the probability that an F of this magnitude could have occurred by chance. Before the default was changed to 3 decimals, the "P-value" was 4.32E-06. The "E-06" is scientific notation. It is a shorthand way to say that the actual value is p = .00000432: 4.32 with the decimal point moved 6 places to the left. The probability is far below the p = .05 standard, so the result is statistically significant.
The P-value indicates the probability that one population distribution could contain all these differences in speed of construction. It is extremely unlikely, which is to say that the differences are statistically significant. At least one pair (Designs A and B, Designs A and C, or Designs B and C) involves designs with significantly different completion times. Finding out which pair(s) is/are different would require an additional post hoc test, of course, using Tukey's HSD, which can easily be calculated using the output from Figure 5.4.
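The F statistic Excel reports can also be reproduced outside Excel. A minimal Python sketch, not part of the chapter, using the three designs' data from the text:

```python
# Single-factor ANOVA for the three home designs (data from the text)
design_a = [33, 35, 38, 39, 42, 44, 44, 47, 50, 52]
design_b = [27, 36, 37, 37, 39, 39, 41, 42, 45, 46]
design_c = [22, 24, 25, 27, 28, 28, 29, 31, 33, 34]

groups = [design_a, design_b, design_c]
scores = [x for g in groups for x in g]
grand_mean = sum(scores) / len(scores)

ss_bet = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
ss_with = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)

df_bet = len(groups) - 1                 # 2
df_with = len(scores) - len(groups)      # 27
F = (ss_bet / df_bet) / (ss_with / df_with)
print(round(F, 2))   # a large F, consistent with Excel's tiny P-value
```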


5.4 Another One-Way ANOVA

To reinforce what has been learned, consider one more example. The manager of a machine tool company has three major clients. The question is whether sales to these three significantly differ over a three-month period. The sales totals in thousands of dollars are as follows for the period:
Client 1: 23.5, 14.3, 11.0, 17.0
Client 2: 36.6, 14.7, 19.0, 14.0
Client 3: 20.1, 22.7, 27.4, 16.6
The relevant means are as follows:
M1 = 16.450, M2 = 21.075, M3 = 21.700, MG = 19.742
SStot = Σ(x − MG)² = (23.5 − 19.742)² + (14.3 − 19.742)² + . . . + (16.6 − 19.742)² = 548.209

SSbet = (Ma − MG)²na + (Mb − MG)²nb + (Mc − MG)²nc = (16.450 − 19.742)²(4) + (21.075 − 19.742)²(4) + (21.700 − 19.742)²(4) = 65.792
SSwith = SStot − SSbet = 548.209 − 65.792 = 482.417. This reflects the fact that SSwith is what is left over once SSbet is removed from SStot.

Source     SS        df    MS       F
Total      548.209   11
Between    65.792    2     32.896   .614; F.05(2,9) = 4.26
Within     482.417   9     53.602

The difference between sales to the three clients is not statistically significant. There appears to be some difference when comparing the mean sales of the first client to those of the others, but there is so much within-group variance that the differences between clients must be attributed to sampling variability. The differences within the groups (MSwith) overwhelm the differences between the groups (MSbet) in the F ratio.


Chapter Summary

This chapter is the natural extension of Chapters 3 and 4. Like the z-test and the t-tests, analysis of variance (ANOVA) is a test of significant differences. With each procedure, whether z, t, or F, the test statistic is a ratio of the differences between groups to the differences within.

Like the independent samples t-test, in ANOVA the IV is nominal and the DV is interval or ratio. Both require that groups be independent, and both procedures are limited to one independent variable that defines the groups and indicates which subjects are receiving which treatment or condition. The reason for moving to ANOVA was the need to conduct comparisons of multiple groups without running multiple tests with the same data, which can increase the probability of type I error. Analysis of variance allows any number of groups to be compared at a time, with just one test. When t and F are calculated for a two-group test, both tests reach the same conclusion. More specifically, t² = F (Objectives 1, 2, and 3). Multiple groups introduced a problem not present in the t-test, however. When there are more than two groups and the F is significant, it is not apparent which group(s) is/are significantly different from which. That problem was solved by calculating a post hoc test, Tukey's HSD (Objective 4). Knowing that a result is statistically significant indicates that an outcome is probably not random. It does not establish the importance of the outcome, however. As we did with the t-test, the question of the importance of a significant outcome was addressed by calculating an effect size. In general terms, the eta-squared value answers the same question answered with Cohen's d for the t-test results: How important is the effect? The added dimension is that the statistic indicates the proportion of the variance in scores that is explained by the independent variable (Objective 5).

Answers to the Review Questions
A. With successive, significant findings with the same data, the probability of type I (alpha) error increases with each test.
B. With four groups in an analysis there are 6 possible different pairs. To determine this, take the total number of groups times the total number of groups minus one, and divide the product by 2: (4 × 3)/2 = 6.
C. The equivalent of the standard error of the difference in the t-test is the MSwith in ANOVA. Both measure within-group variance.
D. The mean square (MS) values in ANOVA are determined by dividing the SS values by their degrees of freedom.
E. t² = F


Chapter Formulas
Formula 5.1: SStot = Σ(x − MG)²
The total sum of squares; the total of all variance from all sources in an ANOVA problem.
Formula 5.2: SSbet = (Ma − MG)²na + (Mb − MG)²nb + (Mc − MG)²nc
The sum of squares between is a measure of how much particular groups differ from the mean of all the data. It measures the effect of the IV, the "grouping variable" or the "treatment effect."
Formula 5.3: SSwith = Σ(xa − Ma)² + Σ(xb − Mb)² + Σ(xc − Mc)²
The sum of squares within is a measure of how much individuals within a group differ from the mean of their sample when exposed to the same level of the IV(s). It is a measure of error variance.
Formula 5.4: SStot = SSbet + SSwith
Formula 5.5: F = MSbet/MSwith
The F statistic in ANOVA.
Formula 5.6: HSD = x√(MSwith/n)
Tukey's HSD is a post hoc test used to determine which groups in an ANOVA are significantly different from each other.
Formula 5.7: η² = SSbet/SStot
Eta-squared is an estimate of effect size. It suggests the proportion of the difference in scores between significantly different groups that can be explained by the independent variable.
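The chapter formulas fit together naturally as one routine. As a compact sketch (not part of the chapter), the following Python function implements Formulas 5.1, 5.2, 5.4, 5.5, and 5.7 for any number of groups, verified against the work-life balance example from Section 5.3:

```python
def one_way_anova(*groups):
    scores = [x for g in groups for x in g]
    mg = sum(scores) / len(scores)                        # grand mean
    ss_tot = sum((x - mg) ** 2 for x in scores)           # Formula 5.1
    ss_bet = sum(len(g) * (sum(g) / len(g) - mg) ** 2     # Formula 5.2
                 for g in groups)
    ss_with = ss_tot - ss_bet                             # Formula 5.4, rearranged
    df_bet = len(groups) - 1
    df_with = len(scores) - len(groups)
    F = (ss_bet / df_bet) / (ss_with / df_with)           # Formula 5.5
    eta_sq = ss_bet / ss_tot                              # Formula 5.7
    return F, eta_sq

# The work-life balance data from Section 5.3:
F, eta_sq = one_way_anova([3, 4, 5, 7, 7, 9, 11, 12],
                          [0, 1, 3, 3, 4, 5, 7, 7])
print(round(F, 3), round(eta_sq, 3))
```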

Management Application Exercises
1. A fleet of cabs required servicing in a particular month that took them out of service for 3.5, 3.8, 4.2, 4.5, 4.7, 5.3, 6.0, and 7.5 hours. What is the sum of squares value for these data?


2. Identify the symbol or statistic in a one-way ANOVA that does the following:
a. The statistic that indicates the mean amount of difference between groups
b. The symbol that indicates the total number of participants
c. The symbol that indicates the number of groups
d. The mean amount of uncontrolled variability
3. There is an advertised special on smart phones at two outlets at different locations. The sales data for eight successive days are as follows:
Outlet A: 13, 14, 16, 16, 17, 18, 18, 18
Outlet B: 11, 12, 12, 14, 14, 14, 14, 16
Complete the problem as an ANOVA. Is the location difference statistically significant?
4. Complete problem 3 as an independent t-test and demonstrate the relationship between t² and F.
5. If Outlet C offers a free data plan in connection with purchases of smart phones and the sales for 8 days are 14, 17, 19, 19, 21, 22, 25, and 27, how do sales from this outlet compare to those in Outlets A and B in item 3?
6. A labor specialist evaluates the number of hours 8 employees work the week before Thanksgiving in each of a bakery (M = 44.5), an electronics store (M = 36.2), and a grocery outlet (M = 40.0). If MSwith = 24.50 and dfwith = 21, which group(s) is/are significantly different from which?
7. A courier service is comparing the number of parcels delivered on each of 5 workdays in a major city one, two, and three weeks after opening its doors.
1 week: 0, 5, 7, 8, 8
2 weeks: 3, 5, 12, 16, 17
3 weeks: 11, 15, 16, 19, 22
a. Is F significant?
b. Which week(s) is/are different from which?
c. What does the effect size indicate?
8. Regarding item 7:
a. What's the IV?
b. What's the scale of the IV?
c. What's the DV?
d. What's the scale of the DV?
9. If a shift manager is comparing the number of sick days taken by people in four departments:
a. What will be the number of degrees of freedom for between?
b. If there are six people in each department, what will be the degrees of freedom for within?


10. The manager of an agency providing temporary employees to city offices is analyzing the number of days temporary hires typically work in different types of positions. The data are as follows:
Legal clerical: 2, 1, 4, 4, 2, 5, 6
Accounting firms: 3, 6, 4, 5, 5, 7, 8
Insurance: 5, 4, 7, 9, 9, 8, 11
a. Are there significant differences in the length of time temps work in the different industries?
b. How much of the difference can be explained by the industry?
c. Which groups are significantly different from which?

Key Terms
• Analysis of variance is the name given to Fisher's test that allows one to detect significant differences among any number of groups.
• Error variance refers to variability in a measure unrelated to the variables being analyzed.
• One-way ANOVA is ANOVA with one independent variable.
• Factorial ANOVA is ANOVA with more than one independent variable, that is, more than one factor.
• Sum of squares is the variance measure in analysis of variance. It is literally the sum of squared deviations between a set of scores and its mean.
• Sum of squares total is total variance from all sources.
• Sum of squares between is the variability related to the independent variable.
• Sum of squares within is variability stemming from different responses from individuals in the same group. It is exclusively error variance.
• Mean square is the sum of squares divided by its degrees of freedom. This division allows the mean square to reflect the average amount of variability from a source.
• Post hoc tests are conducted after a significant ANOVA, or some similar test, to identify which among multiple possibilities is statistically significant.
• Eta-squared is a measure of effect size for ANOVA. It estimates the amount of variability in the DV explained by the IV.
• When there is homogeneity of variance, multiple groups of data are distributed similarly.
