Quantitative Methods:

Module 10: Analysis of Variance

Introduction

Analysis of variance is a type of significance test that allows, in a single test, several samples to be compared under the hypothesis that they have all come from the same population (or that they come from populations that have the same mean). There are both one-way and two-way analyses of variance.

one way analysis of variance

The underlying theory of one-way analysis of variance is to compare the variation between the treatments (the groups or samples) with the variation within the treatments.

 

If the former is significantly greater than the latter, then the differences between the means of the treatments must be significantly greater than would be anticipated by chance.

 

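One-way analysis of variance tests for differences between the treatments by splitting the total variation of the data into two parts: the variation between the treatments and the variation within the treatments.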

 

two way analysis of variance

 

Two-way analysis of variance allows a second source of variation to be isolated before testing the effect of the treatments. Just as the first source of variation is conventionally referred to as the treatments, so the second source is referred to as the blocks.

 

One Way Analysis of Variance

ANOVA Process

 

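1) Find the Total Sum of Squares (Total SS):

$SS = \sum_{i}\sum_{j} (x_{ij} - \bar{\bar{x}})^2$

What does this mean? Here $x_{ij}$ is the observation in row $i$ of treatment column $j$, and $\bar{\bar{x}}$ is the grand mean (the mean of all observations, which equals the mean of the treatment column means when every treatment has the same number of observations). Subtract the grand mean from each observation, square each difference, and then sum the squares.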
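Step 1 can be sketched in Python as follows. This is a minimal illustration, assuming the data are stored as a list of rows with one column per treatment; the function name `total_ss` and the small data set are hypothetical and not part of the module.

```python
def total_ss(data):
    """Total SS: squared deviations of every observation from the grand mean."""
    observations = [x for row in data for x in row]
    grand_mean = sum(observations) / len(observations)
    return sum((x - grand_mean) ** 2 for x in observations)


# Hypothetical data: 4 observations (rows) on each of 3 treatments (columns).
data = [
    [2, 6, 7],
    [4, 5, 9],
    [5, 7, 9],
    [5, 6, 7],
]

print(total_ss(data))  # 44.0
```

The grand mean of this data set is 6, so the squared deviations add up to 44.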

 

2) Find the Sum of Squares Between Treatments (SST):

 

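$SST = r \sum_{j} (\bar{x}_{\cdot j} - \bar{\bar{x}})^2$

where $r$ is the number of observations in each treatment column, $\bar{x}_{\cdot j}$ is the mean of treatment column $j$, and $\bar{\bar{x}}$ is the grand mean. This means: subtract the grand mean from the mean of each treatment column, square each difference, sum the squares, and then multiply by the number of observations in each column.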
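A minimal Python sketch of step 2, under the assumption of a balanced layout (every treatment column has the same number of observations). The function name `treatment_ss` and the data are hypothetical.

```python
def treatment_ss(data):
    """Sum of squares between treatments (SST) for a balanced layout."""
    r = len(data)      # observations per treatment (rows)
    c = len(data[0])   # number of treatments (columns)
    column_means = [sum(row[j] for row in data) / r for j in range(c)]
    # Mean of the column means equals the grand mean because the layout is balanced.
    grand_mean = sum(column_means) / c
    return r * sum((m - grand_mean) ** 2 for m in column_means)


# Hypothetical data: 4 observations (rows) on each of 3 treatments (columns).
data = [
    [2, 6, 7],
    [4, 5, 9],
    [5, 7, 9],
    [5, 6, 7],
]

print(treatment_ss(data))  # 32.0
```

The column means are 4, 6 and 8 around a grand mean of 6, giving 4 x (4 + 0 + 4) = 32.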

 

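3) Find the Error Sum of Squares (SSE): SSE = SS - SST

Equivalently, SSE is calculated in the same way as SS, except that each observation is measured from its own treatment mean rather than from the grand mean: $SSE = \sum_{i}\sum_{j} (x_{ij} - \bar{x}_{\cdot j})^2$, where $\bar{x}_{\cdot j}$ is the mean of treatment column $j$.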
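The two routes to SSE can be checked against each other. The sketch below assumes a balanced layout and uses a hypothetical data set; the variable names are illustrative only.

```python
# Hypothetical data: 4 observations (rows) on each of 3 treatments (columns).
data = [
    [2, 6, 7],
    [4, 5, 9],
    [5, 7, 9],
    [5, 6, 7],
]

r, c = len(data), len(data[0])
grand_mean = sum(sum(row) for row in data) / (r * c)
column_means = [sum(row[j] for row in data) / r for j in range(c)]

# Total SS and SST, as in steps 1 and 2.
ss = sum((x - grand_mean) ** 2 for row in data for x in row)
sst = r * sum((m - grand_mean) ** 2 for m in column_means)

# SSE computed directly: each observation measured from its own treatment mean.
sse_direct = sum((row[j] - column_means[j]) ** 2 for row in data for j in range(c))

print(ss - sst, sse_direct)  # 12.0 12.0
```

Both routes give the same value, which is a useful arithmetic check when the sums are worked by hand.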

 

4) Calculate Mean Squares: These are the treatment mean square (MST) and the error mean square (MSE). They are SST and SSE divided by their degrees of freedom.

MST = SST / (Number of Treatments - 1)

MSE = SSE / [(Number of Observations for Each Treatment - 1) x (Number of Treatments)]

The mean squares matter because they are the basis of a significance test to determine whether the explained variation (between treatments) is significantly greater than the unexplained variation (within treatments). If all treatments come from the same population, the ratio of MST to MSE follows an F distribution.
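As a rough illustration of step 4, the sketch below turns the sums of squares into mean squares and the F ratio. The function name `mean_squares` is hypothetical, and the input values (SST = 32, SSE = 12, 4 observations on each of 3 treatments) are assumed for the example rather than taken from the module.

```python
def mean_squares(sst, sse, r, c):
    """Return (MST, MSE, F) for r observations on each of c treatments."""
    mst = sst / (c - 1)          # treatment degrees of freedom: c - 1
    mse = sse / ((r - 1) * c)    # error degrees of freedom: (r - 1) x c
    return mst, mse, mst / mse


mst, mse, f_observed = mean_squares(sst=32.0, sse=12.0, r=4, c=3)
print(mst, round(mse, 3), round(f_observed, 2))  # 16.0 1.333 12.0
```

Note that the whole product (r - 1) x c sits in the denominator of MSE.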

5) Conduct significance test:

a) Hypothesis: All treatments come from the same population

b) If the observed F value lies beyond the critical value, at a given level of significance, then the hypothesis is rejected.

c) If the observed F value lies below the critical value, at a given level of significance, then the hypothesis is accepted (more precisely, it is not rejected).

d) Find Observed F Value: MST/MSE=Observed F
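The decision rule in step 5 can be sketched as below. The critical value is an assumption typed in by hand from a printed F table (5% level, 2 and 9 degrees of freedom); the function name `f_test_decision` and the figures MST = 16 and MSE = 12/9 are hypothetical example values.

```python
def f_test_decision(f_observed, f_critical):
    """Return True when the hypothesis of a common population is rejected."""
    return f_observed > f_critical


# Assumption: 5% critical value for (2, 9) degrees of freedom, read from F tables.
F_CRITICAL_5PCT_2_9 = 4.26

f_observed = 16.0 / (12.0 / 9)   # MST / MSE for the example figures
rejected = f_test_decision(f_observed, F_CRITICAL_5PCT_2_9)
print("reject" if rejected else "do not reject")  # reject
```

With an observed F of about 12 against a critical value of 4.26, the hypothesis that all treatments come from the same population would be rejected at the 5% level in this example.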

 

Three Underlying Assumptions

Analysis of variance is basically an F test, and it rests on three assumptions:

1) The observations come from a normal distribution.

2) The observations are taken at random.

3) The test is based on the treatment groups having come from a common population or from populations with equal means.

 

ANOVA Table

 

Perform an ANOVA by following the steps described above (summarized below) and systematically filling out the ANOVA table.

ANOVA Process

1) Find the Total Sum of Squares (Total SS).

2) Find the Sum of Squares Between Treatments (SST).

3) Find the Error Sum of Squares (SSE): SSE = SS - SST

4) Calculate Mean Squares:

MST = SST / (Number of Treatments - 1)

MSE = SSE / [(Number of Observations for Each Treatment - 1) x (Number of Treatments)]

5) Conduct significance test:

a) Hypothesis: All treatments come from the same population.

b) If the observed F value lies beyond the critical value, at a given level of significance, then the hypothesis is rejected.

c) If the observed F value lies below the critical value, at a given level of significance, then the hypothesis is accepted (more precisely, it is not rejected).

d) Find Observed F Value: MST/MSE = Observed F

 

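ANOVA Table

Here c is the number of treatments (columns) and r is the number of observations for each treatment (rows).

Variation                                   Degrees of Freedom   Sums of Squares   Mean Square   F
Explained by Treatments (between columns)   c-1                  SST               MST           MST/MSE
Error or unexplained (within columns)       (r-1)c               SSE               MSE
Total                                       rc-1                 SS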
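Filling out the one-way ANOVA table can be sketched in Python as below. This is only an illustration for a balanced layout, with r observations (rows) on each of c treatments (columns); the function name `one_way_anova_table` and the data are hypothetical.

```python
def one_way_anova_table(data):
    """Build the one-way ANOVA table for a balanced layout (rows x treatment columns)."""
    r, c = len(data), len(data[0])
    grand_mean = sum(sum(row) for row in data) / (r * c)
    column_means = [sum(row[j] for row in data) / r for j in range(c)]

    total = sum((x - grand_mean) ** 2 for row in data for x in row)
    sst = r * sum((m - grand_mean) ** 2 for m in column_means)
    sse = total - sst
    mst = sst / (c - 1)
    mse = sse / ((r - 1) * c)

    # Each row: (source of variation, degrees of freedom, sum of squares, mean square, F)
    return [
        ("Treatments", c - 1, sst, mst, mst / mse),
        ("Error", (r - 1) * c, sse, mse, None),
        ("Total", r * c - 1, total, None, None),
    ]


# Hypothetical data: 4 observations (rows) on each of 3 treatments (columns).
data = [
    [2, 6, 7],
    [4, 5, 9],
    [5, 7, 9],
    [5, 6, 7],
]

table = one_way_anova_table(data)
for source, df, sum_sq, mean_sq, f_ratio in table:
    print(source, df, sum_sq, mean_sq, f_ratio)
```

The degrees of freedom for treatments and error add up to the total, just as SST and SSE add up to SS, which gives a quick check on a completed table.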

 

 

 

 

 

 

Two Way Analysis of Variance

 

Two Way ANOVA

 

Two-way ANOVA requires an additional sum of squares, the sum of squares between blocks (SSB). It is calculated in a similar way to SST, which is repeated here:

$SST = r \sum_{j} (\bar{x}_{\cdot j} - \bar{\bar{x}})^2$

This means: subtract the grand mean $\bar{\bar{x}}$ from the mean $\bar{x}_{\cdot j}$ of each treatment column, square each difference, sum the squares, and then multiply by the number of observations in each column ($r$).

For SSB, the rows take the place of the columns:

$SSB = c \sum_{i} (\bar{x}_{i \cdot} - \bar{\bar{x}})^2$

This means: subtract the grand mean from the mean $\bar{x}_{i \cdot}$ of each ROW (not COLUMN), square each difference, sum the squares, and then multiply by the number of treatment columns ($c$).
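The block sum of squares can be sketched in the same way as SST, with rows in place of columns. The example assumes one observation for each combination of block (row) and treatment (column); the function name `block_ss` and the data are hypothetical.

```python
def block_ss(data):
    """Sum of squares between blocks (SSB): rows are blocks, columns are treatments."""
    r = len(data)      # number of blocks (rows)
    c = len(data[0])   # number of treatments (columns)
    row_means = [sum(row) / c for row in data]
    # Mean of the row means equals the grand mean because every row has c values.
    grand_mean = sum(row_means) / r
    return c * sum((m - grand_mean) ** 2 for m in row_means)


# Hypothetical data: 4 blocks (rows) by 3 treatments (columns).
data = [
    [2, 6, 7],
    [4, 5, 9],
    [5, 7, 9],
    [5, 6, 7],
]

print(block_ss(data))  # 6.0
```

The row means are 5, 6, 7 and 6 around a grand mean of 6, giving 3 x (1 + 0 + 1 + 0) = 6.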

 

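So, for two-way ANOVA we have four sums of squares:

SS (total)

SST (between treatments, the columns)

SSB (between blocks, the rows)

SSE (error)

Remember: calculate SSE from the equation

Total SS = SST + SSB + SSE, that is, SSE = SS - SST - SSB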

 

Using these, find MST, MSB and MSE. The degrees of freedom for MSE are now different, as follows:

MST = SST / (c - 1)

MSB = SSB / (r - 1)

MSE = SSE / [(c - 1)(r - 1)]
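A small sketch of the two-way mean squares, assuming r blocks (rows) and c treatments (columns) with one observation per cell. MSB is included because it appears in the two-way ANOVA table with r - 1 degrees of freedom. The function name `two_way_mean_squares` and the input figures are hypothetical.

```python
def two_way_mean_squares(ss, sst, ssb, r, c):
    """Return (SSE, MST, MSB, MSE) for r blocks (rows) and c treatments (columns)."""
    sse = ss - sst - ssb                 # from Total SS = SST + SSB + SSE
    mst = sst / (c - 1)
    msb = ssb / (r - 1)
    mse = sse / ((c - 1) * (r - 1))      # error degrees of freedom: (c - 1)(r - 1)
    return sse, mst, msb, mse


sse, mst, msb, mse = two_way_mean_squares(ss=44.0, sst=32.0, ssb=6.0, r=4, c=3)
print(sse, mst, msb, mse)  # 6.0 16.0 2.0 1.0
print(mst / mse)           # 16.0
```

In this example, isolating the block variation reduces SSE from 12 to 6 compared with a one-way analysis of the same figures, so the treatment F ratio is larger.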

 

Two Way ANOVA Table

Here c is the number of treatments (columns) and r is the number of blocks (rows).

Variation                                   Degrees of Freedom   Sums of Squares   Mean Square   F
Explained by Treatments (between columns)   c - 1                SST               MST           MST/MSE
Explained by Blocks (between rows)          r - 1                SSB               MSB
Error or unexplained (residual)             (r - 1)(c - 1)       SSE               MSE
Total                                       rc - 1               SS

 

 

 


 

Extensions of Analysis of Variance

 

1) Test the effect of blocks: This can be done by calculating F as MSB/MSE.

2) Interaction variable: Adding a sum of squares to capture interaction effects between the two factors is a simple extension of ANOVA.

3) Balanced design: Take care to make all groups or samples the same size. The analysis is very difficult otherwise.

4) Factors: Treatments (columns) and blocks (rows) are referred to as factors. One-way and two-way analysis can be extended to multi-factor situations. The principles are the same; there is just more number crunching.
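The first extension, testing the effect of the blocks, can be sketched as follows. The critical value is an assumption typed in by hand from a printed F table (5% level, 3 and 6 degrees of freedom); the function name `block_f_test` and the figures SSB = 6 and SSE = 6 for 4 blocks and 3 treatments are hypothetical.

```python
def block_f_test(ssb, sse, r, c, f_critical):
    """Return (observed F for blocks, True if the 'no block effect' hypothesis is rejected)."""
    msb = ssb / (r - 1)
    mse = sse / ((r - 1) * (c - 1))
    f_observed = msb / mse
    return f_observed, f_observed > f_critical


# Assumption: 5% critical value for (3, 6) degrees of freedom, read from F tables.
F_CRITICAL_5PCT_3_6 = 4.76

f_blocks, rejected = block_f_test(ssb=6.0, sse=6.0, r=4, c=3,
                                  f_critical=F_CRITICAL_5PCT_3_6)
print(f_blocks, rejected)  # 2.0 False
```

Here the observed F for blocks (2.0) is below the critical value, so in this example the block effect would not be judged significant at the 5% level.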

 

Key Message from Module:

 

Analysis of variance allows significance tests to be used more realistically in areas such as market research, medicine, and agriculture.

Analysis of variance and research design:

Proper experimental design sets the stage for a simple, clean statistical test.

Isolate the effects as simply as possible.

This calls for very careful planning of the research.