Quantitative Methods:
Module 10: Analysis of Variance
Analysis of variance is a type of significance test that
allows, in a single test, several samples to be compared under the hypothesis
that they have all come from the same population (or that they come from
populations that have the same mean). There are both one-way and two-way
analyses of variance.
The underlying theory of one-way analysis of variance is to
compare the variation between the treatments (the groups or samples) with the
variation within the treatments.
If the former is
significantly greater than the latter, then the differences between the means
of the treatments must be significantly greater than would be anticipated by
chance.
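One-way analysis of variance tests for differences between the treatments by
dividing the variation of the data into two parts: the variation between the
treatments and the variation within the treatments.
Two-way analysis of variance allows a further source of
variation to be isolated before testing the effect of the treatments. Just as the first source of variation is
conventionally referred to as the treatments, so the second source is referred
to as the blocks.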
ANOVA Process
1) Find the Total Sum of Squares (Total SS): $SS = \sum_{i}\sum_{j} (x_{ij} - \bar{\bar{x}})^2$
What does this mean?
This means subtract the grand mean $\bar{\bar{x}}$ (the mean of the
means of each treatment column) from each observation $x_{ij}$ (typically an
observation value in a treatment column), square each difference, and then sum the squares.
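As a minimal sketch of step 1, the calculation can be written in a few lines of Python. The data below are made up for illustration (three treatments with three observations each), and the names `treatments` and `total_ss` are assumptions of this sketch, not part of the module.

```python
# Hypothetical data: one inner list per treatment column.
treatments = [
    [5, 4, 6],  # treatment A
    [6, 9, 9],  # treatment B
    [1, 2, 3],  # treatment C
]


def total_ss(columns):
    """Sum of squared deviations of every observation from the grand mean."""
    observations = [x for column in columns for x in column]
    # With equal-sized columns, the mean of all observations equals
    # the mean of the column means.
    grand_mean = sum(observations) / len(observations)
    return sum((x - grand_mean) ** 2 for x in observations)


print(total_ss(treatments))  # 64.0
```

Here the grand mean is 5, so the nine squared deviations add up to 64.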
2) Find the Sum of Squares Between Treatments (SST):
$SST = r \sum_{j} (\bar{x}_j - \bar{\bar{x}})^2$, where $r$ is the number of observations in each treatment column
This means to subtract the grand mean $\bar{\bar{x}}$ from the mean of each
treatment column $\bar{x}_j$, to square each difference, to sum the squares, and then to
multiply by the number of observations in each column.
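A short Python sketch of step 2, under the assumption of a balanced design (every treatment has the same number of observations). The data and the function name `between_treatments_ss` are illustrative only.

```python
treatments = [[5, 4, 6], [6, 9, 9], [1, 2, 3]]  # hypothetical data


def between_treatments_ss(columns):
    """SST for a balanced design: r * sum((column mean - grand mean)^2)."""
    r = len(columns[0])  # observations per treatment, assumed equal
    column_means = [sum(column) / r for column in columns]
    grand_mean = sum(column_means) / len(column_means)
    return r * sum((m - grand_mean) ** 2 for m in column_means)


print(between_treatments_ss(treatments))  # 54.0
```

The column means are 5, 8 and 2 and the grand mean is 5, so SST = 3 x (0 + 9 + 9) = 54.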
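3) Find the Error Sum of Squares (SSE): $SSE = SS - SST$
Also, SSE is calculated the same way as SS, except that the deviation of each
observation is taken from its treatment mean $\bar{x}_j$, rather than from the grand mean
$\bar{\bar{x}}$: $SSE = \sum_{i}\sum_{j} (x_{ij} - \bar{x}_j)^2$.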
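The two routes to SSE described in step 3 can be checked against each other. The sketch below uses made-up data, and `error_ss` is a hypothetical helper name. It computes SSE directly from the treatment means, then compares it with SS - SST.

```python
treatments = [[5, 4, 6], [6, 9, 9], [1, 2, 3]]  # hypothetical data


def error_ss(columns):
    """SSE: squared deviations of each observation from its own column mean."""
    total = 0.0
    for column in columns:
        column_mean = sum(column) / len(column)
        total += sum((x - column_mean) ** 2 for x in column)
    return total


# Cross-check against the shortcut SSE = SS - SST on the same data.
observations = [x for column in treatments for x in column]
grand_mean = sum(observations) / len(observations)
ss = sum((x - grand_mean) ** 2 for x in observations)
sst = sum(len(col) * (sum(col) / len(col) - grand_mean) ** 2 for col in treatments)

print(error_ss(treatments), ss - sst)  # 10.0 10.0
```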
4) Calculate Mean Squares:
These are the treatment mean square (MST) and the error mean square (MSE).
They are SST and SSE, each divided by its degrees of freedom.
MST = SST / (Number of Treatments - 1)
MSE = SSE / [(Number of Observations for Each Treatment - 1) x
(Number of Treatments)]
The mean squares are relevant to this process because
they are the basis of a significance test to determine whether explained
variation (between treatments) is significantly greater than unexplained
variation (within treatments). If the hypothesis is true, the ratio of the MST to the MSE follows an
F distribution.
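A small sketch of step 4 and the resulting F ratio. The inputs SST = 54 and SSE = 10 are made-up values for a design with 3 treatments and 3 observations per treatment, and `one_way_mean_squares` is a hypothetical name.

```python
def one_way_mean_squares(sst, sse, treatments, per_treatment):
    """Return (MST, MSE, F) for a balanced one-way design."""
    mst = sst / (treatments - 1)
    # Error degrees of freedom: (observations per treatment - 1) x treatments.
    mse = sse / ((per_treatment - 1) * treatments)
    return mst, mse, mst / mse


mst, mse, observed_f = one_way_mean_squares(54.0, 10.0, treatments=3, per_treatment=3)
print(f"MST={mst:.2f} MSE={mse:.2f} F={observed_f:.2f}")  # MST=27.00 MSE=1.67 F=16.20
```

Note that the whole product (r - 1) x c sits in the denominator of MSE. Dividing by (r - 1) and then multiplying by c would give a different, incorrect result.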
5) Conduct significance test:
a) Hypothesis: All treatments come from the same population
b) If the observed F value lies beyond the critical value,
at a given level of significance, then the hypothesis is rejected.
c) If the observed F value lies below the critical value, at a given level of
significance, then the hypothesis is not rejected (conventionally, it is accepted).
d) Find Observed F Value: Observed F = MST/MSE
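The decision rule in step 5 can be sketched as below. The critical value is passed in rather than computed, on the assumption that it is looked up in a standard F table. The value 5.14 is the tabulated 5% point for 2 and 6 degrees of freedom, and the observed F of 16.2 is a made-up example. `f_test_decision` is a hypothetical helper name.

```python
def f_test_decision(observed_f, critical_f):
    """Reject the hypothesis only when the observed F exceeds the critical F."""
    return "reject" if observed_f > critical_f else "do not reject"


critical_f = 5.14  # tabulated F(2, 6) at the 5% level of significance
print(f_test_decision(16.2, critical_f))  # reject
print(f_test_decision(2.0, critical_f))   # do not reject
```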
3 underlying assumptions (ANOVA is basically an F test)
1) observations are supposed to have come from a normal distribution
2) observations are taken at random
3) the test is based on the treatment groups having come from a
common population, or from populations with equal means (and, for the F test
to be valid, equal variances)
ANOVA Table
Perform an ANOVA by following the steps described above
(summarized below) and systematically filling out the ANOVA table.
ANOVA Process
1) Find the Total Sum of Squares (Total SS)
2) Find the Sum of Squares Between Treatments (SST)
3) Find the Error Sum of Squares (SSE): SSE = SS - SST
4) Calculate Mean Squares
MST = SST / (Number of Treatments - 1)
MSE = SSE / [(Number of Observations for Each Treatment - 1) x
(Number of Treatments)]
5) Conduct significance test:
a) Hypothesis: All treatments come from the same population
b) If the observed F value lies beyond the critical value,
at a given level of significance, then the hypothesis is rejected.
c) If the observed F value lies below the critical value, at a given level of
significance, then the hypothesis is not rejected (conventionally, it is accepted).
d) Find Observed F Value: Observed F = MST/MSE
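| Variation | Degrees of Freedom | Sums of Squares | Mean Square | F |
|---|---|---|---|---|
| Explained by Treatments (between columns) | c-1 | SST | MST | MST/MSE |
| Error or unexplained (within columns) | (r-1)c | SSE | MSE | |
| Total | rc-1 | SS | | |

Here c is the number of treatments (columns) and r is the number of observations for each treatment (rows).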
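To show how the table is filled in from raw data, here is a minimal Python sketch for a balanced one-way design. The data are made up, and `one_way_anova_table` is a hypothetical helper, not a library function.

```python
def one_way_anova_table(columns):
    """Return rows (source, df, SS, MS, F) for a balanced one-way design."""
    c, r = len(columns), len(columns[0])
    observations = [x for column in columns for x in column]
    grand_mean = sum(observations) / len(observations)
    ss_total = sum((x - grand_mean) ** 2 for x in observations)
    ss_treat = r * sum((sum(col) / r - grand_mean) ** 2 for col in columns)
    ss_error = ss_total - ss_treat
    mst = ss_treat / (c - 1)
    mse = ss_error / ((r - 1) * c)
    return [
        ("Treatments", c - 1, ss_treat, mst, mst / mse),
        ("Error", (r - 1) * c, ss_error, mse, None),
        ("Total", r * c - 1, ss_total, None, None),
    ]


treatments = [[5, 4, 6], [6, 9, 9], [1, 2, 3]]  # hypothetical data
for source, df, ss, ms, f in one_way_anova_table(treatments):
    ms_text = "" if ms is None else f"{ms:.2f}"
    f_text = "" if f is None else f"{f:.2f}"
    print(f"{source:<12}{df:>4}{ss:>8.2f}{ms_text:>8}{f_text:>8}")
```

For these data the degrees of freedom are 2, 6 and 8, the sums of squares are 54, 10 and 64, and the observed F is 16.2.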
Two Way ANOVA
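Two-way ANOVA requires an additional sum of squares, the sum of squares for
blocks (SSB). This is calculated in a similar way to SST, shown again here:
$SST = r \sum_{j} (\bar{x}_j - \bar{\bar{x}})^2$
where $r$ is the number of observations in each treatment column, $\bar{x}_j$
is the mean of treatment column $j$, and $\bar{\bar{x}}$ is the grand mean.
This means to subtract the grand mean from the mean of each treatment column,
to square each difference, to sum the squares, and then to multiply by the
number of observations in each column.
For the blocks:
$SSB = c \sum_{i} (\bar{x}_i - \bar{\bar{x}})^2$
where $c$ is the number of treatments and $\bar{x}_i$ is the mean of row $i$.
This means to subtract the grand mean from the mean of each ROW, not COLUMN,
squaring each difference, summing the squares, and multiplying by the number of
treatment columns.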
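A short sketch of the SSB calculation, with the data laid out as rows (blocks) by columns (treatments). The numbers are made up and `between_blocks_ss` is a hypothetical name.

```python
# Hypothetical data: each inner list is one block (row);
# positions within a row are the treatments (columns).
data = [
    [5, 6, 1],  # block 1
    [4, 9, 2],  # block 2
    [6, 9, 3],  # block 3
]


def between_blocks_ss(rows):
    """SSB: number of treatments x sum((row mean - grand mean)^2)."""
    c = len(rows[0])  # number of treatment columns
    row_means = [sum(row) / c for row in rows]
    grand_mean = sum(row_means) / len(row_means)
    return c * sum((m - grand_mean) ** 2 for m in row_means)


print(between_blocks_ss(data))  # 6.0
```

The row means are 4, 5 and 6 and the grand mean is 5, so SSB = 3 x (1 + 0 + 1) = 6.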
So, we have four sums of squares for two-way ANOVA: SS, SST, SSB and SSE.
Remember: Calculate SSE from the equation
Total SS = SST + SSB + SSE, that is, SSE = Total SS - SST - SSB
Using these, find MST and MSE, but note that the degrees of freedom are
now different for MSE. The block mean square (MSB) is found in the same way:
MST = SST / (c - 1)
MSB = SSB / (r - 1)
MSE = SSE / [(c - 1)(r - 1)]
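Putting the two-way calculations together, the sketch below finds all four sums of squares, obtains SSE by subtraction, and then forms MST, MSE and the treatment F ratio. The data and the name `two_way_sums_of_squares` are illustrative assumptions.

```python
data = [  # hypothetical data: rows are blocks, columns are treatments
    [5, 6, 1],
    [4, 9, 2],
    [6, 9, 3],
]


def two_way_sums_of_squares(rows):
    """Return (SS, SST, SSB, SSE) for one observation per block and treatment."""
    r, c = len(rows), len(rows[0])
    grand_mean = sum(sum(row) for row in rows) / (r * c)
    col_means = [sum(row[j] for row in rows) / r for j in range(c)]
    row_means = [sum(row) / c for row in rows]
    ss_total = sum((x - grand_mean) ** 2 for row in rows for x in row)
    ss_treat = r * sum((m - grand_mean) ** 2 for m in col_means)
    ss_block = c * sum((m - grand_mean) ** 2 for m in row_means)
    ss_error = ss_total - ss_treat - ss_block  # from SS = SST + SSB + SSE
    return ss_total, ss_treat, ss_block, ss_error


ss_total, ss_treat, ss_block, ss_error = two_way_sums_of_squares(data)
r, c = len(data), len(data[0])
mst = ss_treat / (c - 1)
mse = ss_error / ((c - 1) * (r - 1))
print(ss_total, ss_treat, ss_block, ss_error)  # 64.0 54.0 6.0 4.0
print(mst, mse, mst / mse)                     # 27.0 1.0 27.0
```

For the same nine numbers analyzed one-way, the error sum of squares would be 10. Isolating the blocks removes 6 of that, which is why the treatment F ratio is larger here.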
Two Way ANOVA Table
| Variation | Degrees of Freedom | Sums of Squares | Mean Square | F |
|---|---|---|---|---|
| Explained by Treatments (between columns) | c - 1 | SST | MST | MST/MSE |
| Explained by Blocks (between rows) | r - 1 | SSB | MSB | |
| Error or unexplained (residual) | (r - 1)(c - 1) | SSE | MSE | |
| Total | rc - 1 | SS | | |
1) Test the effect of blocks: This can be done by calculating F as MSB/MSE.
2) Interaction variable: Adding a sum of squares to capture interaction effects
is a simple extension of ANOVA.
3) Balanced design: Take care to make all groups or samples the same size.
Unbalanced designs are very difficult to analyze.
4) Factors: Treatments (columns) and blocks (rows) are referred to as
factors. One-way and two-way analysis can be extended to multi-factor
situations. The principles are the same; there is just more number crunching.
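As an illustration of point 1, the block effect can be tested with the same decision rule used for treatments. In this sketch the sums of squares (SSB = 6, SSE = 4) are made-up values for a design with 3 blocks and 3 treatments. The critical value 6.94 is the tabulated 5% point of F for 2 and 4 degrees of freedom, and `block_f_test` is a hypothetical helper name.

```python
def block_f_test(ssb, sse, r, c, critical_f):
    """Return (observed F for blocks, True if the block effect is significant)."""
    msb = ssb / (r - 1)
    mse = sse / ((r - 1) * (c - 1))
    observed_f = msb / mse
    return observed_f, observed_f > critical_f


observed_f, significant = block_f_test(6.0, 4.0, r=3, c=3, critical_f=6.94)
print(observed_f, significant)  # 3.0 False
```

In this example the observed F of 3.0 lies below the critical value, so the hypothesis of no block effect is not rejected.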
Analysis of variance allows significance tests to be used more realistically
in areas such as market research, medicine, and agriculture.
Analysis of variance and research design
Proper experimental design sets the stage for
a simple, clean statistical test.
Isolate the effects as simply as possible.
This requires very careful planning of the research.