class: title-slide, center, middle, inverse # .fat[.fancy[Analysis of Variance (ANOVA)]] ## .fat[.fancy[MPA 6010]] ## .fat[.fancy[Ani Ruhil]] --- <script type="text/x-mathjax-config"> MathJax.Hub.Register.StartupHook("TeX Jax Ready",function () { MathJax.Hub.Insert(MathJax.InputJax.TeX.Definitions.macros,{ cancel: ["Extension","cancel"], bcancel: ["Extension","cancel"], xcancel: ["Extension","cancel"], cancelto: ["Extension","cancel"] }); }); </script> # .fat[.fancy[Agenda]] 1. The Trouble with Multiple Comparisons 2. One-Way Analysis of Variance (ANOVA) 3. Two-Way (Factorial) Analysis of Variance (ANOVA) --- class: inverse, center, middle # .heat[.fancy[Comparing More than Two Groups]] --- ### How can you compare a continuous outcome variable when you have more than two groups? .pull-left[ <table class="table table-striped" style="width: auto !important; margin-left: auto; margin-right: auto;"> <caption>Customer Satisfaction Scores</caption> <thead> <tr> <th style="text-align:center;"> Observation.No. 
</th> <th style="text-align:center;"> Atlanta </th> <th style="text-align:center;"> Dallas </th> <th style="text-align:center;"> Seattle </th> </tr> </thead> <tbody> <tr> <td style="text-align:center;"> 1 </td> <td style="text-align:center;"> 85 </td> <td style="text-align:center;"> 71 </td> <td style="text-align:center;"> 59 </td> </tr> <tr> <td style="text-align:center;"> 2 </td> <td style="text-align:center;"> 75 </td> <td style="text-align:center;"> 75 </td> <td style="text-align:center;"> 64 </td> </tr> <tr> <td style="text-align:center;"> 3 </td> <td style="text-align:center;"> 82 </td> <td style="text-align:center;"> 73 </td> <td style="text-align:center;"> 62 </td> </tr> <tr> <td style="text-align:center;"> 4 </td> <td style="text-align:center;"> 76 </td> <td style="text-align:center;"> 74 </td> <td style="text-align:center;"> 69 </td> </tr> <tr> <td style="text-align:center;"> 5 </td> <td style="text-align:center;"> 71 </td> <td style="text-align:center;"> 69 </td> <td style="text-align:center;"> 75 </td> </tr> <tr> <td style="text-align:center;"> 6 </td> <td style="text-align:center;"> 85 </td> <td style="text-align:center;"> 82 </td> <td style="text-align:center;"> 67 </td> </tr> </tbody> </table> ] .pull-right[ <table class="table table-striped" style="width: auto !important; margin-left: auto; margin-right: auto;"> <caption>Descriptive Statistics</caption> <thead> <tr> <th style="text-align:left;"> Location </th> <th style="text-align:right;"> Mean </th> <th style="text-align:right;"> Variance </th> <th style="text-align:right;"> Std. Dev. 
</th> </tr> </thead> <tbody> <tr> <td style="text-align:left;"> Atlanta </td> <td style="text-align:right;"> 79 </td> <td style="text-align:right;"> 34 </td> <td style="text-align:right;"> 5.83 </td> </tr> <tr> <td style="text-align:left;"> Dallas </td> <td style="text-align:right;"> 74 </td> <td style="text-align:right;"> 20 </td> <td style="text-align:right;"> 4.47 </td> </tr> <tr> <td style="text-align:left;"> Seattle </td> <td style="text-align:right;"> 66 </td> <td style="text-align:right;"> 32 </td> <td style="text-align:right;"> 5.66 </td> </tr> </tbody> </table>

<img src="anova_files/figure-html/unnamed-chunk-5-1.svg" width="75%" style="display: block; margin: auto;" />

]

---

### Pairwise t-tests?

Why not compare Atlanta to Dallas, see if they differ, then compare Atlanta to Seattle, see if there is a difference, and then repeat for Dallas and Seattle?

Bad idea; you run into the problem of **`multiple comparisons`**

* In any single comparison we have a certain probability of a significant result by chance alone `\((\alpha)\)`, and hence `\(1-\alpha\)` is the probability of no significant result

* What is the probability that `at least one` of these pairs yields a significant result by chance alone? In any single comparison, P(Type I error) `\(= \alpha = 0.05\)`, so

`\begin{align*} P(\text{no Type I error in 1 comparison}) & = & 0.95 \\ P(\text{no Type I error in 2 comparisons}) & = & 0.95 \times 0.95 = 0.9025 \\ \text{Note: P(Type I error in 2 comparisons) is} & = & 1 - 0.9025 = 0.0975 \\ P(\text{no Type I error in 3 comparisons}) & = & 0.95 \times 0.95 \times 0.95 = 0.8574 \\ \text{Note: P(Type I error in 3 comparisons) is} & = & 1 - 0.8574 = 0.1426 \end{align*}`

.fancy[.heat[the probability of making a Type I error is exploding!]]

---

### Correcting for Multiple Comparisons: `\(\alpha^* = \frac{\alpha}{\text{No.
of Trials}}\)` .pull-left[ <table class="table table-striped" style="width: auto !important; margin-left: auto; margin-right: auto;"> <thead> <tr> <th style="text-align:center;"> Number of Trials </th> <th style="text-align:center;"> Adjusted Alpha </th> </tr> </thead> <tbody> <tr> <td style="text-align:center;"> 1 </td> <td style="text-align:center;"> 0.0500 </td> </tr> <tr> <td style="text-align:center;"> 2 </td> <td style="text-align:center;"> 0.0250 </td> </tr> <tr> <td style="text-align:center;"> 3 </td> <td style="text-align:center;"> 0.0167 </td> </tr> <tr> <td style="text-align:center;"> 4 </td> <td style="text-align:center;"> 0.0125 </td> </tr> <tr> <td style="text-align:center;"> 5 </td> <td style="text-align:center;"> 0.0100 </td> </tr> <tr> <td style="text-align:center;"> 6 </td> <td style="text-align:center;"> 0.0083 </td> </tr> <tr> <td style="text-align:center;"> 7 </td> <td style="text-align:center;"> 0.0071 </td> </tr> <tr> <td style="text-align:center;"> 8 </td> <td style="text-align:center;"> 0.0063 </td> </tr> <tr> <td style="text-align:center;"> 9 </td> <td style="text-align:center;"> 0.0056 </td> </tr> <tr> <td style="text-align:center;"> 10 </td> <td style="text-align:center;"> 0.0050 </td> </tr> </tbody> </table> ] .pull-right[ <img src="anova_files/figure-html/unnamed-chunk-7-1.svg" width="100%" style="display: block; margin: auto;" /> Reject `\(H_0\)` only if p-values `\(\leq \alpha^*\)` ] --- class: inverse, middle, center # .fancy[.heat[ One-way ANOVA ]] --- ### The Logic of ANOVA ANOVA is a hypothesis testing procedure that allows us to simultaneously compare three or more groups and determine if they are drawn from a common population or from different populations ANOVA also lets us test the influence of two or more `independent variables` on the `dependent variable` The test statistic is a ratio: `\(\dfrac{\text{Difference between groups}}{\text{Difference within groups}}\)` > If difference between groups `\(>\)` difference within 
each group, something beyond chance must be differentiating the groups

In ANOVA we use the word `treatments` to refer to the independent variable (e.g., drugs tested, trainings, etc.)

But how can we measure the `difference between groups` and the `difference within groups`?

---

### How shall we measure and analyze `difference`?

We could ask: `How much does each individual differ from the overall Mean?`

We could also ask: `In each group, how much does each group member differ from his/her group Mean?`

What would be a good measure of difference? Well, the variance of course! So, we look at a few variances:

(1) Total variance of `all scores`

(2) Variance `between-groups`

(3) Variance `within-groups`

> Note: We will be calculating variability in terms of the Sum of Squares, i.e., `\(\sum(x_{i} - \bar{x})^{2}\)`

---

### Between-groups and Within-groups Variance

If groups are from a common population, their means should be similar

When groups exhibit differences, one asks why? ... Perhaps,

(1) chance is to blame

(2) individuals have intrinsic differences

(3) individuals were exposed to treatments that differed across groups

> For example, students in Catholic schools, public schools, private schools

> For example, patients exposed to Trial Drug A, Trial Drug B, Placebo

> For example, counties with no mandatory mask enforcement for COVID-19 versus counties mandating masks inside public buildings versus counties with masks in all public spaces (including sidewalks)

---

### (1) Chance

Chance refers to two possible sources of differences

(1) Person-to-Person differences (people are unique)

(2) Experimental Errors (sample-to-sample variability because of something going wrong with the experiment or sampling)

Chance could/will influence all scores, within-groups and between-groups

The key question becomes: how large or small is the influence of `chance`?
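The familywise error arithmetic and the Bonferroni adjustment from the earlier slides can be sketched in a few lines. Python is used here purely for illustration, and the function names are my own:

```python
# Familywise Type I error rate across m comparisons, and the
# Bonferroni-adjusted per-comparison alpha, as shown on the slides.

def familywise_error(m, alpha=0.05):
    """P(at least one Type I error in m independent comparisons)."""
    return 1 - (1 - alpha) ** m

def bonferroni_alpha(m, alpha=0.05):
    """Per-comparison alpha after the Bonferroni correction."""
    return alpha / m

for m in (1, 2, 3):
    print(m, round(familywise_error(m), 4), round(bonferroni_alpha(m), 4))
```

With three pairwise comparisons the chance of at least one spurious "significant" result is already about 0.14, which is why each test is held to the stricter threshold `\(\alpha^*\)`.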
--- ### Between-groups and Within-groups Variance Within-groups, differences must be due to chance `because the treatment is a constant in each group` Between-groups, differences may be due to chance and/or treatment effects Test statistic is a ratio of `between-groups` variance to `within-groups` variance `$$F = \dfrac{\text{Variance Between-groups}}{\text{Variance Within-groups}}$$` `$$\therefore F = \dfrac{\text{Variance due to Chance} + \text{Variance due to Treatments}}{\text{Variance due to Chance}}$$` > If variance due to treatments is `\(=0\)`, what will `\(F\)` be? > If variance due to treatments is `\(>0\)` and large, what will `\(F\)` be? --- ### The Elements of ANOVA In ANOVA we refer to the independent variable(s) as the `factor(s)` The values of a factor we refer to as the `treatments` The outcome is referred to as the `response` variable With just one factor we speak of a `single-factor design` With two or more factors we speak of a `factorial design` Assumptions underlying ANOVA (1) The response variable is `\(\sim N(.)\)` (i.e., Normally distributed) (2) The variance `\(\sigma^{2}\)` of the response variable is the same for all groups (3) The observations are independent within each group (i.e., random sampling was not violated) The Hypotheses: `\(H_{0}: \text{ All population means are equal}\)` `\(H_{1}: \text{ Not all population means are equal}\)` --- .pull-left[ <table class="table table-striped" style="width: auto !important; margin-left: auto; margin-right: auto;"> <caption>Customer Satisfaction Scores</caption> <thead> <tr> <th style="text-align:center;"> Observation.No. 
</th> <th style="text-align:center;"> Atlanta </th> <th style="text-align:center;"> Dallas </th> <th style="text-align:center;"> Seattle </th> </tr> </thead> <tbody> <tr> <td style="text-align:center;"> 1 </td> <td style="text-align:center;"> 85 </td> <td style="text-align:center;"> 71 </td> <td style="text-align:center;"> 59 </td> </tr> <tr> <td style="text-align:center;"> 2 </td> <td style="text-align:center;"> 75 </td> <td style="text-align:center;"> 75 </td> <td style="text-align:center;"> 64 </td> </tr> <tr> <td style="text-align:center;"> 3 </td> <td style="text-align:center;"> 82 </td> <td style="text-align:center;"> 73 </td> <td style="text-align:center;"> 62 </td> </tr> <tr> <td style="text-align:center;"> 4 </td> <td style="text-align:center;"> 76 </td> <td style="text-align:center;"> 74 </td> <td style="text-align:center;"> 69 </td> </tr> <tr> <td style="text-align:center;"> 5 </td> <td style="text-align:center;"> 71 </td> <td style="text-align:center;"> 69 </td> <td style="text-align:center;"> 75 </td> </tr> <tr> <td style="text-align:center;"> 6 </td> <td style="text-align:center;"> 85 </td> <td style="text-align:center;"> 82 </td> <td style="text-align:center;"> 67 </td> </tr> </tbody> </table> ] .pull-right[ <table class="table table-striped" style="width: auto !important; margin-left: auto; margin-right: auto;"> <caption>Descriptive Statistics</caption> <thead> <tr> <th style="text-align:left;"> Location </th> <th style="text-align:right;"> Mean </th> <th style="text-align:right;"> Variance </th> <th style="text-align:right;"> Std. Dev. 
</th> </tr> </thead> <tbody> <tr> <td style="text-align:left;"> Atlanta </td> <td style="text-align:right;"> 79 </td> <td style="text-align:right;"> 34 </td> <td style="text-align:right;"> 5.83 </td> </tr> <tr> <td style="text-align:left;"> Dallas </td> <td style="text-align:right;"> 74 </td> <td style="text-align:right;"> 20 </td> <td style="text-align:right;"> 4.47 </td> </tr> <tr> <td style="text-align:left;"> Seattle </td> <td style="text-align:right;"> 66 </td> <td style="text-align:right;"> 32 </td> <td style="text-align:right;"> 5.66 </td> </tr> </tbody> </table> `\(H_{0}: \mu_{1}=\mu_{2}=\mu_{3}\)` `\(H_{1}: \text{ Not all population means are equal}\)` ] .pull-left[ `\(i\)` indexes observations; `\(j\)` indexes groups; `\(\mu_{j}\)` is mean of the `\(j^{th}\)` group; `\(n_{j}\)` is sample size of group `\(j\)` `\(x_{ij}\)` is score for observation `\(i\)` for group `\(j\)` `\(\bar{x_{j}}\)` is mean for group `\(j\)` `\(\bar{x}_{j} = \frac{1}{n_{j}} \sum_{i=1}^{n_{j}}{x_{ij}}\)` ] .pull-right[ `\(s^{2}_{j}\)` and `\(s_{j}\)` are variance and standard deviation for group `\(j\)` `\(s^{2}_{j} = \frac{1}{n_{j} - 1} \sum_{i=1}^{n_{j}}{(x_{ij} - \bar{x_{j}})^{2}}\)` ] --- `\(\bar{\bar{x}}\)` is the overall sample mean `\(\bar{\bar{x}} = \frac{1}{n_{T}} \sum_{j=1}^{k} \sum_{i=1}^{n_{j}} {x_{ij}}\)` Note that `\(n_{T} = n_{1} + n_{2} + \cdots + n_{k}\)` If `\(n_{1} = n_{2} = \cdots = n_{k}\)`, then `\(n_{T} = k(n)\)` and consequently, `\(\bar{\bar{x}} = \frac{1}{kn} \sum_{j=1}^{k} \sum_{i=1}^{n_{j}} {x_{ij}} = \frac{1}{k} \sum_{j=1}^{k} \sum_{i=1}^{n_{j}} \frac{x_{ij}}{n} = \frac{1}{k} \sum_{j=1}^{k} \bar{x_{j}}\)` .pull-left[ Mean Square due to Treatments (MSTR) is given by `\(\dfrac{\mbox{SSTR}}{k - 1}\)` `\(\mbox{MSTR} = \frac{1}{k - 1} \sum_{j=1}^{k} n_{j} {(\bar{x_{j}} - \bar{\bar{x}})^{2}}\)` `\(\mbox{SSTR} = \sum_{j=1}^{k} n_{j} {(\bar{x_{j}} - \bar{\bar{x}})^{2}}\)` ] .pull-right[ Mean Square due to Error (MSE) is given by `\(\dfrac{\mbox{SSE}}{n_{T} - 
k}\)` `\(\mbox{MSE} = \frac{1}{n_{T} - k} \sum_{j=1}^{k} (n_{j} - 1) s^{2}_{j}\)` `\(\mbox{SSE} = \sum_{j=1}^{k} (n_{j}-1) s^{2}_{j}\)` Note also that `\(\mbox{SST} = \mbox{SSTR} + \mbox{SSE}\)` `\(\mbox{SST} = \sum_{j=1}^{k}\sum_{i=1}^{n_{j}} (x_{ij} - \bar{\bar{x}})^{2}\)` ] --- `\(F = \dfrac{\mbox{MSTR}}{\mbox{MSE}}\)` `\(F \sim F_{df_{Numerator}; df_{Denominator}}\)` * `\(df_{Numerator} = k - 1\)` * `\(df_{Denominator} = n_{T} - k\)` Reject `\(H_{0}\)` if `\(p-value \leq \alpha\)`; Do not reject `\(H_0\)` otherwise `Alternatively:` Reject `\(H_{0}\)` if Calculated `\(F \geq\)` Critical `\(F\)`; Do not reject `\(H_0\)` otherwise > If there is as much variation between the groups as there is within the groups then the ratio will be `\(= 1\)`. > If there is more variance between the groups than within them F will be `\(> 1\)`. --- ### A Worked Example | Dating | Engaged | Married | | --: | --: | --: | | 89 | 99 | 109 | | 90 | 100 | 110 | | 91 | 101 | 111 | `Overall mean:` `\(\bar{\bar{x}} = \dfrac{89+90+91+\ldots+111}{9} = \dfrac{900}{9} = 100\)` `Dating mean:` `\(\bar{x}_{D} = \dfrac{89+90+91}{3} = 90\)` `Engaged mean:` `\(\bar{x}_{E} = 100\)` `Married mean:` `\(\bar{x}_{M} = 110\)` --- ### ... 
.pull-left[

| `\(x\)` | `\((x - \bar{\bar{x}})\)` | `\((x - \bar{\bar{x}})^{2}\)` |
| --: | --: | --: |
| 89 | `\(89 - 100 = -11\)` | `\((-11)^{2}=121\)` |
| 90 | `\(90 - 100 = -10\)` | `\((-10)^{2}=100\)` |
| 91 | `\(91-100 = -9\)` | `\((-9)^{2}=81\)` |
| 99 | `\(99-100=-1\)` | `\((-1)^{2}=1\)` |
| 100 | `\(100-100=0\)` | `\((0)^{2}=0\)` |
| 101 | `\(101-100=1\)` | `\((1)^{2}=1\)` |
| 109 | `\(109-100=9\)` | `\((9)^{2}=81\)` |
| 110 | `\(110-100=10\)` | `\((10)^{2}=100\)` |
| 111 | `\(111-100=11\)` | `\((11)^{2}=121\)` |
| `\(\sum{x}=900\)` | `\(\sum(x-\bar{\bar{x}})=0\)` | `\(\sum(x-\bar{\bar{x}})^{2} =606\)` |

]

.pull-right[

`\(SST = 606\)`

`\(SSTR = (\bar{x}_D - \bar{\bar{x}})^{2} \times n_D \\ + (\bar{x}_E - \bar{\bar{x}})^{2} \times n_E \\ + (\bar{x}_M - \bar{\bar{x}})^{2} \times n_M\)`

`\(SSTR = (90 - 100)^2 \times 3 \\ + (100 - 100)^2 \times 3 \\ + (110 - 100)^2 \times 3\)`

`\(SSTR = (100) \times 3 + (0) \times 3 + (100) \times 3 = 600\)`

`\(SSE = SST - SSTR = 606 - 600 = 6\)`

]

---

`\(MSTR = \dfrac{SSTR}{k-1} = \dfrac{600}{3-1} = \dfrac{600}{2}=300\)`

`\(MSE = \dfrac{SSE}{n_T - k} = \dfrac{6}{9-3} = \dfrac{6}{6}=1\)`

Calculated `\(F=\dfrac{MSTR}{MSE} = \dfrac{300}{1} = 300\)`

Critical `\(F_{2,6} = 5.14\)`

Since Calculated `\(F >\)` Critical `\(F\)` we conclude that relationship satisfaction varies by type of relationship

> p-value of Calculated F is `\(< 0.001\)`

---

When ANOVA gives you significant results, it often pays to calculate the `effect size` -- a measure of how much of the total variation in the outcome can be attributed to group differences. Effect size can be calculated in a number of ways, with `eta-squared` `\(\eta^{2} = \dfrac{SSTR}{SST}=\dfrac{600}{606} = 0.99\)` being one of the more popular ones.
* `\(\eta^2 \approx 0.01\)` is a small effect
* `\(\eta^2 \approx 0.06\)` is a medium effect, and
* `\(\eta^2 \geq 0.14\)` is a large effect

`\(\eta^{2}\)` overestimates effects, so often `omega-squared` `\((\omega^2)\)` is used instead

`\(\omega^{2} = \dfrac{SSTR - (k-1)MSE}{SST+MSE} = \dfrac{600 - (3-1)\times1}{606+1} = \dfrac{598}{607} = 0.98\)`

* `\(\omega^2 = 0.01\)` (small effect)
* `\(\omega^2 = 0.06\)` (medium effect)
* `\(\omega^2 = 0.15\)` (large effect)

---

### Another Example

A program to lower blood pressure assigns participants to one of four conditions (see table). Each participant's systolic blood pressure is measured after two weeks of treatment. The hypothesis is that a combination (`Diet and Drug`) of treatments will be more effective than either individual treatment in isolation.

.pull-left[

<table> <thead> <tr> <th style="text-align:right;"> Control </th> <th style="text-align:right;"> Diet Only </th> <th style="text-align:right;"> Drug Only </th> <th style="text-align:right;"> Diet and Drug </th> </tr> </thead> <tbody> <tr> <td style="text-align:right;"> 163 </td> <td style="text-align:right;"> 166 </td> <td style="text-align:right;"> 161 </td> <td style="text-align:right;"> 153 </td> </tr> <tr> <td style="text-align:right;"> 178 </td> <td style="text-align:right;"> 173 </td> <td style="text-align:right;"> 171 </td> <td style="text-align:right;"> 168 </td> </tr> <tr> <td style="text-align:right;"> 180 </td> <td style="text-align:right;"> 188 </td> <td style="text-align:right;"> 178 </td> <td style="text-align:right;"> 176 </td> </tr> <tr> <td style="text-align:right;"> 181 </td> <td style="text-align:right;"> 190 </td> <td style="text-align:right;"> 183 </td> <td style="text-align:right;"> 198 </td> </tr> <tr> <td style="text-align:right;"> 185 </td> <td style="text-align:right;"> 193 </td> <td style="text-align:right;"> 195 </td> <td style="text-align:right;"> 200 </td> </tr> </tbody> </table>

]

.pull-right[

<img
src="anova_files/figure-html/unnamed-chunk-12-1.svg" width="100%" style="display: block; margin: auto;" />

]

---

`\(H_0:\)` There is no difference in mean systolic blood pressure among the treatment groups

`\(H_1:\)` Not all treatment group means are equal

Overall mean `\(\bar{\bar{x}} = 179\)`
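The `\(F\)`-ratio for these data can be computed from first principles. A minimal sketch (Python used purely for illustration; variable names are my own):

```python
# One-way ANOVA "by hand" on the blood-pressure data from the table above.
groups = {
    "Control":       [163, 178, 180, 181, 185],
    "Diet Only":     [166, 173, 188, 190, 193],
    "Drug Only":     [161, 171, 178, 183, 195],
    "Diet and Drug": [153, 168, 176, 198, 200],
}

k = len(groups)
n_T = sum(len(g) for g in groups.values())
grand_mean = sum(x for g in groups.values() for x in g) / n_T

# SSTR: between-group (treatment) sum of squares
SSTR = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups.values())
# SSE: within-group (error) sum of squares
SSE = sum((x - sum(g) / len(g)) ** 2 for g in groups.values() for x in g)

MSTR = SSTR / (k - 1)
MSE = SSE / (n_T - k)
F = MSTR / MSE
print(round(grand_mean, 1), round(SSTR, 1), round(SSE, 1), round(F, 3))
```

The sketch reproduces the grand mean of 179 and a small `\(F\)`-ratio, foreshadowing a failure to reject `\(H_0\)`.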
---

Mean for Control group `\(\bar{x}_{Control} = 177.4\)`

Mean for Diet only group `\(\bar{x}_{Diet} = 182\)`

Mean for Drug only group `\(\bar{x}_{Drug} = 177.6\)`

Mean for Diet and Drug group `\(\bar{x}_{Diet+Drug} = 179\)`

`\(SST = 3170\)`

`\(SSTR = 67.6\)`

`\(SSE = 3170 - 67.6 = 3102.4\)`

`\(MSTR = \dfrac{SSTR}{k - 1} = \dfrac{67.6}{4-1} = \dfrac{67.6}{3} = 22.5\)`

`\(MSE = \dfrac{SSE}{n_T - k} = \dfrac{3102.4}{20 - 4} = \dfrac{3102.4}{16} = 193.9\)`

`\(F = \dfrac{MSTR}{MSE} = \dfrac{22.5}{193.9} = 0.116\)`

The `\(p\)`-value is `\(0.949\)` so we are unable to reject the null hypothesis; these data suggest that there is no difference in mean systolic blood pressure among the treatment groups.

---

### Does Pre-surgical fitness influence recovery times?

A sample of 24 males aged 18-30 underwent corrective knee surgery. We are investigating the relationship between prior fitness level and the number of days needed for successful physical therapy.

| Below Average | Average | Above Average |
| :-: | :-: | :-: |
| 29 | 30 | 26 |
| 42 | 35 | 32 |
| 38 | 39 | 21 |
| 40 | 28 | 20 |
| 43 | 31 | 23 |
| 40 | 31 | 22 |
| 30 | 29 | -- |
| 42 | 35 | -- |
| -- | 29 | -- |
| -- | 33 | -- |

> What do you conclude?

---

class: inverse, center, middle

# .fancy[.heat[ Two-way ANOVA (aka Factorial ANOVA) ]]

---

### Examining the Influence of Two Independent Variables

We may have more than one factor we want to consider

Let us assume we have TWO factors (i.e., independent variables)

For example, say we are interested in looking at how first-grade students learn under varying conditions of temperature and humidity. So we have an outcome (Quiz Score) and two independent variables: (1) Humidity, and (2) Temperature

How might we test (1) whether Humidity and/or Temperature influence learning as measured by a quiz score, and (2) whether the effects of one independent variable are constant across the values of the other independent variable?
For example, does high humidity have the same impact on learning when temperature is at its highest as when temperature is at its lowest?

---

### An Example: Temperature, Humidity, and Learning

| | `\(70^{\circ}\)` | `\(80^{\circ}\)` | `\(90^{\circ}\)` | |
| :-- | :-- | :-- | :-- | :-- |
| Low Humidity | `\(\bar{x} = 85\)` | `\(\bar{x} = 80\)` | `\(\bar{x} = 75\)` | `\(\bar{x}_{low} = 80\)` |
| High Humidity | `\(\bar{x} = 75\)` | `\(\bar{x} = 70\)` | `\(\bar{x} = 65\)` | `\(\bar{x}_{high} = 70\)` |
| | `\(\bar{x}_{70}=80\)` | `\(\bar{x}_{80} = 75\)` | `\(\bar{x}_{90}=70\)` | |

* _Main Effect of Factor A (Humidity)_: Difference between means for high and low humidity
* _Main Effect of Factor B (Temperature)_: Difference between means for `\(70^{\circ}\)`, `\(80^{\circ}\)`, and `\(90^{\circ}\)` temperature
* Null Hypothesis for testing effects of Factor A: `\(H_0: \mu_{A1} = \mu_{A2}\)`
* Null Hypothesis for testing effects of Factor B: `\(H_0: \mu_{B1} = \mu_{B2} = \mu_{B3}\)`
* Note how the mean score drops as Temperature increases ... by exactly `\(5\)`
* Note how the mean score drops as Humidity rises
* At `\(70^{\circ}\)` there is a difference of `\(10\)` between Low/High humidity
* At `\(80^{\circ}\)` there is a difference of `\(10\)` between Low/High humidity
* At `\(90^{\circ}\)` there is a difference of `\(10\)` between Low/High humidity

---

### A Tweak ...
| | `\(70^{\circ}\)` | `\(80^{\circ}\)` | `\(90^{\circ}\)` | |
| :-- | :-- | :-- | :-- | :-- |
| Low Humidity | `\(\bar{x} = 80\)` | `\(\bar{x} = 80\)` | `\(\bar{x} = 80\)` | `\(\bar{x}_{low} = 80\)` |
| High Humidity | `\(\bar{x} = 80\)` | `\(\bar{x} = 70\)` | `\(\bar{x} = 60\)` | `\(\bar{x}_{high} = 70\)` |
| | `\(\bar{x}_{70}=80\)` | `\(\bar{x}_{80} = 75\)` | `\(\bar{x}_{90}=70\)` | |

* Note how the mean score drops as Temperature increases
* Note how the mean score drops as Humidity rises
* At `\(70^{\circ}\)` there is a difference of `\(0\)` between Low/High humidity
* At `\(80^{\circ}\)` there is a difference of `\(10\)` between Low/High humidity
* At `\(90^{\circ}\)` there is a difference of `\(20\)` between Low/High humidity

> At Low humidity, raising the temperature has no impact

> At High humidity, raising the temperature lowers the score, so the Low/High humidity gap widens as temperature rises

---

.pull-left[

#### No Interaction

<img src="anova_files/figure-html/unnamed-chunk-15-1.svg" width="100%" style="display: block; margin: auto;" />

]

.pull-right[

#### An Interaction

<img src="anova_files/figure-html/unnamed-chunk-16-1.svg" width="100%" style="display: block; margin: auto;" />

]

> **An interaction:** When the effect of a particular value of one factor (the first independent variable) depends upon the value of the other factor (the second independent variable)

---

### Hypotheses for Two-Way (aka Factorial) ANOVA

There is no main effect of Diet. That is,

* `\(H_0\)`: `\(\mu_{noDiet} = \mu_{yesDiet}\)`
* `\(H_1\)`: `\(\mu_{noDiet} \neq \mu_{yesDiet}\)`

There is no main effect of Drug. That is,

* `\(H_0\)`: `\(\mu_{noDrug} = \mu_{yesDrug}\)`
* `\(H_1\)`: `\(\mu_{noDrug} \neq \mu_{yesDrug}\)`

There is no interaction effect between Diet and Drug.
That is,

* `\(H_0\)`: `\(\mu_{yesDrug.yesDiet} - \mu_{noDrug.yesDiet} = \mu_{yesDrug.noDiet} - \mu_{noDrug.noDiet}\)` (the effect of Drug is the same with and without Diet)
* `\(H_1\)`: `\(\mu_{yesDrug.yesDiet} - \mu_{noDrug.yesDiet} \neq \mu_{yesDrug.noDiet} - \mu_{noDrug.noDiet}\)` (the effect of Drug depends on Diet)

---

### The Drug and Diet Example Modified

<table> <thead> <tr> <th style="text-align:right;"> No Diet, No Drug </th> <th style="text-align:right;"> Diet, No Drug </th> <th style="text-align:right;"> No Diet, Drug </th> <th style="text-align:right;"> Both Diet & Drug </th> </tr> </thead> <tbody> <tr> <td style="text-align:right;"> 185 </td> <td style="text-align:right;"> 188 </td> <td style="text-align:right;"> 171 </td> <td style="text-align:right;"> 153 </td> </tr> <tr> <td style="text-align:right;"> 190 </td> <td style="text-align:right;"> 183 </td> <td style="text-align:right;"> 176 </td> <td style="text-align:right;"> 163 </td> </tr> <tr> <td style="text-align:right;"> 195 </td> <td style="text-align:right;"> 198 </td> <td style="text-align:right;"> 181 </td> <td style="text-align:right;"> 173 </td> </tr> <tr> <td style="text-align:right;"> 200 </td> <td style="text-align:right;"> 178 </td> <td style="text-align:right;"> 166 </td> <td style="text-align:right;"> 178 </td> </tr> <tr> <td style="text-align:right;"> 180 </td> <td style="text-align:right;"> 193 </td> <td style="text-align:right;"> 161 </td> <td style="text-align:right;"> 168 </td> </tr> </tbody> </table>

> Does Diet have a direct effect? What about Drug? Is there an interaction effect of Diet and Drug?
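One way to answer these questions is to partition the sums of squares for this balanced 2 `\(\times\)` 2 design by hand. A minimal sketch, treating the four columns above as the four Diet `\(\times\)` Drug cells (Python for illustration; function and variable names are my own):

```python
# Two-way (2x2) ANOVA sums of squares for the balanced Diet x Drug data above.
cells = {
    ("noDiet",  "noDrug"):  [185, 190, 195, 200, 180],
    ("yesDiet", "noDrug"):  [188, 183, 198, 178, 193],
    ("noDiet",  "yesDrug"): [171, 176, 181, 166, 161],
    ("yesDiet", "yesDrug"): [153, 163, 173, 178, 168],
}

def mean(xs):
    return sum(xs) / len(xs)

all_scores = [x for c in cells.values() for x in c]
grand = mean(all_scores)
n_cell = 5  # balanced design: 5 observations per cell

# Marginal means for each level of each factor
diet_means = {d: mean([x for (dd, _), c in cells.items() if dd == d for x in c])
              for d in ("noDiet", "yesDiet")}
drug_means = {g: mean([x for (_, gg), c in cells.items() if gg == g for x in c])
              for g in ("noDrug", "yesDrug")}

# Main-effect SS, cell SS, interaction SS, and within-cell (error) SS
SS_diet = sum(2 * n_cell * (m - grand) ** 2 for m in diet_means.values())
SS_drug = sum(2 * n_cell * (m - grand) ** 2 for m in drug_means.values())
SS_cells = sum(n_cell * (mean(c) - grand) ** 2 for c in cells.values())
SS_inter = SS_cells - SS_diet - SS_drug
SS_error = sum((x - mean(c)) ** 2 for c in cells.values() for x in c)

MSE = SS_error / (len(all_scores) - 4)  # 16 error df
F_diet, F_drug, F_inter = SS_diet / MSE, SS_drug / MSE, SS_inter / MSE
print(round(SS_diet, 1), round(SS_drug, 1), round(SS_inter, 1), round(SS_error, 1))
# prints: 45.0 2000.0 5.0 1120.0
```

Under this partition the Drug sum of squares dwarfs those for Diet and the interaction, suggesting the Drug main effect is doing the work in these data.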
--- ### Pedagogy, Subject, and Learning <table> <thead> <tr> <th style="text-align:right;"> Statistics/Standard </th> <th style="text-align:right;"> English/Standard </th> <th style="text-align:right;"> History/Standard </th> <th style="text-align:right;"> Statistics/Computer </th> <th style="text-align:right;"> English/Computer </th> <th style="text-align:right;"> History/Computer </th> </tr> </thead> <tbody> <tr> <td style="text-align:right;"> 44 </td> <td style="text-align:right;"> 47 </td> <td style="text-align:right;"> 46 </td> <td style="text-align:right;"> 53 </td> <td style="text-align:right;"> 13 </td> <td style="text-align:right;"> 45 </td> </tr> <tr> <td style="text-align:right;"> 18 </td> <td style="text-align:right;"> 37 </td> <td style="text-align:right;"> 21 </td> <td style="text-align:right;"> 42 </td> <td style="text-align:right;"> 10 </td> <td style="text-align:right;"> 36 </td> </tr> <tr> <td style="text-align:right;"> 48 </td> <td style="text-align:right;"> 42 </td> <td style="text-align:right;"> 40 </td> <td style="text-align:right;"> 49 </td> <td style="text-align:right;"> 16 </td> <td style="text-align:right;"> 41 </td> </tr> <tr> <td style="text-align:right;"> 32 </td> <td style="text-align:right;"> 42 </td> <td style="text-align:right;"> 30 </td> <td style="text-align:right;"> 51 </td> <td style="text-align:right;"> 11 </td> <td style="text-align:right;"> 35 </td> </tr> <tr> <td style="text-align:right;"> 35 </td> <td style="text-align:right;"> 39 </td> <td style="text-align:right;"> 29 </td> <td style="text-align:right;"> 47 </td> <td style="text-align:right;"> 16 </td> <td style="text-align:right;"> 38 </td> </tr> <tr> <td style="text-align:right;"> 27 </td> <td style="text-align:right;"> 33 </td> <td style="text-align:right;"> 20 </td> <td style="text-align:right;"> 34 </td> <td style="text-align:right;"> 6 </td> <td style="text-align:right;"> 33 </td> </tr> </tbody> </table>
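A first step in analyzing a factorial design like this is to compute the cell means and look for a possible interaction. A minimal sketch (Python for illustration; variable names are my own):

```python
# Cell means for the Subject (3 levels) x Pedagogy (2 levels) data above;
# if the pattern across subjects differs by pedagogy, that suggests an interaction.
scores = {
    ("Statistics", "Standard"): [44, 18, 48, 32, 35, 27],
    ("English",    "Standard"): [47, 37, 42, 42, 39, 33],
    ("History",    "Standard"): [46, 21, 40, 30, 29, 20],
    ("Statistics", "Computer"): [53, 42, 49, 51, 47, 34],
    ("English",    "Computer"): [13, 10, 16, 11, 16, 6],
    ("History",    "Computer"): [45, 36, 41, 35, 38, 33],
}

cell_means = {key: sum(v) / len(v) for key, v in scores.items()}
for (subject, pedagogy), m in cell_means.items():
    print(f"{subject}/{pedagogy}: {m:.0f}")
# English means collapse under computer-based pedagogy while Statistics
# means improve -- the effect of Pedagogy appears to depend on Subject.
```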