# 9 HT3 Hypothesis Testing: t-tests

For more information on t-tests, see here.

```
library(tidyverse) # our main collection of functions
library(tidylog) # prints additional output from the tidyverse commands - load after tidyverse
library(haven) # allows us to load .dta (Stata specific) files
library(here) # needed to navigate to folders and files in a project
library(skimr) # allows us to get an overview over the data quickly
```

## 9.2 Check the data - what is it about?

```
df %>%
skim()
```

Name | Piped data |
---|---|
Number of rows | 2102 |
Number of columns | 94 |
_______________________ | |
Column type frequency: | |
character | 90 |
numeric | 4 |
________________________ | |
Group variables | None |

**Variable type: character**

skim_variable | n_missing | complete_rate | min | max | empty | n_unique | whitespace |
---|---|---|---|---|---|---|---|
region | 0 | 1.00 | 1 | 1 | 0 | 4 | 0 |
gender | 0 | 1.00 | 1 | 1 | 0 | 2 | 0 |
relguide | 15 | 0.99 | 1 | 1 | 0 | 4 | 0 |
pray | 11 | 0.99 | 1 | 1 | 0 | 6 | 0 |
relattend | 5 | 1.00 | 1 | 1 | 0 | 5 | 0 |
denom | 7 | 1.00 | 1 | 1 | 0 | 5 | 0 |
orientself | 39 | 0.98 | 1 | 1 | 0 | 3 | 0 |
orientknow | 42 | 0.98 | 1 | 1 | 0 | 2 | 0 |
age | 38 | 0.98 | 2 | 2 | 0 | 75 | 0 |
marstat | 14 | 0.99 | 1 | 1 | 0 | 6 | 0 |
education | 10 | 1.00 | 1 | 2 | 0 | 18 | 0 |
union | 11 | 0.99 | 1 | 1 | 0 | 2 | 0 |
income | 806 | 0.62 | 1 | 2 | 0 | 25 | 0 |
class | 677 | 0.68 | 1 | 1 | 0 | 3 | 0 |
ethnic | 13 | 0.99 | 2 | 3 | 0 | 7 | 0 |
gunown | 45 | 0.98 | 1 | 1 | 0 | 2 | 0 |
efficacy1a | 1058 | 0.50 | 1 | 1 | 0 | 5 | 0 |
efficacy1b | 1061 | 0.50 | 1 | 1 | 0 | 5 | 0 |
efficacy1c | 1063 | 0.49 | 1 | 1 | 0 | 5 | 0 |
efficacy1d | 1062 | 0.49 | 1 | 1 | 0 | 5 | 0 |
efficacy2a | 1045 | 0.50 | 1 | 1 | 0 | 5 | 0 |
efficacy2b | 1048 | 0.50 | 1 | 1 | 0 | 5 | 0 |
efficacy2c | 1051 | 0.50 | 1 | 1 | 0 | 5 | 0 |
efficacy2d | 1050 | 0.50 | 1 | 1 | 0 | 5 | 0 |
ideology | 617 | 0.71 | 1 | 1 | 0 | 7 | 0 |
partyid3 | 29 | 0.99 | 1 | 1 | 0 | 3 | 0 |
partystrength | 824 | 0.61 | 1 | 1 | 0 | 2 | 0 |
partylean | 1313 | 0.38 | 1 | 1 | 0 | 3 | 0 |
partyid7 | 48 | 0.98 | 1 | 1 | 0 | 7 | 0 |
taxes | 18 | 0.99 | 1 | 1 | 0 | 7 | 0 |
milspend | 24 | 0.99 | 1 | 1 | 0 | 7 | 0 |
otherspend | 41 | 0.98 | 1 | 1 | 0 | 7 | 0 |
socialsec | 23 | 0.99 | 1 | 1 | 0 | 7 | 0 |
gradtax | 44 | 0.98 | 1 | 1 | 0 | 3 | 0 |
servespend | 202 | 0.90 | 1 | 1 | 0 | 7 | 0 |
biggov | 27 | 0.99 | 1 | 1 | 0 | 2 | 0 |
govmarket | 45 | 0.98 | 1 | 1 | 0 | 2 | 0 |
govsize | 42 | 0.98 | 1 | 1 | 0 | 2 | 0 |
cappun | 169 | 0.92 | 1 | 1 | 0 | 4 | 0 |
gunbuy | 23 | 0.99 | 1 | 1 | 0 | 3 | 0 |
gaymarriage | 55 | 0.97 | 1 | 1 | 0 | 4 | 0 |
immigration | 5 | 1.00 | 1 | 1 | 0 | 3 | 0 |
immjobs | 20 | 0.99 | 1 | 1 | 0 | 4 | 0 |
abortion | 1065 | 0.49 | 1 | 1 | 0 | 5 | 0 |
equalopp | 1 | 1.00 | 1 | 1 | 0 | 5 | 0 |
isolationism | 59 | 0.97 | 1 | 1 | 0 | 2 | 0 |
iraq | 61 | 0.97 | 1 | 1 | 0 | 2 | 0 |
torture | 34 | 0.98 | 1 | 1 | 0 | 7 | 0 |
thermrush | 512 | 0.76 | 1 | 3 | 0 | 27 | 0 |
thermdem | 52 | 0.98 | 1 | 3 | 0 | 30 | 0 |
thermgop | 59 | 0.97 | 1 | 3 | 0 | 27 | 0 |
thermbush | 9 | 1.00 | 1 | 3 | 0 | 27 | 0 |
thermobama | 12 | 0.99 | 1 | 3 | 0 | 22 | 0 |
thermmccain | 9 | 1.00 | 1 | 3 | 0 | 27 | 0 |
thermbiden | 239 | 0.89 | 1 | 3 | 0 | 24 | 0 |
thermpalin | 120 | 0.94 | 1 | 3 | 0 | 28 | 0 |
thermclinton | 17 | 0.99 | 1 | 3 | 0 | 22 | 0 |
thermhispanic | 52 | 0.98 | 1 | 3 | 0 | 25 | 0 |
thermfund | 216 | 0.90 | 1 | 3 | 0 | 28 | 0 |
thermcatholic | 56 | 0.97 | 1 | 3 | 0 | 18 | 0 |
thermfem | 130 | 0.94 | 1 | 3 | 0 | 25 | 0 |
thermfed | 42 | 0.98 | 1 | 3 | 0 | 26 | 0 |
thermjews | 102 | 0.95 | 1 | 3 | 0 | 20 | 0 |
thermliberal | 147 | 0.93 | 1 | 3 | 0 | 24 | 0 |
thermmiddle | 37 | 0.98 | 1 | 3 | 0 | 22 | 0 |
thermunion | 81 | 0.96 | 1 | 3 | 0 | 25 | 0 |
thermpoor | 46 | 0.98 | 1 | 3 | 0 | 21 | 0 |
thermmilitary | 25 | 0.99 | 1 | 3 | 0 | 25 | 0 |
thermbig | 39 | 0.98 | 1 | 3 | 0 | 26 | 0 |
thermwelfare | 56 | 0.97 | 1 | 3 | 0 | 25 | 0 |
thermconserv | 112 | 0.95 | 1 | 3 | 0 | 25 | 0 |
thermworking | 16 | 0.99 | 1 | 3 | 0 | 22 | 0 |
thermenviron | 91 | 0.96 | 1 | 3 | 0 | 26 | 0 |
thermscotus | 63 | 0.97 | 1 | 3 | 0 | 26 | 0 |
thermgay | 61 | 0.97 | 1 | 3 | 0 | 26 | 0 |
thermasian | 91 | 0.96 | 1 | 3 | 0 | 19 | 0 |
thermcongress | 48 | 0.98 | 1 | 3 | 0 | 23 | 0 |
thermblack | 47 | 0.98 | 1 | 3 | 0 | 21 | 0 |
thermsouth | 99 | 0.95 | 1 | 3 | 0 | 24 | 0 |
thermimmigrant | 54 | 0.97 | 1 | 3 | 0 | 25 | 0 |
thermrich | 59 | 0.97 | 1 | 3 | 0 | 23 | 0 |
thermwhite | 50 | 0.98 | 1 | 3 | 0 | 18 | 0 |
thermisrael | 126 | 0.94 | 1 | 3 | 0 | 22 | 0 |
thermmuslim | 130 | 0.94 | 1 | 3 | 0 | 21 | 0 |
thermhindu | 260 | 0.88 | 1 | 3 | 0 | 24 | 0 |
thermchristian | 33 | 0.98 | 1 | 3 | 0 | 23 | 0 |
thermatheist | 133 | 0.94 | 1 | 3 | 0 | 24 | 0 |
turnout | 0 | 1.00 | 1 | 1 | 0 | 2 | 0 |
presvote | 538 | 0.74 | 1 | 1 | 0 | 3 | 0 |
housevote | 791 | 0.62 | 1 | 1 | 0 | 4 | 0 |

**Variable type: numeric**

skim_variable | n_missing | complete_rate | mean | sd | p0 | p25 | p50 | p75 | p100 | hist |
---|---|---|---|---|---|---|---|---|---|---|
CASE | 0 | 1 | 1165.05 | 672.43 | 1.00 | 578.25 | 1166.50 | 1747.75 | 2323.0 | ▇▇▇▇▇ |
weight | 0 | 1 | 1.00 | 0.75 | 0.17 | 0.42 | 0.74 | 1.31 | 3.7 | ▇▃▂▁▁ |
state | 0 | 1 | 25.35 | 15.28 | 1.00 | 10.00 | 25.00 | 41.00 | 50.0 | ▇▃▃▆▇ |
children | 5 | 1 | 0.78 | 1.21 | 0.00 | 0.00 | 0.00 | 1.00 | 11.0 | ▇▁▁▁▁ |

## 9.3 Perform a one-sample t-test: test whether the mean of thermmilitary is higher than 50

```
df %>%
pull(thermmilitary) %>%
t.test(mu = 50)
```

```
One Sample t-test
data: .
t = 65.007, df = 2076, p-value < 2.2e-16
alternative hypothesis: true mean is not equal to 50
95 percent confidence interval:
79.08470 80.89411
sample estimates:
mean of x
79.98941
```

We reject the null hypothesis that the mean rating of the military equals 50 (the value we take to mean no particular opinion) at the 5% level. The sample mean is about 80, so yes, people on average have a good opinion of the military (well above 50).
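To see what `t.test()` computes here, the statistic can be rebuilt by hand. The following is a minimal sketch on simulated ratings (the vector `ratings` is made up and is not the survey data). It also shows the `alternative = "greater"` argument, which is the one-sided version of the question "is the mean higher than 50?"; the call used above is the default two-sided test.

```r
# Simulated stand-in for a 0-100 feeling thermometer (NOT the survey data)
set.seed(1)
ratings <- c(rnorm(200, mean = 80, sd = 20), NA, NA)

x <- ratings[!is.na(ratings)]  # t.test() drops missing values as well
n <- length(x)

# t statistic: distance of the sample mean from mu0 in standard-error units
mu0 <- 50
t_manual <- (mean(x) - mu0) / (sd(x) / sqrt(n))
p_manual <- 2 * pt(-abs(t_manual), df = n - 1)  # two-sided p-value

res <- t.test(ratings, mu = mu0)
all.equal(unname(res$statistic), t_manual)  # same t statistic
all.equal(res$p.value, p_manual)            # same p-value

# One-sided version: is the mean *higher* than 50?
res_greater <- t.test(ratings, mu = mu0, alternative = "greater")
res_greater$p.value
```

With a positive t statistic, the one-sided p-value is half the two-sided one, so the conclusion for the military thermometer would not change.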

### 9.3.1 Plotting t-tests

As a little extra, I will now also show you how to plot the result of a t-test. We first need to install an additional package:

`install.packages("webr")`

With this new package, we can simply use the `plot` function:

The blue dot here shows us where the t-statistic value of our sample is, compared to our assumed distribution.
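For readers who want to see how such a picture is put together, here is a rough base R sketch on simulated data. It is not how `webr` draws its plot, only an illustration of the same idea: the t distribution assumed under the null, the two-sided 5% rejection boundaries, and a blue dot at the observed statistic. The vector `ratings` is made up.

```r
set.seed(3)
ratings <- rnorm(30, mean = 52, sd = 10)  # simulated data, NOT the survey data
res <- t.test(ratings, mu = 50)

t_obs  <- unname(res$statistic)  # observed t statistic
dof    <- unname(res$parameter)  # degrees of freedom (n - 1)
t_crit <- qt(0.975, df = dof)    # two-sided 5% critical value

# Density of the t distribution under H0, wide enough to show t_obs
lim <- max(4, abs(t_obs) + 1)
curve(dt(x, df = dof), from = -lim, to = lim,
      xlab = "t", ylab = "density", main = "t distribution under H0")
abline(v = c(-t_crit, t_crit), lty = 2)                     # rejection boundaries
points(t_obs, dt(t_obs, df = dof), pch = 19, col = "blue")  # observed statistic
```

If the dot falls outside the dashed lines, the two-sided test rejects the null at the 5% level.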

## 9.4 Perform the same test, checking that it is higher than 80, what does it say?

```
df %>%
pull(thermmilitary) %>%
t.test(mu = 80)
```

```
One Sample t-test
data: .
t = -0.02296, df = 2076, p-value = 0.9817
alternative hypothesis: true mean is not equal to 80
95 percent confidence interval:
79.08470 80.89411
sample estimates:
mean of x
79.98941
```

Now we fail to reject the null \(H_0: \mu=80\): the sample mean of 79.99 is almost exactly 80, and 80 lies inside the 95% confidence interval.
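The link between the confidence interval and the test can be checked directly: a two-sided test at the 5% level rejects \(H_0: \mu=\mu_0\) exactly when \(\mu_0\) lies outside the 95% confidence interval. A small sketch on simulated data, where the helper functions `rejects()` and `inside_ci()` are hypothetical names introduced only for this illustration:

```r
set.seed(2)
x <- rnorm(500, mean = 80, sd = 20)  # simulated ratings, NOT the survey data

ci <- t.test(x)$conf.int             # 95% confidence interval for the mean

# Hypothetical helpers: does the 5% test reject mu0, and is mu0 inside the CI?
rejects   <- function(mu0) t.test(x, mu = mu0)$p.value < 0.05
inside_ci <- function(mu0) mu0 >= ci[1] && mu0 <= ci[2]

candidates <- c(50, 75, mean(x), 85)
data.frame(
  mu0       = candidates,
  inside_ci = sapply(candidates, inside_ci),
  rejected  = sapply(candidates, rejects)
)
```

In every row, `rejected` is the opposite of `inside_ci`, which is why the confidence interval printed by `t.test()` is the same for `mu = 50` and `mu = 80`.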

Again we plot it:

## 9.5 Perform a two-sample t-test: Thermfed and Gunown

Is there a difference between the two groups? Admittedly, this is a bit easier without using the pipe logic:

`t.test(df$thermfed, df$gunown, paired = FALSE)`

We could also use the exposition operator `%$%` from the `magrittr` package (but this is rarely used):

```
Attaching package: 'magrittr'
The following object is masked from 'package:purrr':
set_names
The following object is masked from 'package:tidyr':
extract
```

```
df %$%
t.test(thermfed, gunown, paired = FALSE)
```

```
Welch Two Sample t-test
data: thermfed and gunown
t = 93.545, df = 2084.1, p-value < 2.2e-16
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
47.16457 49.18446
sample estimates:
mean of x mean of y
52.007767 3.833252
```

But we can still do it with our known tools:

`summarise: now one row and one column, ungrouped`

```
[[1]]
Welch Two Sample t-test
data: thermfed and gunown
t = 93.545, df = 2084.1, p-value < 2.2e-16
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
47.16457 49.18446
sample estimates:
mean of x mean of y
52.007767 3.833252
```

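The test rejects the null of equal means. Note, however, that as written it compares the mean of `thermfed` (about 52) with the mean of the numeric `gunown` codes (about 3.8). It therefore does not by itself show whether people who own guns have a lower opinion of the federal government; for that, `thermfed` has to be compared between the two `gunown` groups, for example with the formula interface `t.test(thermfed ~ gunown)`.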
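The difference between the two ways of calling `t.test()` is easiest to see on a small simulated example. The data frame `toy`, its columns, and the coding 1 = yes / 5 = no are assumptions made for this sketch and are not taken from the survey.

```r
set.seed(5)
# Simulated data, NOT the survey data: a rating and a two-level group code
toy <- data.frame(
  rating = c(rnorm(120, mean = 48, sd = 20), rnorm(150, mean = 55, sd = 20)),
  owner  = rep(c(1, 5), times = c(120, 150))  # hypothetical coding: 1 = yes, 5 = no
)

# Two-vector call: compares mean(rating) with mean(owner), i.e. with the codes
by_vectors <- t.test(toy$rating, toy$owner)

# Formula call: compares mean(rating) between the two owner groups
by_groups <- t.test(rating ~ owner, data = toy)

by_vectors$estimate  # mean of rating and mean of the codes
by_groups$estimate   # mean rating in group 1 and in group 5
```

With labelled survey data, the grouping column may first need to be restricted to its two substantive values and converted to a factor before the formula call gives a meaningful comparison.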

## 9.6 Perform a two-sample test: Democratic and Republican Party

Test whether the mean support for the Democratic party is the same as the mean support for the Republican party.

`summarise: now one row and one column, ungrouped`

```
[[1]]
Welch Two Sample t-test
data: thermdem and thermgop
t = 22.029, df = 4090.9, p-value < 2.2e-16
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
16.32829 19.51864
sample estimates:
mean of x mean of y
62.76634 44.84288
```

We reject the null of no difference in support. Average support is higher for the Democrats.
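The `[[1]]` at the top of the output comes from storing the test result in a list column. A minimal sketch of that pattern on simulated data follows; the tibble `toy` and its columns `therm_a` and `therm_b` are made up for illustration, and the call mirrors the `summarise()` and `pull()` combination used in this chapter.

```r
library(dplyr)

set.seed(4)
# Simulated stand-in for two thermometer columns (NOT the survey data)
toy <- tibble(
  therm_a = rnorm(100, mean = 60, sd = 20),
  therm_b = rnorm(100, mean = 45, sd = 20)
)

# t.test() returns an "htest" object, not a single value, so it is wrapped in
# list() to fit into one cell of the summarised data frame
result <- toy %>%
  summarise(ttest = list(t.test(x = therm_a, y = therm_b, paired = FALSE))) %>%
  pull(ttest)

class(result)         # a list of length one, hence the [[1]] in the printout
result[[1]]$estimate  # the two sample means
```

Individual pieces such as `result[[1]]$p.value` or `result[[1]]$conf.int` can be extracted in the same way.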

## 9.7 Now restrict this test to Catholics only, what do you observe?

Find out where we have the information regarding faith:

`lapply(df, function(x) attributes(x)$label)`

```
$CASE
[1] "Case ID"
$weight
NULL
$state
NULL
$region
[1] "Region of country"
$gender
[1] "Gender"
$relguide
[1] "Does religion provide guidance in life?"
$pray
[1] "How often does respondent pray?"
$relattend
[1] "How often does respondent attend religious services?"
$denom
[1] "Religious affiliation"
$orientself
[1] "Sexual orientation"
$orientknow
[1] "Know gay, lesbian, bisexual family or friends?"
$age
[1] "Age"
$marstat
[1] "Marital status"
$children
[1] "Number of children under 18 in household"
$education
[1] "Highest grade of school or year of college completed"
$union
[1] "Anyone in household belong to a labor union?"
$income
[1] "Household income"
$class
[1] "Self-identification as working or middle class"
$ethnic
[1] "Racial or ethnic identification"
$gunown
[1] "Does respondent have a gun in his or her home or garage?"
$efficacy1a
[1] "Politics/govt too complicated to understand"
$efficacy1b
[1] "Good understanding of political issues"
$efficacy1c
[1] "Public officials don't care what people like me think"
$efficacy1d
[1] "Have no say about what govt does"
$efficacy2a
[1] "Politics/govt too complicated to understand"
$efficacy2b
[1] "Good understanding of political issues"
$efficacy2c
[1] "How much do public officials care what people like me think"
$efficacy2d
[1] "How much can people like you affect what the government does"
$ideology
[1] "Liberal-conservative self-placement"
$partyid3
[1] "Party self-identification - 3 point scale"
$partystrength
[1] "Strength of party identification"
$partylean
[1] "Party leanings"
$partyid7
[1] "Party self-identification - 7 point scale)"
$taxes
[1] "Reduce deficit by raising taxes"
$milspend
[1] "Reduce deficit by cutting military spending"
$otherspend
[1] "Reduce deficit by cutting nonmilitary spending"
$socialsec
[1] "Invest social security in stocks and bonds"
$gradtax
[1] "Statement best agrees with respondent about graduated tax"
$servespend
[1] "Position on services vs. spending"
$biggov
[1] "Govt bigger because too involved OR bigger problems?"
$govmarket
[1] "Need strong govt for complex problems OR free market?"
$govsize
[1] "Less govt better OR more that govt should be doing"
$cappun
[1] "Favor/oppose death penalty"
$gunbuy
[1] "Should fed govt make it more difficult to buy a gun?"
$gaymarriage
[1] "Position on gay marriage"
$immigration
[1] "How important is controlling illegal immigration?"
$immjobs
[1] "How likely that immigration will take away jobs?"
$abortion
[1] "Abortion - self placement"
$equalopp
[1] "Society should make sure everyone has equal opportunity"
$isolationism
[1] "This country would be better off if we just stayed home"
$iraq
[1] "Was Iraq war worth the cost"
$torture
[1] "Favor-oppose torture for suspected terrorists"
$thermrush
[1] "Feeling Thermometer: Rush Limbaugh"
$thermdem
[1] "Feeling Thermometer: Democratic party"
$thermgop
[1] "Feeling Thermometer: Republican party"
$thermbush
[1] "Feeling Thermometer: George W. Bush"
$thermobama
[1] "Feeling Thermometer: Barack Obama"
$thermmccain
[1] "Feeling Thermometer: John McCain"
$thermbiden
[1] "Feeling Thermometer: Joe Biden"
$thermpalin
[1] "Feeling Thermometer: Sarah Palin"
$thermclinton
[1] "Feeling Thermometer: Hillary Clinton"
$thermhispanic
[1] "Feeling Thermometer: Hispanics"
$thermfund
[1] "Feeling Thermometer: Christian fundamentalists"
$thermcatholic
[1] "Feeling Thermometer: Catholics"
$thermfem
[1] "Feeling Thermometer: Feminists"
$thermfed
[1] "Feeling Thermometer: Federal Government in Washington"
$thermjews
[1] "Feeling Thermometer: Jews"
$thermliberal
[1] "Feeling Thermometer: Liberals"
$thermmiddle
[1] "Feeling Thermometer: Middle class people"
$thermunion
[1] "Feeling Thermometer: Labor unions"
$thermpoor
[1] "Feeling Thermometer: Poor people"
$thermmilitary
[1] "Feeling Thermometer: The military"
$thermbig
[1] "Feeling Thermometer: Big business"
$thermwelfare
[1] "Feeling Thermometer: People on welfare"
$thermconserv
[1] "Feeling Thermometer: Conservatives"
$thermworking
[1] "Feeling Thermometer: Working class people"
$thermenviron
[1] "Feeling Thermometer: Environmentalists"
$thermscotus
[1] "Feeling Thermometer: The Supreme Court of the United States"
$thermgay
[1] "Feeling Thermometer: Gays and lesbians (homosexuals)"
$thermasian
[1] "Feeling Thermometer: Asian Americans"
$thermcongress
[1] "Feeling Thermometer: Congress"
$thermblack
[1] "Feeling Thermometer: Blacks"
$thermsouth
[1] "Feeling Thermometer: Southerners"
$thermimmigrant
[1] "Feeling Thermometer: Illegal immigrants"
$thermrich
[1] "Feeling Thermometer: Rich people"
$thermwhite
[1] "Feeling Thermometer: White people"
$thermisrael
[1] "Feeling Thermometer: Israel"
$thermmuslim
[1] "Feeling Thermometer: Muslims"
$thermhindu
[1] "Feeling Thermometer: Hindus"
$thermchristian
[1] "Feeling Thermometer: Christians"
$thermatheist
[1] "Feeling Thermometer: Atheists"
$turnout
[1] "Did respondent vote in November 2008"
$presvote
[1] "Vote in 2008 presidential election"
$housevote
[1] "Party of respondent's vote for House"
```
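With this many variables it can help to search the labels instead of reading the whole list. The sketch below uses a toy data frame with hand-made `label` attributes, mimicking what `haven` attaches on import; the column names and labels in `toy` are invented for illustration.

```r
# Toy data frame with "label" attributes (column names and labels are made up)
toy <- data.frame(id = 1:3, faith = c(1, 2, 2), temp = c(50, 70, 85))
attr(toy$id, "label") <- "Case ID"
attr(toy$faith, "label") <- "Religious affiliation"
# temp deliberately has no label

# Same idea as the lapply() call above; unlabelled columns give NULL
labels <- lapply(toy, function(x) attr(x, "label"))

# Drop the NULLs and search the remaining labels for a keyword
labels <- unlist(labels)
labels[grepl("relig", labels, ignore.case = TRUE)]
```

The names of the returned vector are the variable names, so the match points straight to the column to use.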

So we see that the variable `denom` is “Religious affiliation.” Let’s check which numeric value represents Catholics:

`print_labels(df$denom)`

```
Labels:
value label
-9 Refused
-8 DK
-1 Not asked
1 Protestant
2 Catholic
3 Jewish
7 Other
9 None
```

So we know that Catholics are `denom == 2`. Now we can use this:

```
df %>%
filter(denom == 2) %>%
summarise(ttest = list(t.test(x = thermdem, y = thermgop, paired = FALSE))) %>%
pull(ttest)
```

`filter: removed 1,625 rows (77%), 477 rows remaining`

`summarise: now one row and one column, ungrouped`

```
[[1]]
Welch Two Sample t-test
data: thermdem and thermgop
t = 11.11, df = 929, p-value < 2.2e-16
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
14.78362 21.12716
sample estimates:
mean of x mean of y
65.21075 47.25536
```

We can again **reject the null of no difference in support**. Average support is higher for the Democrats. Looking at the individual means, Catholics rate both parties slightly higher than the full sample does (65.2 vs. 62.8 for the Democratic party and 47.3 vs. 44.8 for the Republican party).

## 9.8 Additional restrictions

We now add restrictions to owning a gun, attending church more than twice a month (relattend), and being white (ethnic). What do you find?

```
df %>%
filter(denom == 2 & gunown == 1 & ethnic == 50 & relattend == 1) %>%
summarise(ttest = list(t.test(x = thermdem, y = thermgop, paired = FALSE))) %>%
pull(ttest)
```

`filter: removed 2,089 rows (99%), 13 rows remaining`

`summarise: now one row and one column, ungrouped`

```
[[1]]
Welch Two Sample t-test
data: thermdem and thermgop
t = 1.1508, df = 20.02, p-value = 0.2634
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-9.479355 32.812688
sample estimates:
mean of x mean of y
56.66667 45.00000
```

We now fail to reject the null that mean support is the same for both parties. This is mainly because only 13 observations remain, which makes the estimate of the difference very imprecise (see the wide confidence interval, from about -9.5 to 32.8).
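How strongly the number of observations drives the width of the confidence interval can be illustrated with a short simulation. The means and standard deviation below are arbitrary choices, not estimates from the survey, and `ci_width()` is a hypothetical helper written for this sketch.

```r
set.seed(6)
# Simulated ratings, NOT the survey data: same true means and spread,
# only the number of observations differs
ci_width <- function(n) {
  a <- rnorm(n, mean = 57, sd = 25)
  b <- rnorm(n, mean = 45, sd = 25)
  ci <- t.test(a, b)$conf.int  # Welch two-sample 95% confidence interval
  unname(ci[2] - ci[1])
}

width_small <- ci_width(13)   # as few observations as in the restricted sample
width_large <- ci_width(477)  # as many as in the Catholics-only sample

c(n13 = width_small, n477 = width_large)
```

The true difference in means is the same in both runs, yet the interval based on 13 observations per group is several times wider, so a difference of that size can easily fail to reach significance.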