# 12 HT6 Regression: Interactions

```
library(tidyverse) # our main collection of functions
library(tidylog) # prints additional output from the tidyverse commands - load after tidyverse
library(haven) # allows us to load .dta (Stata specific) files
library(here) # needed to navigate to folders and files in a project
library(car) # needed for the linear hypothesis testing after a regression
library(margins) # to ease interpretation
```

## 12.2 Run a simple regression of thermimmigrant with gender

To run a simple regression, we again use the `lm()`

command for a linear regression (lm stands for linear regression). As with most datasets in this course, we can take the `gender`

variable just as it comes from Stata - where the variable is labelled and has a numeric value for its labels. Doing so gives us the correct model:

```
df %>%
lm(thermimmigrant ~ gender, data = .) %>%
summary
```

```
Call:
lm(formula = thermimmigrant ~ gender, data = .)
Residuals:
Min 1Q Median 3Q Max
-45.885 -15.885 4.115 17.679 57.679
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 42.3206 0.9067 46.677 < 2e-16 ***
gender 3.5647 1.2052 2.958 0.00314 **
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 27.03 on 2046 degrees of freedom
(54 observations deleted due to missingness)
Multiple R-squared: 0.004257, Adjusted R-squared: 0.003771
F-statistic: 8.748 on 1 and 2046 DF, p-value: 0.003136
```

But now we don’t really know what `0`

and `1`

correspond to! This is why I’d recommend always converting a categorical variable to a `factor`

using the `as_factor()`

command from the `haven`

package, which we have used to import Stata datasets each week.
Now if we run the same command, then we get the same coefficient but are certain that `0`

= Male and `1`

= Female. Indeed, naming the variable `female`

to start with would resolve that problem altogether.

```
Call:
lm(formula = thermimmigrant ~ gender, data = .)
Residuals:
Min 1Q Median 3Q Max
-45.885 -15.885 4.115 17.679 57.679
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 42.3206 0.9067 46.677 < 2e-16 ***
genderFemale 3.5647 1.2052 2.958 0.00314 **
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 27.03 on 2046 degrees of freedom
(54 observations deleted due to missingness)
Multiple R-squared: 0.004257, Adjusted R-squared: 0.003771
F-statistic: 8.748 on 1 and 2046 DF, p-value: 0.003136
```

## 12.3 Run a simple regression of thermimmigrant with gunown

The problem above becomes clearer yet again when we use the `gunown`

variable:

```
df %>%
pull(gunown) %>% # we use pull to extract the column as a vector, which print_labels needs - equiv. to print_labels(df$gunown)
print_labels()
```

```
Labels:
value label
-9 Refused
-8 DK
1 Yes
5 No
```

Now if we were to use the variable “as is,” we’d still get a result - but it would be completely non-sensical because the `No`

variable has a numeric value of `5`

where it should be `0`

and `-8`

and `-9`

should not have a numeric interpretation at all:

```
df %>%
lm(thermimmigrant ~ gunown, data = .) %>%
summary
```

```
Call:
lm(formula = thermimmigrant ~ gunown, data = .)
Residuals:
Min 1Q Median 3Q Max
-47.421 -17.421 2.579 12.657 62.657
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 34.8234 1.3805 25.226 < 2e-16 ***
gunown 2.5195 0.3264 7.718 1.84e-14 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 26.71 on 2005 degrees of freedom
(95 observations deleted due to missingness)
Multiple R-squared: 0.02886, Adjusted R-squared: 0.02837
F-statistic: 59.57 on 1 and 2005 DF, p-value: 1.845e-14
```

We therefore first convert the variable to a factor and the quickly use the `distinct()`

command to check which observations are actually present:

`distinct: removed 2,099 rows (>99%), 3 rows remaining`

```
# A tibble: 3 x 1
gunown
<fct>
1 Yes
2 No
3 <NA>
```

Now we only have `Yes`

and `No`

observations, alongside the missing values! This allows us to estimate a useful model now:

```
Call:
lm(formula = thermimmigrant ~ gunown, data = .)
Residuals:
Min 1Q Median 3Q Max
-47.421 -17.421 2.579 12.657 62.657
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 37.343 1.095 34.098 < 2e-16 ***
gunownNo 10.078 1.306 7.718 1.84e-14 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 26.71 on 2005 degrees of freedom
(95 observations deleted due to missingness)
Multiple R-squared: 0.02886, Adjusted R-squared: 0.02837
F-statistic: 59.57 on 1 and 2005 DF, p-value: 1.845e-14
```

If you wanted to switch the dummy base around, you could use the `relevel()`

command - you can see that both results are the same!

```
df %>%
mutate(gunown = as_factor(gunown),
gunown = relevel(gunown, "No")) %>%
lm(thermimmigrant ~ gunown, data = .) %>%
summary
```

```
Call:
lm(formula = thermimmigrant ~ gunown, data = .)
Residuals:
Min 1Q Median 3Q Max
-47.421 -17.421 2.579 12.657 62.657
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 47.4207 0.7109 66.703 < 2e-16 ***
gunownYes -10.0778 1.3057 -7.718 1.84e-14 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 26.71 on 2005 degrees of freedom
(95 observations deleted due to missingness)
Multiple R-squared: 0.02886, Adjusted R-squared: 0.02837
F-statistic: 59.57 on 1 and 2005 DF, p-value: 1.845e-14
```

## 12.4 Add socio-economic controls

As we now want to add socio-economic controls, we want to make sure we deal with all dummy variables as well! We therefore convert them to factors again!

The `income`

and `education`

variables are not ideal - education is maxed out at `17`

years (all the hard years of a PhD hidden) and the `income`

variable is not divided into equal intervals. At this stage, we don’t deal with them further, but we just leave them as their numerical values as both are ordered in an ordinal sense (the numeric values are ordered) - but ideally we would use the full data, if we had it available.

```
df %>%
mutate(gunown = as_factor(gunown),
gender = as_factor(gender),
orientself = as_factor(orientself)) %>%
lm(thermimmigrant ~ gender + gunown + orientself + children + education + income, data = .) %>%
summary
```

```
Call:
lm(formula = thermimmigrant ~ gender + gunown + orientself +
children + education + income, data = .)
Residuals:
Min 1Q Median 3Q Max
-55.446 -17.556 2.042 16.036 63.912
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 35.0699 4.3373 8.086 1.47e-15 ***
genderFemale 3.5826 1.4808 2.419 0.01569 *
gunownNo 9.3904 1.6082 5.839 6.71e-09 ***
orientselfHomosexual 2.8213 5.2348 0.539 0.59002
orientselfBisexual 4.7749 5.6770 0.841 0.40046
children 1.5955 0.5857 2.724 0.00654 **
education 0.3235 0.3321 0.974 0.33006
income -0.2955 0.1482 -1.993 0.04646 *
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 25.73 on 1230 degrees of freedom
(864 observations deleted due to missingness)
Multiple R-squared: 0.0514, Adjusted R-squared: 0.046
F-statistic: 9.52 on 7 and 1230 DF, p-value: 1.483e-11
```

We see that sexual orientation and education do not seem to explain variation in the feeling towards immigrants. However, gender, owning a gun, having children and higher incomes seem to explain some variation.

## 12.5 We now add Religion

Because we already looked at the religion variable a few weeks ago, we know that they are not ordinal - so we need to convert them to a factor (just like all others we did above):

```
df %>%
mutate(denom = as_factor(denom)) %>%
# same as above
mutate(gunown = as_factor(gunown),
gender = as_factor(gender),
orientself = as_factor(orientself)) %>%
lm(thermimmigrant ~ gender + gunown + orientself + children + education + income + denom, data = .) %>%
summary
```

```
Call:
lm(formula = thermimmigrant ~ gender + gunown + orientself +
children + education + income + denom, data = .)
Residuals:
Min 1Q Median 3Q Max
-64.448 -16.167 1.902 15.383 64.825
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 30.3971 4.3364 7.010 3.94e-12 ***
genderFemale 3.5636 1.4554 2.448 0.01449 *
gunownNo 7.9780 1.5841 5.036 5.46e-07 ***
orientselfHomosexual 2.0767 5.1345 0.404 0.68595
orientselfBisexual 5.4529 5.5565 0.981 0.32661
children 1.4905 0.5740 2.597 0.00953 **
education 0.5165 0.3279 1.575 0.11549
income -0.3722 0.1451 -2.564 0.01046 *
denomCatholic 14.2423 1.7694 8.049 1.96e-15 ***
denomJewish 7.4824 6.4638 1.158 0.24726
denomOther 5.6566 3.6556 1.547 0.12203
denomNone 3.5929 2.0145 1.783 0.07475 .
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 25.1 on 1224 degrees of freedom
(866 observations deleted due to missingness)
Multiple R-squared: 0.09946, Adjusted R-squared: 0.09137
F-statistic: 12.29 on 11 and 1224 DF, p-value: < 2.2e-16
```

So we see that compared to our reference cateogry `protestant`

, only catholics seem to have a clearly more positive view towards immigrants.

We can again use `relevel()`

to change that reference category e.g. to `None`

:

```
df %>%
mutate(denom = as_factor(denom),
denom = relevel(denom, "None")) %>%
# same as above
mutate(gunown = as_factor(gunown),
gender = as_factor(gender),
orientself = as_factor(orientself)) %>%
lm(thermimmigrant ~ gender + gunown + orientself + children + education + income + denom, data = .) %>%
summary
```

```
Call:
lm(formula = thermimmigrant ~ gender + gunown + orientself +
children + education + income + denom, data = .)
Residuals:
Min 1Q Median 3Q Max
-64.448 -16.167 1.902 15.383 64.825
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 33.9900 4.5565 7.460 1.64e-13 ***
genderFemale 3.5636 1.4554 2.448 0.01449 *
gunownNo 7.9780 1.5841 5.036 5.46e-07 ***
orientselfHomosexual 2.0767 5.1345 0.404 0.68595
orientselfBisexual 5.4529 5.5565 0.981 0.32661
children 1.4905 0.5740 2.597 0.00953 **
education 0.5165 0.3279 1.575 0.11549
income -0.3722 0.1451 -2.564 0.01046 *
denomProtestant -3.5929 2.0145 -1.783 0.07475 .
denomCatholic 10.6494 2.2978 4.635 3.96e-06 ***
denomJewish 3.8895 6.6115 0.588 0.55644
denomOther 2.0637 3.9344 0.525 0.60000
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 25.1 on 1224 degrees of freedom
(866 observations deleted due to missingness)
Multiple R-squared: 0.09946, Adjusted R-squared: 0.09137
F-statistic: 12.29 on 11 and 1224 DF, p-value: < 2.2e-16
```

## 12.6 What if we add ethnicity?

Do you think being catholic really makes are more favourable to immigration? What if you add ethnicity to the model?

We keep our base as having no denomimation and now add ethnicity and define it’s base as “White” (default is “Black”):

```
df %>%
mutate(denom = as_factor(denom),
denom = relevel(denom, "None"),
ethnic = as_factor(ethnic),
ethnic = relevel(ethnic, "White")) %>%
# same as above
mutate(gunown = as_factor(gunown),
gender = as_factor(gender),
orientself = as_factor(orientself)) %>%
lm(thermimmigrant ~ gender + gunown + orientself + children + education + income + denom + ethnic, data = .) %>%
summary
```

```
Call:
lm(formula = thermimmigrant ~ gender + gunown + orientself +
children + education + income + denom + ethnic, data = .)
Residuals:
Min 1Q Median 3Q Max
-74.012 -15.065 1.249 15.746 69.618
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 19.2940 4.3705 4.415 1.10e-05 ***
genderFemale 3.5302 1.3516 2.612 0.009117 **
gunownNo 3.7857 1.5099 2.507 0.012295 *
orientselfHomosexual 1.8841 4.8679 0.387 0.698793
orientselfBisexual 8.3188 5.1677 1.610 0.107709
children 0.3179 0.5384 0.590 0.555013
education 1.2398 0.3102 3.996 6.83e-05 ***
income -0.1870 0.1372 -1.363 0.173058
denomProtestant -5.7132 1.9295 -2.961 0.003127 **
denomCatholic 1.7981 2.2444 0.801 0.423186
denomJewish 4.6333 6.1296 0.756 0.449856
denomOther -2.6382 3.7085 -0.711 0.476974
ethnicBlack 15.2430 1.9371 7.869 7.88e-15 ***
ethnicAsian 7.9853 5.6650 1.410 0.158918
ethnicNative American 17.7684 8.2807 2.146 0.032090 *
ethnicHispanic or Latino 26.6278 1.9417 13.714 < 2e-16 ***
ethnicOther, specify 18.6619 5.1948 3.592 0.000341 ***
ethnicMultiracial 11.9962 4.3617 2.750 0.006042 **
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 23.24 on 1214 degrees of freedom
(870 observations deleted due to missingness)
Multiple R-squared: 0.2333, Adjusted R-squared: 0.2226
F-statistic: 21.74 on 17 and 1214 DF, p-value: < 2.2e-16
```

This shows a bit more clearly, that catholics do not seem to have a better view towards immigrants - only protestants have a generally more negative view of immigrants than people without a denomination.

With regard to ethnicity, however, we see that Hispanics, Other ethnicities, African Americans and people of multiracial ethnicity have all more positive views of immigrants.

## 12.7 Interaction of Female and gunowner

We start out as we always do, but note that in the regression equation in `lm()`

, we add a term `gender:gunown`

, which creates our interaction variable! We could keep the `gender + gunown`

terms as well, but we don’t have to - so let’s remove them because then we get a bit more output for each possible case (Note: one will be `NA`

because this is going to be perfectly collinear with the intercept).

Because we will need the model below for a hypothesis test, we save the model first as `interaction_model`

!

```
df %>%
mutate(denom = as_factor(denom),
ethnic = as_factor(ethnic)) %>%
# same as above
mutate(gunown = as_factor(gunown),
gunwon = relevel(gunown, "No"),
gender = as_factor(gender),
orientself = as_factor(orientself)) %>%
lm(thermimmigrant ~ orientself + children + education +
income + denom + ethnic + gender:gunown, data = .) %>%
summary
```

```
Call:
lm(formula = thermimmigrant ~ orientself + children + education +
income + denom + ethnic + gender:gunown, data = .)
Residuals:
Min 1Q Median 3Q Max
-74.219 -14.960 1.268 15.648 69.182
Coefficients: (1 not defined because of singularities)
Estimate Std. Error t value Pr(>|t|)
(Intercept) 36.3433 3.9489 9.203 < 2e-16 ***
orientselfHomosexual 2.0051 4.8770 0.411 0.681043
orientselfBisexual 8.3287 5.1695 1.611 0.107412
children 0.3183 0.5385 0.591 0.554655
education 1.2395 0.3103 3.994 6.89e-05 ***
income -0.1892 0.1373 -1.378 0.168457
denomCatholic 7.5161 1.8508 4.061 5.20e-05 ***
denomJewish 10.3125 6.0137 1.715 0.086633 .
denomOther 3.0904 3.4604 0.893 0.371992
denomNone 5.6750 1.9321 2.937 0.003374 **
ethnicAsian -7.2551 5.8369 -1.243 0.214118
ethnicNative American 2.4607 8.4027 0.293 0.769687
ethnicHispanic or Latino 11.4041 2.3540 4.844 1.43e-06 ***
ethnicWhite -15.2127 1.9390 -7.846 9.39e-15 ***
ethnicOther, specify 3.4573 5.3298 0.649 0.516671
ethnicMultiracial -3.1859 4.5985 -0.693 0.488562
genderMale:gunownYes -7.0825 2.0568 -3.443 0.000594 ***
genderFemale:gunownYes -4.4006 2.0390 -2.158 0.031111 *
genderMale:gunownNo -3.9520 1.6466 -2.400 0.016539 *
genderFemale:gunownNo NA NA NA NA
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 23.25 on 1213 degrees of freedom
(870 observations deleted due to missingness)
Multiple R-squared: 0.2335, Adjusted R-squared: 0.2221
F-statistic: 20.53 on 18 and 1213 DF, p-value: < 2.2e-16
```

The interpretation goes as follows: Our base term is always `genderFemale:gunownNo`

, so we compare each other coefficient to Women who do not own guns.

- Women who own guns have a
`-4.4`

unit less favourable view towards immigrants than Women who do not own guns. - Men who own guns have a
`-7`

unit less favourable view towards immigrants than Women who do not own guns. - Men who do not own guns have a
`3.95`

unit less favourable view towards immigrants than Women who do not own guns.

All of these terms are significant at the 5% significance level, but only one is significant at the 1% significance level.

### 12.7.1 Testing this

We can now formally test some hypotheses using an F-test. We can do this using the `linearHypothesis()`

command from the `car`

package, which we loaded above.

The first option for this command is the model we want to test - here we can simply pipe `%>%`

our model to the `linearHypothesis()`

command. Now we need to insert some arguments to the `hypothesis.matrix`

.

To illustrate two different ways we could use those hypothesis tests, we will save two models quickly: the `interaction_model`

and the `interaction_model_full`

. Both models are identical, apart from how they present their results.

The `interaction_model`

uses the variable logic of \(x_1 + x_2 + x_1*x_2\) while interaction_model_full uses the special `:`

operator and adds all different combinations of interactions to the model. Compare the following two outputs:

`interaction_model`

`summary(interaction_model)`

```
Call:
lm(formula = thermimmigrant ~ gender + gunown + orientself +
children + education + income + denom + ethnic + gender:gunown,
data = .)
Residuals:
Min 1Q Median 3Q Max
-74.219 -14.960 1.268 15.648 69.182
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 32.3913 3.9603 8.179 7.15e-16 ***
genderFemale 3.9520 1.6466 2.400 0.01654 *
gunownYes -3.1305 2.1004 -1.490 0.13637
orientselfHomosexual 2.0051 4.8770 0.411 0.68104
orientselfBisexual 8.3287 5.1695 1.611 0.10741
children 0.3183 0.5385 0.591 0.55466
education 1.2395 0.3103 3.994 6.89e-05 ***
income -0.1892 0.1373 -1.378 0.16846
denomCatholic 7.5161 1.8508 4.061 5.20e-05 ***
denomJewish 10.3125 6.0137 1.715 0.08663 .
denomOther 3.0904 3.4604 0.893 0.37199
denomNone 5.6750 1.9321 2.937 0.00337 **
ethnicAsian -7.2551 5.8369 -1.243 0.21412
ethnicNative American 2.4607 8.4027 0.293 0.76969
ethnicHispanic or Latino 11.4041 2.3540 4.844 1.43e-06 ***
ethnicWhite -15.2127 1.9390 -7.846 9.39e-15 ***
ethnicOther, specify 3.4573 5.3298 0.649 0.51667
ethnicMultiracial -3.1859 4.5985 -0.693 0.48856
genderFemale:gunownYes -1.2701 2.8295 -0.449 0.65360
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 23.25 on 1213 degrees of freedom
(870 observations deleted due to missingness)
Multiple R-squared: 0.2335, Adjusted R-squared: 0.2221
F-statistic: 20.53 on 18 and 1213 DF, p-value: < 2.2e-16
```

`interaction_model_full`

`summary(interaction_model_full)`

```
Call:
lm(formula = thermimmigrant ~ orientself + children + education +
income + denom + ethnic + gender:gunown, data = .)
Residuals:
Min 1Q Median 3Q Max
-74.219 -14.960 1.268 15.648 69.182
Coefficients: (1 not defined because of singularities)
Estimate Std. Error t value Pr(>|t|)
(Intercept) 31.9427 4.2872 7.451 1.76e-13 ***
orientselfHomosexual 2.0051 4.8770 0.411 0.68104
orientselfBisexual 8.3287 5.1695 1.611 0.10741
children 0.3183 0.5385 0.591 0.55466
education 1.2395 0.3103 3.994 6.89e-05 ***
income -0.1892 0.1373 -1.378 0.16846
denomCatholic 7.5161 1.8508 4.061 5.20e-05 ***
denomJewish 10.3125 6.0137 1.715 0.08663 .
denomOther 3.0904 3.4604 0.893 0.37199
denomNone 5.6750 1.9321 2.937 0.00337 **
ethnicAsian -7.2551 5.8369 -1.243 0.21412
ethnicNative American 2.4607 8.4027 0.293 0.76969
ethnicHispanic or Latino 11.4041 2.3540 4.844 1.43e-06 ***
ethnicWhite -15.2127 1.9390 -7.846 9.39e-15 ***
ethnicOther, specify 3.4573 5.3298 0.649 0.51667
ethnicMultiracial -3.1859 4.5985 -0.693 0.48856
genderMale:gunownNo 0.4486 2.1079 0.213 0.83150
genderFemale:gunownNo 4.4006 2.0390 2.158 0.03111 *
genderMale:gunownYes -2.6819 2.3236 -1.154 0.24864
genderFemale:gunownYes NA NA NA NA
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 23.25 on 1213 degrees of freedom
(870 observations deleted due to missingness)
Multiple R-squared: 0.2335, Adjusted R-squared: 0.2221
F-statistic: 20.53 on 18 and 1213 DF, p-value: < 2.2e-16
```

Personally, I find the second output of the `interaction_model_full`

more easy to interpret.

Now we can test all different effects:

### 12.7.2 What is the total effect for a female gunowner?

Now we need to insert some arguments to the `hypothesis.matrix`

.
We want to test for a difference between e.g. a woman who owns a gun with e.g. a man who does not own a gun (our base).

To do this using our `interaction_model`

, we need to use a vector that describes the situation of being a female and owning a gun. We then set this equation to 0 using `= 0`

because the case where all of these effects are 0 would coincide with a man not owning a gun (our base). This would look like this: `"1.genderFemale + 1.gunownYes + 1.genderFemale:gunownYes = 0"`

where now we set the `genderFemale`

to 1 to indicate a woman, the `gunown`

to 1 to indicate her owning a gun. The interaction then of course will also be 1:

```
interaction_model %>%
linearHypothesis(hypothesis.matrix = "1.genderFemale + 1.gunownYes + 1.genderFemale:gunownYes = 0")
```

```
Linear hypothesis test
Hypothesis:
genderFemale + gunownYes + genderFemale:gunownYes = 0
Model 1: restricted model
Model 2: thermimmigrant ~ gender + gunown + orientself + children + education +
income + denom + ethnic + gender:gunown
Res.Df RSS Df Sum of Sq F Pr(>F)
1 1214 655705
2 1213 655680 1 24.482 0.0453 0.8315
```

Alternatively, we can use the `interaction_model_full`

model and test explicitly whether the `genderMale:gunownNo`

is different to `genderFemale:gunownYes`

, which is equivalent to the intercept (i.e. identical to 1). This gives us a much easier formula: `"genderMale:gunownNo = 0"`

. Note: we need to use the option `singluar.ok=TRUE`

, because one coefficient in `interaction_model_full`

is not defined (as it is collinear with the intercept):

```
interaction_model_full %>%
linearHypothesis(singular.ok = TRUE,
hypothesis.matrix = "genderMale:gunownNo = 0")
```

```
Linear hypothesis test
Hypothesis:
genderMale:gunownNo = 0
Model 1: restricted model
Model 2: thermimmigrant ~ orientself + children + education + income +
denom + ethnic + gender:gunown
Res.Df RSS Df Sum of Sq F Pr(>F)
1 1214 655705
2 1213 655680 1 24.482 0.0453 0.8315
```

As you can see both outcomes are equivalent: Our linear hypothesis test now tests a Model 1, which is our restricted model, to the model we calculated just before using an F-test (using the F-distribution). Becase our p-value (here called `Pr(>F)`

) is \(p>0.05\), we cannot reject the \(H_0\) of a difference between those two models. We therefore conclude that Women with a gun do not express different values towards immigration than men without a gun.

### 12.7.3 What is the total effect for male gunowners?

Again we compare this with our base category: Men who do not own guns.

```
interaction_model %>%
linearHypothesis(hypothesis.matrix = "0.genderFemale + 1.gunownYes + 0.genderFemale:gunownYes = 0")
```

```
Linear hypothesis test
Hypothesis:
gunownYes = 0
Model 1: restricted model
Model 2: thermimmigrant ~ gender + gunown + orientself + children + education +
income + denom + ethnic + gender:gunown
Res.Df RSS Df Sum of Sq F Pr(>F)
1 1214 656881
2 1213 655680 1 1200.8 2.2214 0.1364
```

Or we could write this more succintly, using just `"1.gunownYes = 0"`

beacuse everything else is 0 by default (and `Male`

is 0).

Of course using `interaction_model_full`

we can use this:

```
interaction_model_full %>%
linearHypothesis(singular.ok = TRUE,
hypothesis.matrix = "genderMale:gunownYes = genderMale:gunownNo")
```

```
Linear hypothesis test
Hypothesis:
- genderMale:gunownNo + genderMale:gunownYes = 0
Model 1: restricted model
Model 2: thermimmigrant ~ orientself + children + education + income +
denom + ethnic + gender:gunown
Res.Df RSS Df Sum of Sq F Pr(>F)
1 1214 656881
2 1213 655680 1 1200.8 2.2214 0.1364
```

### 12.7.4 What is the total effect for female non-gunowner?

Again we compare this with our base category: Men who do not own guns. Now using this more succintly:

```
interaction_model %>%
linearHypothesis(hypothesis.matrix = "1.genderFemale = 0")
```

```
Linear hypothesis test
Hypothesis:
genderFemale = 0
Model 1: restricted model
Model 2: thermimmigrant ~ gender + gunown + orientself + children + education +
income + denom + ethnic + gender:gunown
Res.Df RSS Df Sum of Sq F Pr(>F)
1 1214 658794
2 1213 655680 1 3113.9 5.7606 0.01654 *
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
```

## 12.8 Interaction between education and gunowners

Despite what we mentioned above, we treat education as continuous here.

We follow the same logic as before:

```
df %>%
mutate(denom = as_factor(denom),
ethnic = as_factor(ethnic)) %>%
# same as above
mutate(gunown = as_factor(gunown),
gunown = relevel(gunown, "No"),
gender = as_factor(gender),
orientself = as_factor(orientself)) %>%
lm(thermimmigrant ~ gender + orientself + children + education + gunown +
income + denom + ethnic + education:gunown, data = .) %>%
summary
```

```
Call:
lm(formula = thermimmigrant ~ gender + orientself + children +
education + gunown + income + denom + ethnic + education:gunown,
data = .)
Residuals:
Min 1Q Median 3Q Max
-73.701 -15.181 1.362 15.878 69.247
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 33.7645 4.4578 7.574 7.13e-14 ***
genderFemale 3.5374 1.3520 2.616 0.00900 **
orientselfHomosexual 1.9894 4.8731 0.408 0.68317
orientselfBisexual 8.3078 5.1692 1.607 0.10828
children 0.3217 0.5386 0.597 0.55043
education 1.1498 0.3509 3.277 0.00108 **
gunownYes -8.4467 8.6223 -0.980 0.32746
income -0.1886 0.1372 -1.374 0.16957
denomCatholic 7.5464 1.8518 4.075 4.90e-05 ***
denomJewish 10.6087 6.0319 1.759 0.07887 .
denomOther 3.1743 3.4648 0.916 0.35976
denomNone 5.7339 1.9305 2.970 0.00303 **
ethnicAsian -7.0251 5.8520 -1.200 0.23020
ethnicNative American 2.5568 8.4013 0.304 0.76092
ethnicHispanic or Latino 11.3214 2.3564 4.805 1.74e-06 ***
ethnicWhite -15.2009 1.9392 -7.839 9.90e-15 ***
ethnicOther, specify 3.4081 5.3290 0.640 0.52259
ethnicMultiracial -3.1475 4.5998 -0.684 0.49394
education:gunownYes 0.3465 0.6311 0.549 0.58306
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 23.25 on 1213 degrees of freedom
(870 observations deleted due to missingness)
Multiple R-squared: 0.2335, Adjusted R-squared: 0.2222
F-statistic: 20.53 on 18 and 1213 DF, p-value: < 2.2e-16
```

We can now interpret this using the `margins`

package (loaded above), which is actually very similar to the `margins`

command in Stata. For this, we need to save the modified data and the model separately:

```
df %>%
mutate(denom = as_factor(denom),
ethnic = as_factor(ethnic)) %>%
# same as above
mutate(gunown = as_factor(gunown),
gunown = relevel(gunown, "No"),
gender = as_factor(gender),
orientself = as_factor(orientself)) -> data_ready
```

```
lm(thermimmigrant ~ gender + orientself + children + education + gunown +
income + denom + ethnic + education:gunown, data = data_ready) -> final_model
```

Now we can use both in the `margins()`

command. Let’s first get the marginal effect for education:

```
factor AME SE z p lower upper
education 1.2640 0.3135 4.0325 0.0001 0.6497 1.8784
```

Now let’s tease out the marginal effect with the interaction (look a the `AME`

column for Average Marginal Effects):

```
margins(model = final_model,
data = data_ready,
variables = "education",
at = list(gunown = c("No","Yes"))) %>%
summary()
```

```
factor gunown AME SE z p lower upper
education 1.0000 1.1498 0.3509 3.2769 0.0010 0.4621 1.8376
education 2.0000 1.4963 0.5610 2.6675 0.0076 0.3969 2.5958
```

Here the `gunown = 1`

and `gunown = 2`

refers to the order of the list - so `1`

here is `No`

while `2`

is Yes. If you remove the `summary`

command, it makes more sense:

```
margins(model = final_model,
data = data_ready,
variables = "education",
at = list(gunown = c("No","Yes")))
```

```
at(gunown) education
No 1.150
Yes 1.496
```