12 HT6 Regression: Interactions

library(tidyverse) # our main collection of functions
library(tidylog) # prints additional output from the tidyverse commands - load after tidyverse 
library(haven) # allows us to load .dta (Stata specific) files
library(here) # needed to navigate to folders and files in a project
library(car) # needed for the linear hypothesis testing after a regression
library(margins) # to ease interpretation

12.1 Load the Data

Load the dataset:

df <- read_dta(here("data","anes08.dta"))

12.2 Run a simple regression of thermimmigrant with gender

To run a simple regression, we again use the lm() command for a linear regression (lm stands for linear regression). As with most datasets in this course, we can take the gender variable just as it comes from Stata - where the variable is labelled and has a numeric value for its labels. Doing so gives us the correct model:

df %>% 
  lm(thermimmigrant ~ gender, data = .) %>% 
  summary

Call:
lm(formula = thermimmigrant ~ gender, data = .)

Residuals:
    Min      1Q  Median      3Q     Max 
-45.885 -15.885   4.115  17.679  57.679 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)  42.3206     0.9067  46.677  < 2e-16 ***
gender        3.5647     1.2052   2.958  0.00314 ** 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 27.03 on 2046 degrees of freedom
  (54 observations deleted due to missingness)
Multiple R-squared:  0.004257,  Adjusted R-squared:  0.003771 
F-statistic: 8.748 on 1 and 2046 DF,  p-value: 0.003136

But now we don’t really know what 0 and 1 correspond to! This is why I’d recommend always converting a categorical variable to a factor using the as_factor() command from the haven package, which we have used to import Stata datasets each week. Now if we run the same command, then we get the same coefficient but are certain that 0 = Male and 1 = Female. Indeed, naming the variable female to start with would resolve that problem altogether.

df %>% 
  mutate(gender = as_factor(gender)) %>% 
  lm(thermimmigrant ~ gender, data = .) %>% 
  summary

Call:
lm(formula = thermimmigrant ~ gender, data = .)

Residuals:
    Min      1Q  Median      3Q     Max 
-45.885 -15.885   4.115  17.679  57.679 

Coefficients:
             Estimate Std. Error t value Pr(>|t|)    
(Intercept)   42.3206     0.9067  46.677  < 2e-16 ***
genderFemale   3.5647     1.2052   2.958  0.00314 ** 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 27.03 on 2046 degrees of freedom
  (54 observations deleted due to missingness)
Multiple R-squared:  0.004257,  Adjusted R-squared:  0.003771 
F-statistic: 8.748 on 1 and 2046 DF,  p-value: 0.003136

12.3 Run a simple regression of thermimmigrant with gunown

The problem above becomes clearer yet again when we use the gunown variable:

df %>% 
  pull(gunown) %>% # we use pull to extract the column as a vector, which print_labels needs - equiv. to print_labels(df$gunown)
  print_labels()

Labels:
 value   label
    -9 Refused
    -8      DK
     1     Yes
     5      No

Now if we were to use the variable “as is,” we’d still get a result - but it would be completely non-sensical because the No variable has a numeric value of 5 where it should be 0 and -8 and -9 should not have a numeric interpretation at all:

df %>% 
  lm(thermimmigrant ~ gunown, data = .) %>% 
  summary

Call:
lm(formula = thermimmigrant ~ gunown, data = .)

Residuals:
    Min      1Q  Median      3Q     Max 
-47.421 -17.421   2.579  12.657  62.657 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)  34.8234     1.3805  25.226  < 2e-16 ***
gunown        2.5195     0.3264   7.718 1.84e-14 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 26.71 on 2005 degrees of freedom
  (95 observations deleted due to missingness)
Multiple R-squared:  0.02886,   Adjusted R-squared:  0.02837 
F-statistic: 59.57 on 1 and 2005 DF,  p-value: 1.845e-14

We therefore first convert the variable to a factor and the quickly use the distinct() command to check which observations are actually present:

df %>% 
  mutate(gunown = as_factor(gunown)) %>% 
  distinct(gunown)
distinct: removed 2,099 rows (>99%), 3 rows remaining
# A tibble: 3 x 1
  gunown
  <fct> 
1 Yes   
2 No    
3 <NA>  

Now we only have Yes and No observations, alongside the missing values! This allows us to estimate a useful model now:

df %>% 
  mutate(gunown = as_factor(gunown)) %>% 
  lm(thermimmigrant ~ gunown, data = .) %>% 
  summary

Call:
lm(formula = thermimmigrant ~ gunown, data = .)

Residuals:
    Min      1Q  Median      3Q     Max 
-47.421 -17.421   2.579  12.657  62.657 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)   37.343      1.095  34.098  < 2e-16 ***
gunownNo      10.078      1.306   7.718 1.84e-14 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 26.71 on 2005 degrees of freedom
  (95 observations deleted due to missingness)
Multiple R-squared:  0.02886,   Adjusted R-squared:  0.02837 
F-statistic: 59.57 on 1 and 2005 DF,  p-value: 1.845e-14

If you wanted to switch the dummy base around, you could use the relevel() command - you can see that both results are the same!

df %>% 
  mutate(gunown = as_factor(gunown),
         gunown = relevel(gunown, "No")) %>% 
  lm(thermimmigrant ~ gunown, data = .) %>% 
  summary

Call:
lm(formula = thermimmigrant ~ gunown, data = .)

Residuals:
    Min      1Q  Median      3Q     Max 
-47.421 -17.421   2.579  12.657  62.657 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)  47.4207     0.7109  66.703  < 2e-16 ***
gunownYes   -10.0778     1.3057  -7.718 1.84e-14 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 26.71 on 2005 degrees of freedom
  (95 observations deleted due to missingness)
Multiple R-squared:  0.02886,   Adjusted R-squared:  0.02837 
F-statistic: 59.57 on 1 and 2005 DF,  p-value: 1.845e-14

12.4 Add socio-economic controls

As we now want to add socio-economic controls, we want to make sure we deal with all dummy variables as well! We therefore convert them to factors again!

The income and education variables are not ideal - education is maxed out at 17 years (all the hard years of a PhD hidden) and the income variable is not divided into equal intervals. At this stage, we don’t deal with them further, but we just leave them as their numerical values as both are ordered in an ordinal sense (the numeric values are ordered) - but ideally we would use the full data, if we had it available.

df %>% 
  mutate(gunown = as_factor(gunown),
         gender = as_factor(gender),
         orientself = as_factor(orientself)) %>% 
  lm(thermimmigrant ~ gender + gunown + orientself + children + education + income, data = .) %>% 
  summary

Call:
lm(formula = thermimmigrant ~ gender + gunown + orientself + 
    children + education + income, data = .)

Residuals:
    Min      1Q  Median      3Q     Max 
-55.446 -17.556   2.042  16.036  63.912 

Coefficients:
                     Estimate Std. Error t value Pr(>|t|)    
(Intercept)           35.0699     4.3373   8.086 1.47e-15 ***
genderFemale           3.5826     1.4808   2.419  0.01569 *  
gunownNo               9.3904     1.6082   5.839 6.71e-09 ***
orientselfHomosexual   2.8213     5.2348   0.539  0.59002    
orientselfBisexual     4.7749     5.6770   0.841  0.40046    
children               1.5955     0.5857   2.724  0.00654 ** 
education              0.3235     0.3321   0.974  0.33006    
income                -0.2955     0.1482  -1.993  0.04646 *  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 25.73 on 1230 degrees of freedom
  (864 observations deleted due to missingness)
Multiple R-squared:  0.0514,    Adjusted R-squared:  0.046 
F-statistic:  9.52 on 7 and 1230 DF,  p-value: 1.483e-11

We see that sexual orientation and education do not seem to explain variation in the feeling towards immigrants. However, gender, owning a gun, having children and higher incomes seem to explain some variation.

12.5 We now add Religion

Because we already looked at the religion variable a few weeks ago, we know that they are not ordinal - so we need to convert them to a factor (just like all others we did above):

df %>% 
  mutate(denom = as_factor(denom)) %>% 
  
  # same as above
  mutate(gunown = as_factor(gunown),
         gender = as_factor(gender),
         orientself = as_factor(orientself)) %>% 
  lm(thermimmigrant ~ gender + gunown + orientself + children + education + income + denom, data = .) %>% 
  summary

Call:
lm(formula = thermimmigrant ~ gender + gunown + orientself + 
    children + education + income + denom, data = .)

Residuals:
    Min      1Q  Median      3Q     Max 
-64.448 -16.167   1.902  15.383  64.825 

Coefficients:
                     Estimate Std. Error t value Pr(>|t|)    
(Intercept)           30.3971     4.3364   7.010 3.94e-12 ***
genderFemale           3.5636     1.4554   2.448  0.01449 *  
gunownNo               7.9780     1.5841   5.036 5.46e-07 ***
orientselfHomosexual   2.0767     5.1345   0.404  0.68595    
orientselfBisexual     5.4529     5.5565   0.981  0.32661    
children               1.4905     0.5740   2.597  0.00953 ** 
education              0.5165     0.3279   1.575  0.11549    
income                -0.3722     0.1451  -2.564  0.01046 *  
denomCatholic         14.2423     1.7694   8.049 1.96e-15 ***
denomJewish            7.4824     6.4638   1.158  0.24726    
denomOther             5.6566     3.6556   1.547  0.12203    
denomNone              3.5929     2.0145   1.783  0.07475 .  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 25.1 on 1224 degrees of freedom
  (866 observations deleted due to missingness)
Multiple R-squared:  0.09946,   Adjusted R-squared:  0.09137 
F-statistic: 12.29 on 11 and 1224 DF,  p-value: < 2.2e-16

So we see that compared to our reference cateogry protestant, only catholics seem to have a clearly more positive view towards immigrants.

We can again use relevel() to change that reference category e.g. to None:

df %>% 
  mutate(denom = as_factor(denom),
         denom = relevel(denom, "None")) %>% 
  
  # same as above
  mutate(gunown = as_factor(gunown),
         gender = as_factor(gender),
         orientself = as_factor(orientself)) %>% 
  lm(thermimmigrant ~ gender + gunown + orientself + children + education + income + denom, data = .) %>% 
  summary

Call:
lm(formula = thermimmigrant ~ gender + gunown + orientself + 
    children + education + income + denom, data = .)

Residuals:
    Min      1Q  Median      3Q     Max 
-64.448 -16.167   1.902  15.383  64.825 

Coefficients:
                     Estimate Std. Error t value Pr(>|t|)    
(Intercept)           33.9900     4.5565   7.460 1.64e-13 ***
genderFemale           3.5636     1.4554   2.448  0.01449 *  
gunownNo               7.9780     1.5841   5.036 5.46e-07 ***
orientselfHomosexual   2.0767     5.1345   0.404  0.68595    
orientselfBisexual     5.4529     5.5565   0.981  0.32661    
children               1.4905     0.5740   2.597  0.00953 ** 
education              0.5165     0.3279   1.575  0.11549    
income                -0.3722     0.1451  -2.564  0.01046 *  
denomProtestant       -3.5929     2.0145  -1.783  0.07475 .  
denomCatholic         10.6494     2.2978   4.635 3.96e-06 ***
denomJewish            3.8895     6.6115   0.588  0.55644    
denomOther             2.0637     3.9344   0.525  0.60000    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 25.1 on 1224 degrees of freedom
  (866 observations deleted due to missingness)
Multiple R-squared:  0.09946,   Adjusted R-squared:  0.09137 
F-statistic: 12.29 on 11 and 1224 DF,  p-value: < 2.2e-16

12.6 What if we add ethnicity?

Do you think being catholic really makes are more favourable to immigration? What if you add ethnicity to the model?

We keep our base as having no denomimation and now add ethnicity and define it’s base as “White” (default is “Black”):

df %>% 
  mutate(denom = as_factor(denom),
         denom = relevel(denom, "None"),
         ethnic = as_factor(ethnic),
         ethnic = relevel(ethnic, "White")) %>% 
  
  # same as above
  mutate(gunown = as_factor(gunown),
         gender = as_factor(gender),
         orientself = as_factor(orientself)) %>% 
  lm(thermimmigrant ~ gender + gunown + orientself + children + education + income + denom + ethnic, data = .) %>% 
  summary

Call:
lm(formula = thermimmigrant ~ gender + gunown + orientself + 
    children + education + income + denom + ethnic, data = .)

Residuals:
    Min      1Q  Median      3Q     Max 
-74.012 -15.065   1.249  15.746  69.618 

Coefficients:
                         Estimate Std. Error t value Pr(>|t|)    
(Intercept)               19.2940     4.3705   4.415 1.10e-05 ***
genderFemale               3.5302     1.3516   2.612 0.009117 ** 
gunownNo                   3.7857     1.5099   2.507 0.012295 *  
orientselfHomosexual       1.8841     4.8679   0.387 0.698793    
orientselfBisexual         8.3188     5.1677   1.610 0.107709    
children                   0.3179     0.5384   0.590 0.555013    
education                  1.2398     0.3102   3.996 6.83e-05 ***
income                    -0.1870     0.1372  -1.363 0.173058    
denomProtestant           -5.7132     1.9295  -2.961 0.003127 ** 
denomCatholic              1.7981     2.2444   0.801 0.423186    
denomJewish                4.6333     6.1296   0.756 0.449856    
denomOther                -2.6382     3.7085  -0.711 0.476974    
ethnicBlack               15.2430     1.9371   7.869 7.88e-15 ***
ethnicAsian                7.9853     5.6650   1.410 0.158918    
ethnicNative American     17.7684     8.2807   2.146 0.032090 *  
ethnicHispanic or Latino  26.6278     1.9417  13.714  < 2e-16 ***
ethnicOther, specify      18.6619     5.1948   3.592 0.000341 ***
ethnicMultiracial         11.9962     4.3617   2.750 0.006042 ** 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 23.24 on 1214 degrees of freedom
  (870 observations deleted due to missingness)
Multiple R-squared:  0.2333,    Adjusted R-squared:  0.2226 
F-statistic: 21.74 on 17 and 1214 DF,  p-value: < 2.2e-16

This shows a bit more clearly, that catholics do not seem to have a better view towards immigrants - only protestants have a generally more negative view of immigrants than people without a denomination.

With regard to ethnicity, however, we see that Hispanics, Other ethnicities, African Americans and people of multiracial ethnicity have all more positive views of immigrants.

12.7 Interaction of Female and gunowner

We start out as we always do, but note that in the regression equation in lm(), we add a term gender:gunown, which creates our interaction variable! We could keep the gender + gunown terms as well, but we don’t have to - so let’s remove them because then we get a bit more output for each possible case (Note: one will be NA because this is going to be perfectly collinear with the intercept).

Because we will need the model below for a hypothesis test, we save the model first as interaction_model!

df %>% 
  mutate(denom = as_factor(denom),
         ethnic = as_factor(ethnic)) %>% 
  
  # same as above
  mutate(gunown = as_factor(gunown),
         gunwon = relevel(gunown, "No"),
         gender = as_factor(gender),
         orientself = as_factor(orientself)) %>% 
  lm(thermimmigrant ~ orientself + children + education + 
       income + denom + ethnic + gender:gunown, data = .) %>% 
  summary

Call:
lm(formula = thermimmigrant ~ orientself + children + education + 
    income + denom + ethnic + gender:gunown, data = .)

Residuals:
    Min      1Q  Median      3Q     Max 
-74.219 -14.960   1.268  15.648  69.182 

Coefficients: (1 not defined because of singularities)
                         Estimate Std. Error t value Pr(>|t|)    
(Intercept)               36.3433     3.9489   9.203  < 2e-16 ***
orientselfHomosexual       2.0051     4.8770   0.411 0.681043    
orientselfBisexual         8.3287     5.1695   1.611 0.107412    
children                   0.3183     0.5385   0.591 0.554655    
education                  1.2395     0.3103   3.994 6.89e-05 ***
income                    -0.1892     0.1373  -1.378 0.168457    
denomCatholic              7.5161     1.8508   4.061 5.20e-05 ***
denomJewish               10.3125     6.0137   1.715 0.086633 .  
denomOther                 3.0904     3.4604   0.893 0.371992    
denomNone                  5.6750     1.9321   2.937 0.003374 ** 
ethnicAsian               -7.2551     5.8369  -1.243 0.214118    
ethnicNative American      2.4607     8.4027   0.293 0.769687    
ethnicHispanic or Latino  11.4041     2.3540   4.844 1.43e-06 ***
ethnicWhite              -15.2127     1.9390  -7.846 9.39e-15 ***
ethnicOther, specify       3.4573     5.3298   0.649 0.516671    
ethnicMultiracial         -3.1859     4.5985  -0.693 0.488562    
genderMale:gunownYes      -7.0825     2.0568  -3.443 0.000594 ***
genderFemale:gunownYes    -4.4006     2.0390  -2.158 0.031111 *  
genderMale:gunownNo       -3.9520     1.6466  -2.400 0.016539 *  
genderFemale:gunownNo          NA         NA      NA       NA    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 23.25 on 1213 degrees of freedom
  (870 observations deleted due to missingness)
Multiple R-squared:  0.2335,    Adjusted R-squared:  0.2221 
F-statistic: 20.53 on 18 and 1213 DF,  p-value: < 2.2e-16

The interpretation goes as follows: Our base term is always genderFemale:gunownNo, so we compare each other coefficient to Women who do not own guns.

  • Women who own guns have a -4.4 unit less favourable view towards immigrants than Women who do not own guns.
  • Men who own guns have a -7 unit less favourable view towards immigrants than Women who do not own guns.
  • Men who do not own guns have a 3.95 unit less favourable view towards immigrants than Women who do not own guns.

All of these terms are significant at the 5% significance level, but only one is significant at the 1% significance level.

12.7.1 Testing this

We can now formally test some hypotheses using an F-test. We can do this using the linearHypothesis() command from the car package, which we loaded above.

The first option for this command is the model we want to test - here we can simply pipe %>% our model to the linearHypothesis() command. Now we need to insert some arguments to the hypothesis.matrix.

To illustrate two different ways we could use those hypothesis tests, we will save two models quickly: the interaction_model and the interaction_model_full. Both models are identical, apart from how they present their results.

The interaction_model uses the variable logic of \(x_1 + x_2 + x_1*x_2\) while interaction_model_full uses the special : operator and adds all different combinations of interactions to the model. Compare the following two outputs:

interaction_model

summary(interaction_model)

Call:
lm(formula = thermimmigrant ~ gender + gunown + orientself + 
    children + education + income + denom + ethnic + gender:gunown, 
    data = .)

Residuals:
    Min      1Q  Median      3Q     Max 
-74.219 -14.960   1.268  15.648  69.182 

Coefficients:
                         Estimate Std. Error t value Pr(>|t|)    
(Intercept)               32.3913     3.9603   8.179 7.15e-16 ***
genderFemale               3.9520     1.6466   2.400  0.01654 *  
gunownYes                 -3.1305     2.1004  -1.490  0.13637    
orientselfHomosexual       2.0051     4.8770   0.411  0.68104    
orientselfBisexual         8.3287     5.1695   1.611  0.10741    
children                   0.3183     0.5385   0.591  0.55466    
education                  1.2395     0.3103   3.994 6.89e-05 ***
income                    -0.1892     0.1373  -1.378  0.16846    
denomCatholic              7.5161     1.8508   4.061 5.20e-05 ***
denomJewish               10.3125     6.0137   1.715  0.08663 .  
denomOther                 3.0904     3.4604   0.893  0.37199    
denomNone                  5.6750     1.9321   2.937  0.00337 ** 
ethnicAsian               -7.2551     5.8369  -1.243  0.21412    
ethnicNative American      2.4607     8.4027   0.293  0.76969    
ethnicHispanic or Latino  11.4041     2.3540   4.844 1.43e-06 ***
ethnicWhite              -15.2127     1.9390  -7.846 9.39e-15 ***
ethnicOther, specify       3.4573     5.3298   0.649  0.51667    
ethnicMultiracial         -3.1859     4.5985  -0.693  0.48856    
genderFemale:gunownYes    -1.2701     2.8295  -0.449  0.65360    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 23.25 on 1213 degrees of freedom
  (870 observations deleted due to missingness)
Multiple R-squared:  0.2335,    Adjusted R-squared:  0.2221 
F-statistic: 20.53 on 18 and 1213 DF,  p-value: < 2.2e-16

interaction_model_full

summary(interaction_model_full)

Call:
lm(formula = thermimmigrant ~ orientself + children + education + 
    income + denom + ethnic + gender:gunown, data = .)

Residuals:
    Min      1Q  Median      3Q     Max 
-74.219 -14.960   1.268  15.648  69.182 

Coefficients: (1 not defined because of singularities)
                         Estimate Std. Error t value Pr(>|t|)    
(Intercept)               31.9427     4.2872   7.451 1.76e-13 ***
orientselfHomosexual       2.0051     4.8770   0.411  0.68104    
orientselfBisexual         8.3287     5.1695   1.611  0.10741    
children                   0.3183     0.5385   0.591  0.55466    
education                  1.2395     0.3103   3.994 6.89e-05 ***
income                    -0.1892     0.1373  -1.378  0.16846    
denomCatholic              7.5161     1.8508   4.061 5.20e-05 ***
denomJewish               10.3125     6.0137   1.715  0.08663 .  
denomOther                 3.0904     3.4604   0.893  0.37199    
denomNone                  5.6750     1.9321   2.937  0.00337 ** 
ethnicAsian               -7.2551     5.8369  -1.243  0.21412    
ethnicNative American      2.4607     8.4027   0.293  0.76969    
ethnicHispanic or Latino  11.4041     2.3540   4.844 1.43e-06 ***
ethnicWhite              -15.2127     1.9390  -7.846 9.39e-15 ***
ethnicOther, specify       3.4573     5.3298   0.649  0.51667    
ethnicMultiracial         -3.1859     4.5985  -0.693  0.48856    
genderMale:gunownNo        0.4486     2.1079   0.213  0.83150    
genderFemale:gunownNo      4.4006     2.0390   2.158  0.03111 *  
genderMale:gunownYes      -2.6819     2.3236  -1.154  0.24864    
genderFemale:gunownYes         NA         NA      NA       NA    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 23.25 on 1213 degrees of freedom
  (870 observations deleted due to missingness)
Multiple R-squared:  0.2335,    Adjusted R-squared:  0.2221 
F-statistic: 20.53 on 18 and 1213 DF,  p-value: < 2.2e-16

Personally, I find the second output of the interaction_model_full more easy to interpret.

Now we can test all different effects:

12.7.2 What is the total effect for a female gunowner?

Now we need to insert some arguments to the hypothesis.matrix. We want to test for a difference between e.g. a woman who owns a gun with e.g. a man who does not own a gun (our base).

To do this using our interaction_model, we need to use a vector that describes the situation of being a female and owning a gun. We then set this equation to 0 using = 0 because the case where all of these effects are 0 would coincide with a man not owning a gun (our base). This would look like this: "1.genderFemale + 1.gunownYes + 1.genderFemale:gunownYes = 0" where now we set the genderFemale to 1 to indicate a woman, the gunown to 1 to indicate her owning a gun. The interaction then of course will also be 1:

interaction_model %>% 
  linearHypothesis(hypothesis.matrix = "1.genderFemale + 1.gunownYes + 1.genderFemale:gunownYes = 0")
Linear hypothesis test

Hypothesis:
genderFemale  + gunownYes  + genderFemale:gunownYes = 0

Model 1: restricted model
Model 2: thermimmigrant ~ gender + gunown + orientself + children + education + 
    income + denom + ethnic + gender:gunown

  Res.Df    RSS Df Sum of Sq      F Pr(>F)
1   1214 655705                           
2   1213 655680  1    24.482 0.0453 0.8315

Alternatively, we can use the interaction_model_full model and test explicitly whether the genderMale:gunownNo is different to genderFemale:gunownYes, which is equivalent to the intercept (i.e. identical to 1). This gives us a much easier formula: "genderMale:gunownNo = 0". Note: we need to use the option singluar.ok=TRUE, because one coefficient in interaction_model_full is not defined (as it is collinear with the intercept):

interaction_model_full %>% 
  linearHypothesis(singular.ok = TRUE, 
                   hypothesis.matrix = "genderMale:gunownNo = 0")
Linear hypothesis test

Hypothesis:
genderMale:gunownNo = 0

Model 1: restricted model
Model 2: thermimmigrant ~ orientself + children + education + income + 
    denom + ethnic + gender:gunown

  Res.Df    RSS Df Sum of Sq      F Pr(>F)
1   1214 655705                           
2   1213 655680  1    24.482 0.0453 0.8315

As you can see both outcomes are equivalent: Our linear hypothesis test now tests a Model 1, which is our restricted model, to the model we calculated just before using an F-test (using the F-distribution). Becase our p-value (here called Pr(>F)) is \(p>0.05\), we cannot reject the \(H_0\) of a difference between those two models. We therefore conclude that Women with a gun do not express different values towards immigration than men without a gun.

12.7.3 What is the total effect for male gunowners?

Again we compare this with our base category: Men who do not own guns.

interaction_model %>% 
  linearHypothesis(hypothesis.matrix = "0.genderFemale + 1.gunownYes + 0.genderFemale:gunownYes = 0")
Linear hypothesis test

Hypothesis:
gunownYes = 0

Model 1: restricted model
Model 2: thermimmigrant ~ gender + gunown + orientself + children + education + 
    income + denom + ethnic + gender:gunown

  Res.Df    RSS Df Sum of Sq      F Pr(>F)
1   1214 656881                           
2   1213 655680  1    1200.8 2.2214 0.1364

Or we could write this more succintly, using just "1.gunownYes = 0" beacuse everything else is 0 by default (and Male is 0).

Of course using interaction_model_full we can use this:

interaction_model_full %>% 
  linearHypothesis(singular.ok = TRUE, 
                   hypothesis.matrix = "genderMale:gunownYes = genderMale:gunownNo")
Linear hypothesis test

Hypothesis:
- genderMale:gunownNo  + genderMale:gunownYes = 0

Model 1: restricted model
Model 2: thermimmigrant ~ orientself + children + education + income + 
    denom + ethnic + gender:gunown

  Res.Df    RSS Df Sum of Sq      F Pr(>F)
1   1214 656881                           
2   1213 655680  1    1200.8 2.2214 0.1364

12.7.4 What is the total effect for female non-gunowner?

Again we compare this with our base category: Men who do not own guns. Now using this more succintly:

interaction_model %>% 
  linearHypothesis(hypothesis.matrix = "1.genderFemale = 0")
Linear hypothesis test

Hypothesis:
genderFemale = 0

Model 1: restricted model
Model 2: thermimmigrant ~ gender + gunown + orientself + children + education + 
    income + denom + ethnic + gender:gunown

  Res.Df    RSS Df Sum of Sq      F  Pr(>F)  
1   1214 658794                              
2   1213 655680  1    3113.9 5.7606 0.01654 *
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

12.7.5 What is the total effect for a male non-gunowner?

Well, because this is our base category, we don’t need to test whether this case is different to it!

12.8 Interaction between education and gunowners

Despite what we mentioned above, we treat education as continuous here.

We follow the same logic as before:

df %>% 
  
  mutate(denom = as_factor(denom),
         ethnic = as_factor(ethnic)) %>% 
  
  # same as above
  mutate(gunown = as_factor(gunown),
         gunown = relevel(gunown, "No"),
         gender = as_factor(gender),
         orientself = as_factor(orientself)) %>% 
  lm(thermimmigrant ~ gender + orientself + children + education + gunown + 
       income + denom + ethnic + education:gunown, data = .) %>%
  summary

Call:
lm(formula = thermimmigrant ~ gender + orientself + children + 
    education + gunown + income + denom + ethnic + education:gunown, 
    data = .)

Residuals:
    Min      1Q  Median      3Q     Max 
-73.701 -15.181   1.362  15.878  69.247 

Coefficients:
                         Estimate Std. Error t value Pr(>|t|)    
(Intercept)               33.7645     4.4578   7.574 7.13e-14 ***
genderFemale               3.5374     1.3520   2.616  0.00900 ** 
orientselfHomosexual       1.9894     4.8731   0.408  0.68317    
orientselfBisexual         8.3078     5.1692   1.607  0.10828    
children                   0.3217     0.5386   0.597  0.55043    
education                  1.1498     0.3509   3.277  0.00108 ** 
gunownYes                 -8.4467     8.6223  -0.980  0.32746    
income                    -0.1886     0.1372  -1.374  0.16957    
denomCatholic              7.5464     1.8518   4.075 4.90e-05 ***
denomJewish               10.6087     6.0319   1.759  0.07887 .  
denomOther                 3.1743     3.4648   0.916  0.35976    
denomNone                  5.7339     1.9305   2.970  0.00303 ** 
ethnicAsian               -7.0251     5.8520  -1.200  0.23020    
ethnicNative American      2.5568     8.4013   0.304  0.76092    
ethnicHispanic or Latino  11.3214     2.3564   4.805 1.74e-06 ***
ethnicWhite              -15.2009     1.9392  -7.839 9.90e-15 ***
ethnicOther, specify       3.4081     5.3290   0.640  0.52259    
ethnicMultiracial         -3.1475     4.5998  -0.684  0.49394    
education:gunownYes        0.3465     0.6311   0.549  0.58306    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 23.25 on 1213 degrees of freedom
  (870 observations deleted due to missingness)
Multiple R-squared:  0.2335,    Adjusted R-squared:  0.2222 
F-statistic: 20.53 on 18 and 1213 DF,  p-value: < 2.2e-16

We can now interpret this using the margins package (loaded above), which is actually very similar to the margins command in Stata. For this, we need to save the modified data and the model separately:

df %>%
  mutate(denom = as_factor(denom),
         ethnic = as_factor(ethnic)) %>%
  
  # same as above
  mutate(gunown = as_factor(gunown),
         gunown = relevel(gunown, "No"),
         gender = as_factor(gender),
         orientself = as_factor(orientself)) -> data_ready
lm(thermimmigrant ~ gender + orientself + children + education + gunown +
     income + denom + ethnic + education:gunown, data = data_ready) -> final_model

Now we can use both in the margins() command. Let’s first get the marginal effect for education:

margins(model = final_model, 
        data = data_ready,
        variables = c("education")) %>% summary
    factor    AME     SE      z      p  lower  upper
 education 1.2640 0.3135 4.0325 0.0001 0.6497 1.8784

Now let’s tease out the marginal effect with the interaction (look a the AME column for Average Marginal Effects):

margins(model = final_model, 
        data = data_ready, 
        variables = "education", 
        at = list(gunown = c("No","Yes"))) %>% 
  summary()
    factor gunown    AME     SE      z      p  lower  upper
 education 1.0000 1.1498 0.3509 3.2769 0.0010 0.4621 1.8376
 education 2.0000 1.4963 0.5610 2.6675 0.0076 0.3969 2.5958

Here the gunown = 1 and gunown = 2 refers to the order of the list - so 1 here is No while 2 is Yes. If you remove the summary command, it makes more sense:

margins(model = final_model, 
        data = data_ready, 
        variables = "education", 
        at = list(gunown = c("No","Yes")))
 at(gunown) education
         No     1.150
        Yes     1.496