Prepare the clean dataframe for modelling

Usage

clean_data(raw_data, max.ar = 4, max.dl = 2, trend = TRUE)

Arguments

raw_data

A tibble or data.frame in long format containing the y variable and the x variables. It must have a column 'time' of class Date, a column 'na_item' holding the variable names, and a column 'values' holding the values.

max.ar

Integer. The maximum number of lags to use for the AR terms, as well as for the independent variables. Default is 4.

max.dl

Integer. The maximum number of lags to use for the independent variables (the distributed lags). Default is 2.

trend

Logical. Should a trend be added? Default is TRUE.

Value

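A tibble with the cleaned data in wide format: one row per 'time' value, with a column for each variable in 'na_item' plus derived columns (prefixed 'ln.', 'D.' and 'L1.', 'L2.', ...; see the output in Examples).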
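
As a rough guide to reading the result, the column prefixes in the example output appear to follow a simple pattern: 'ln.' is the natural log, 'D.' the first difference, and 'L1.', 'L2.', ... presumably the corresponding lags. The base R sketch below rebuilds a few such columns for a single made-up series. It is an illustration inferred from the printed output, not the package's implementation, and the object names 'yvar' and 'derived' are assumptions.

```r
# Hypothetical series standing in for one variable of the input data.
set.seed(1)
yvar <- rnorm(6, mean = 100)

# Assumed meaning of the prefixes, inferred from the example output only.
derived <- data.frame(
  yvar      = yvar,
  ln.yvar   = log(yvar),                # natural log of the level
  D.yvar    = c(NA, diff(yvar)),        # first difference; first row has no predecessor
  D.ln.yvar = c(NA, diff(log(yvar))),   # first difference of the log
  L1.yvar   = c(NA, head(yvar, -1))     # level shifted back by one period
)
derived
```

The leading NA in the differenced columns matches the first row of the example output, where no earlier observation is available.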

Examples

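# Long-format sample data: two variables ("yvar", "xvar") observed daily
# over 2000, a leap year, hence 366 dates and 732 rows.
# rnorm() is unseeded, so the values printed below will differ between runs.
sample_data <- dplyr::tibble(
  time = rep(seq.Date(
    from = as.Date("2000-01-01"),
    to = as.Date("2000-12-31"), by = 1
  ), each = 2),
  na_item = rep(c("yvar", "xvar"), 366),
  values = rnorm(366 * 2, mean = 100)
)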
osem:::clean_data(sample_data, max.ar = 4, max.dl = 4)
#> # A tibble: 366 × 46
#>    index time       trend  yvar  xvar ln.yvar ln.xvar D.yvar  D.xvar D.ln.yvar
#>    <int> <date>     <dbl> <dbl> <dbl>   <dbl>   <dbl>  <dbl>   <dbl>     <dbl>
#>  1     1 2000-01-01     1  98.6 100.     4.59    4.61 NA     NA       NA      
#>  2     2 2000-01-02     2  97.6 100.     4.58    4.61 -1.04  -0.261   -0.0106 
#>  3     3 2000-01-03     3 101.  101.     4.61    4.62  3.06   1.15     0.0309 
#>  4     4 2000-01-04     4  98.2  99.8    4.59    4.60 -2.44  -1.40    -0.0246 
#>  5     5 2000-01-05     5  99.8  99.7    4.60    4.60  1.58  -0.0354   0.0159 
#>  6     6 2000-01-06     6  99.4 101.     4.60    4.61 -0.309  0.912   -0.00311
#>  7     7 2000-01-07     7 102.   98.4    4.63    4.59  2.62  -2.26     0.0260 
#>  8     8 2000-01-08     8 101.   98.1    4.61    4.59 -1.55  -0.232   -0.0153 
#>  9     9 2000-01-09     9  99.5  99.9    4.60    4.60 -1.03   1.81    -0.0103 
#> 10    10 2000-01-10    10 101.   99.1    4.61    4.60  1.07  -0.861    0.0106 
#> # ℹ 356 more rows
#> # ℹ 36 more variables: D.ln.xvar <dbl>, L1.yvar <dbl>, L1.xvar <dbl>,
#> #   L1.ln.yvar <dbl>, L1.ln.xvar <dbl>, L1.D.yvar <dbl>, L1.D.xvar <dbl>,
#> #   L1.D.ln.yvar <dbl>, L1.D.ln.xvar <dbl>, L2.yvar <dbl>, L2.xvar <dbl>,
#> #   L2.ln.yvar <dbl>, L2.ln.xvar <dbl>, L2.D.yvar <dbl>, L2.D.xvar <dbl>,
#> #   L2.D.ln.yvar <dbl>, L2.D.ln.xvar <dbl>, L3.yvar <dbl>, L3.xvar <dbl>,
#> #   L3.ln.yvar <dbl>, L3.ln.xvar <dbl>, L3.D.yvar <dbl>, L3.D.xvar <dbl>, …
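
Data often arrive in wide format, with one column per variable, whereas 'raw_data' expects the 'time' / 'na_item' / 'values' layout. The following is a minimal base R sketch of that reshaping step, assuming a hypothetical wide data frame 'wide' with made-up columns 'yvar' and 'xvar'. The final call is left commented out because it requires the osem package.

```r
# Hypothetical wide data: one row per date, one column per variable.
set.seed(1)
wide <- data.frame(
  time = seq.Date(from = as.Date("2000-01-01"), by = "day", length.out = 30),
  yvar = rnorm(30, mean = 100),
  xvar = rnorm(30, mean = 100)
)

# Stack the variable columns into the long layout described under 'raw_data':
# one row per (time, variable) pair.
long <- data.frame(
  time    = rep(wide$time, times = 2),
  na_item = rep(c("yvar", "xvar"), each = nrow(wide)),
  values  = c(wide$yvar, wide$xvar)
)
head(long)

# With osem installed, 'long' could then be passed on:
# osem:::clean_data(long, max.ar = 4, max.dl = 2)
```

Row order differs from the sample data above (all 'yvar' rows come first here), which should not matter as long as each row pairs the right date, name, and value.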