Block-based General-to-Specific (GETS) modelling
blocksFun.Rd
Auxiliary function (i.e. not intended for the average user) that enables block-based GETS-modelling with user-specified estimator, diagnostics and goodness-of-fit criterion.
Usage
blocksFun(y, x, untransformed.residuals=NULL, blocks=NULL,
no.of.blocks=NULL, max.block.size=30, ratio.threshold=0.8,
gets.of.union=TRUE, force.invertibility=FALSE,
user.estimator=list(name="ols"), t.pval=0.001, wald.pval=t.pval,
do.pet=FALSE, ar.LjungB=NULL, arch.LjungB=NULL, normality.JarqueB=NULL,
user.diagnostics=NULL, gof.function=list(name="infocrit"),
gof.method=c("min", "max"), keep=NULL, include.gum=FALSE,
include.1cut=FALSE, include.empty=FALSE, max.paths=NULL,
turbo=FALSE, parallel.options=NULL, tol=1e-07, LAPACK=FALSE,
max.regs=NULL, print.searchinfo=TRUE, alarm=FALSE)
Arguments
- y
a numeric vector (with no missing values, i.e. no non-numeric 'holes')
- x
a
matrix
, or alist
of matrices- untransformed.residuals
NULL
(default) or, whenols
is used withmethod=6
inuser.estimator
, a numeric vector containing the untransformed residuals- blocks
NULL
(default) or alist
of lists with vectors of integers that indicate how blocks should be put together. IfNULL
, then the block composition is undertaken automatically by an internal algorithm that depends onno.of.blocks
,max.block.size
andratio.threshold
- no.of.blocks
NULL
(default) orinteger
. IfNULL
, then the number of blocks is determined automatically by an internal algorithm- max.block.size
integer
that controls the size of blocks- ratio.threshold
numeric
between 0 and 1 that controls the minimum ratio of variables in each block to total observations- gets.of.union
logical
. IfTRUE
(default), then GETS modelling is undertaken of the union of retained variables. Otherwise it is not- force.invertibility
logical
. IfTRUE
, then the x-matrix is ensured to have full row-rank before it is passed on togetsFun
- user.estimator
list
, seegetsFun
for the details- t.pval
numeric
value between 0 and 1. The significance level used for the two-sided coefficient significance t-tests- wald.pval
numeric
value between 0 and 1. The significance level used for the Parsimonious Encompassing Tests (PETs)- do.pet
logical
. IfTRUE
, then a Parsimonious Encompassing Test (PET) against the GUM is undertaken at each variable removal for the joint significance of all the deleted regressors along the current GETS path. IfFALSE
, then a PET is not undertaken at each removal- ar.LjungB
a two element
vector
, orNULL
. In the former case, the first element contains the AR-order, the second element the significance level. IfNULL
, then a test for autocorrelation in the residuals is not conducted- arch.LjungB
a two element
vector
, orNULL
. In the former case, the first element contains the ARCH-order, the second element the significance level. IfNULL
, then a test for ARCH in the residuals is not conducted- normality.JarqueB
NULL
or anumeric
value between 0 and 1. In the latter case, a test for non-normality in the residuals is conducted using a significance level equal tonormality.JarqueB
. IfNULL
, then no test for non-normality is conducted- user.diagnostics
NULL
(default) or alist
with two entries,name
andpval
. SeegetsFun
for the details- gof.function
list
. The first item should be namedname
and contain the name (a character) of the Goodness-of-Fit (GOF) function used. Additional items in the listgof.function
are passed on as arguments to the GOF-function. . SeegetsFun
for the details- gof.method
character
. Determines whether the best Goodness-of-Fit is a minimum (default) or maximum- keep
NULL
(default),vector
of integers or alist
of vectors of integers. In the latter case, the number of vectors should be equal to the number of matrices inx
- include.gum
logical
. IfTRUE
, then the GUM (i.e. the starting model) is included among the terminal models- include.1cut
logical
. IfTRUE
, then the 1-cut model is added to the list of terminal models- include.empty
logical
. IfTRUE
, then the empty model is added to the list of terminal models- max.paths
NULL
(default) orinteger
greater than 0. IfNULL
, then there is no limit to the number of paths. Ifinteger
(e.g. 1), then this integer constitutes the maximum number of paths searched (e.g. a single path)- turbo
logical
. IfTRUE
, then (parts of) paths are not searched twice (or more) unnecessarily in each GETS modelling. Settingturbo
toTRUE
entails a small additional computational costs, but may be outweighed substantially if estimation is slow, or if the number of variables to delete in each path is large- parallel.options
NULL
orinteger
that indicates the number of cores/threads to use for parallel computing (implemented w/makeCluster
andparLapply
)- tol
numeric
value, the tolerance for detecting linear dependencies in the columns of the variance-covariance matrix when computing the Wald-statistic used in the Parsimonious Encompassing Tests (PETs), see theqr.solve
function- LAPACK
currently not used
- max.regs
integer
. The maximum number of regressions along a deletion path. Do not alter unless you know what you are doing!- print.searchinfo
logical
. IfTRUE
(default), then a print is returned whenever simiplification along a new path is started- alarm
logical
. IfTRUE
, then a sound or beep is emitted (in order to alert the user) when the model selection ends
Details
blocksFun
undertakes block-based GETS modelling by a repeated but structured call to getsFun
. For the details of how to user-specify an estimator via user.estimator
, diagnostics via
user.diagnostics
and a goodness-of-fit function via gof.function
, see documentation of getsFun
under "Details".
The algorithm of blocksFun
is similar to that of isat
, but more flexible. The main use of blocksFun
is the creation of user-specified methods that employs block-based GETS modelling, e.g. indicator saturation techniques.
Value
A list
with the results of the block-based GETS-modelling.
References
F. Pretis, J. Reade and G. Sucarrat (2018): 'Automated General-to-Specific (GETS) Regression Modeling and Indicator Saturation for Outliers and Structural Breaks'. Journal of Statistical Software 86, Number 3, pp. 1-44
G. sucarrat (2020): 'User-Specified General-to-Specific and Indicator Saturation Methods'. The R Journal 12 issue 2, pp. 388-401, https://journal.r-project.org/archive/2021/RJ-2021-024/
See also
getsFun
, ols
, diagnostics
, infocrit
and isat
Examples
## more variables than observations:
y <- rnorm(20)
x <- matrix(rnorm(length(y)*40), length(y), 40)
blocksFun(y, x)
#>
#> x block 1 of 3:
#> 14 path(s) to search
#> Searching:
#> 1
#> 2
#> 3
#> 4
#> 5
#> 6
#> 7
#> 8
#> 9
#> 10
#> 11
#> 12
#> 13
#> 14
#>
#> x block 2 of 3:
#> 13 path(s) to search
#> Searching:
#> 1
#> 2
#> 3
#> 4
#> 5
#> 6
#> 7
#> 8
#> 9
#> 10
#> 11
#> 12
#> 13
#>
#> x block 3 of 3:
#> 12 path(s) to search
#> Searching:
#> 1
#> 2
#> 3
#> 4
#> 5
#> 6
#> 7
#> 8
#> 9
#> 10
#> 11
#> 12
#> $call
#> blocksFun(y, x)
#>
#> $time.started
#> [1] "Sat Jul 27 15:29:47 2024"
#>
#> $time.finished
#> [1] "Sat Jul 27 15:29:47 2024"
#>
#> $y
#> [1] 0.42646422 -0.29507148 0.89512566 0.87813349 0.82158108 0.68864025
#> [7] 0.55391765 -0.06191171 -0.30596266 -0.38047100 -0.69470698 -0.20791728
#> [13] -1.26539635 2.16895597 1.20796200 -1.12310858 -0.40288484 -0.46665535
#> [19] 0.77996512 -0.08336907
#>
#> $x
#> $x$x
#> [1] "X1.xreg1" "X1.xreg2" "X1.xreg3" "X1.xreg4" "X1.xreg5" "X1.xreg6"
#> [7] "X1.xreg7" "X1.xreg8" "X1.xreg9" "X1.xreg10" "X1.xreg11" "X1.xreg12"
#> [13] "X1.xreg13" "X1.xreg14" "X1.xreg15" "X1.xreg16" "X1.xreg17" "X1.xreg18"
#> [19] "X1.xreg19" "X1.xreg20" "X1.xreg21" "X1.xreg22" "X1.xreg23" "X1.xreg24"
#> [25] "X1.xreg25" "X1.xreg26" "X1.xreg27" "X1.xreg28" "X1.xreg29" "X1.xreg30"
#> [31] "X1.xreg31" "X1.xreg32" "X1.xreg33" "X1.xreg34" "X1.xreg35" "X1.xreg36"
#> [37] "X1.xreg37" "X1.xreg38" "X1.xreg39" "X1.xreg40"
#>
#>
#> $blocks
#> $blocks$x
#> $blocks$x[[1]]
#> [1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14
#>
#> $blocks$x[[2]]
#> [1] 15 16 17 18 19 20 21 22 23 24 25 26 27 28
#>
#> $blocks$x[[3]]
#> [1] 29 30 31 32 33 34 35 36 37 38 39 40
#>
#>
#>
#> $specific.spec
#> $specific.spec$x
#> integer(0)
#>
#>
## 'x' as list of matrices:
z <- matrix(rnorm(length(y)*40), length(y), 40)
blocksFun(y, list(x,z))
#>
#> X1 block 1 of 3:
#> 14 path(s) to search
#> Searching:
#> 1
#> 2
#> 3
#> 4
#> 5
#> 6
#> 7
#> 8
#> 9
#> 10
#> 11
#> 12
#> 13
#> 14
#>
#> X1 block 2 of 3:
#> 13 path(s) to search
#> Searching:
#> 1
#> 2
#> 3
#> 4
#> 5
#> 6
#> 7
#> 8
#> 9
#> 10
#> 11
#> 12
#> 13
#>
#> X1 block 3 of 3:
#> 12 path(s) to search
#> Searching:
#> 1
#> 2
#> 3
#> 4
#> 5
#> 6
#> 7
#> 8
#> 9
#> 10
#> 11
#> 12
#>
#> X2 block 1 of 3:
#> 14 path(s) to search
#> Searching:
#> 1
#> 2
#> 3
#> 4
#> 5
#> 6
#> 7
#> 8
#> 9
#> 10
#> 11
#> 12
#> 13
#> 14
#>
#> X2 block 2 of 3:
#> 14 path(s) to search
#> Searching:
#> 1
#> 2
#> 3
#> 4
#> 5
#> 6
#> 7
#> 8
#> 9
#> 10
#> 11
#> 12
#> 13
#> 14
#>
#> X2 block 3 of 3:
#> 12 path(s) to search
#> Searching:
#> 1
#> 2
#> 3
#> 4
#> 5
#> 6
#> 7
#> 8
#> 9
#> 10
#> 11
#> 12
#> $call
#> blocksFun(y, list(x, z))
#>
#> $time.started
#> [1] "Sat Jul 27 15:29:47 2024"
#>
#> $time.finished
#> [1] "Sat Jul 27 15:29:47 2024"
#>
#> $y
#> [1] 0.42646422 -0.29507148 0.89512566 0.87813349 0.82158108 0.68864025
#> [7] 0.55391765 -0.06191171 -0.30596266 -0.38047100 -0.69470698 -0.20791728
#> [13] -1.26539635 2.16895597 1.20796200 -1.12310858 -0.40288484 -0.46665535
#> [19] 0.77996512 -0.08336907
#>
#> $x
#> $x$X1
#> [1] "X1.xreg1" "X1.xreg2" "X1.xreg3" "X1.xreg4" "X1.xreg5" "X1.xreg6"
#> [7] "X1.xreg7" "X1.xreg8" "X1.xreg9" "X1.xreg10" "X1.xreg11" "X1.xreg12"
#> [13] "X1.xreg13" "X1.xreg14" "X1.xreg15" "X1.xreg16" "X1.xreg17" "X1.xreg18"
#> [19] "X1.xreg19" "X1.xreg20" "X1.xreg21" "X1.xreg22" "X1.xreg23" "X1.xreg24"
#> [25] "X1.xreg25" "X1.xreg26" "X1.xreg27" "X1.xreg28" "X1.xreg29" "X1.xreg30"
#> [31] "X1.xreg31" "X1.xreg32" "X1.xreg33" "X1.xreg34" "X1.xreg35" "X1.xreg36"
#> [37] "X1.xreg37" "X1.xreg38" "X1.xreg39" "X1.xreg40"
#>
#> $x$X2
#> [1] "X2.xreg1" "X2.xreg2" "X2.xreg3" "X2.xreg4" "X2.xreg5" "X2.xreg6"
#> [7] "X2.xreg7" "X2.xreg8" "X2.xreg9" "X2.xreg10" "X2.xreg11" "X2.xreg12"
#> [13] "X2.xreg13" "X2.xreg14" "X2.xreg15" "X2.xreg16" "X2.xreg17" "X2.xreg18"
#> [19] "X2.xreg19" "X2.xreg20" "X2.xreg21" "X2.xreg22" "X2.xreg23" "X2.xreg24"
#> [25] "X2.xreg25" "X2.xreg26" "X2.xreg27" "X2.xreg28" "X2.xreg29" "X2.xreg30"
#> [31] "X2.xreg31" "X2.xreg32" "X2.xreg33" "X2.xreg34" "X2.xreg35" "X2.xreg36"
#> [37] "X2.xreg37" "X2.xreg38" "X2.xreg39" "X2.xreg40"
#>
#>
#> $blocks
#> $blocks$X1
#> $blocks$X1[[1]]
#> [1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14
#>
#> $blocks$X1[[2]]
#> [1] 15 16 17 18 19 20 21 22 23 24 25 26 27 28
#>
#> $blocks$X1[[3]]
#> [1] 29 30 31 32 33 34 35 36 37 38 39 40
#>
#>
#> $blocks$X2
#> $blocks$X2[[1]]
#> [1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14
#>
#> $blocks$X2[[2]]
#> [1] 15 16 17 18 19 20 21 22 23 24 25 26 27 28
#>
#> $blocks$X2[[3]]
#> [1] 29 30 31 32 33 34 35 36 37 38 39 40
#>
#>
#>
#> $specific.spec
#> $specific.spec$X1
#> integer(0)
#>
#> $specific.spec$X2
#> integer(0)
#>
#>
## ensure regressor no. 3 in matrix no. 2 is not removed:
blocksFun(y, list(x,z), keep=list(integer(0), 3))
#>
#> X1 block 1 of 3:
#> 14 path(s) to search
#> Searching:
#> 1
#> 2
#> 3
#> 4
#> 5
#> 6
#> 7
#> 8
#> 9
#> 10
#> 11
#> 12
#> 13
#> 14
#>
#> X1 block 2 of 3:
#> 13 path(s) to search
#> Searching:
#> 1
#> 2
#> 3
#> 4
#> 5
#> 6
#> 7
#> 8
#> 9
#> 10
#> 11
#> 12
#> 13
#>
#> X1 block 3 of 3:
#> 12 path(s) to search
#> Searching:
#> 1
#> 2
#> 3
#> 4
#> 5
#> 6
#> 7
#> 8
#> 9
#> 10
#> 11
#> 12
#>
#> X2 block 1 of 3:
#> 13 path(s) to search
#> Searching:
#> 1
#> 2
#> 3
#> 4
#> 5
#> 6
#> 7
#> 8
#> 9
#> 10
#> 11
#> 12
#> 13
#>
#> X2 block 2 of 3:
#> 14 path(s) to search
#> Searching:
#> 1
#> 2
#> 3
#> 4
#> 5
#> 6
#> 7
#> 8
#> 9
#> 10
#> 11
#> 12
#> 13
#> 14
#>
#> X2 block 3 of 3:
#> 12 path(s) to search
#> Searching:
#> 1
#> 2
#> 3
#> 4
#> 5
#> 6
#> 7
#> 8
#> 9
#> 10
#> 11
#> 12
#>
#> GETS of union of retained X2 variables...
#>
#> $call
#> blocksFun(y, list(x, z), keep = list(integer(0), 3))
#>
#> $time.started
#> [1] "Sat Jul 27 15:29:47 2024"
#>
#> $time.finished
#> [1] "Sat Jul 27 15:29:47 2024"
#>
#> $y
#> [1] 0.42646422 -0.29507148 0.89512566 0.87813349 0.82158108 0.68864025
#> [7] 0.55391765 -0.06191171 -0.30596266 -0.38047100 -0.69470698 -0.20791728
#> [13] -1.26539635 2.16895597 1.20796200 -1.12310858 -0.40288484 -0.46665535
#> [19] 0.77996512 -0.08336907
#>
#> $x
#> $x$X1
#> [1] "X1.xreg1" "X1.xreg2" "X1.xreg3" "X1.xreg4" "X1.xreg5" "X1.xreg6"
#> [7] "X1.xreg7" "X1.xreg8" "X1.xreg9" "X1.xreg10" "X1.xreg11" "X1.xreg12"
#> [13] "X1.xreg13" "X1.xreg14" "X1.xreg15" "X1.xreg16" "X1.xreg17" "X1.xreg18"
#> [19] "X1.xreg19" "X1.xreg20" "X1.xreg21" "X1.xreg22" "X1.xreg23" "X1.xreg24"
#> [25] "X1.xreg25" "X1.xreg26" "X1.xreg27" "X1.xreg28" "X1.xreg29" "X1.xreg30"
#> [31] "X1.xreg31" "X1.xreg32" "X1.xreg33" "X1.xreg34" "X1.xreg35" "X1.xreg36"
#> [37] "X1.xreg37" "X1.xreg38" "X1.xreg39" "X1.xreg40"
#>
#> $x$X2
#> [1] "X2.xreg1" "X2.xreg2" "X2.xreg3" "X2.xreg4" "X2.xreg5" "X2.xreg6"
#> [7] "X2.xreg7" "X2.xreg8" "X2.xreg9" "X2.xreg10" "X2.xreg11" "X2.xreg12"
#> [13] "X2.xreg13" "X2.xreg14" "X2.xreg15" "X2.xreg16" "X2.xreg17" "X2.xreg18"
#> [19] "X2.xreg19" "X2.xreg20" "X2.xreg21" "X2.xreg22" "X2.xreg23" "X2.xreg24"
#> [25] "X2.xreg25" "X2.xreg26" "X2.xreg27" "X2.xreg28" "X2.xreg29" "X2.xreg30"
#> [31] "X2.xreg31" "X2.xreg32" "X2.xreg33" "X2.xreg34" "X2.xreg35" "X2.xreg36"
#> [37] "X2.xreg37" "X2.xreg38" "X2.xreg39" "X2.xreg40"
#>
#>
#> $blocks
#> $blocks$X1
#> $blocks$X1[[1]]
#> [1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14
#>
#> $blocks$X1[[2]]
#> [1] 15 16 17 18 19 20 21 22 23 24 25 26 27 28
#>
#> $blocks$X1[[3]]
#> [1] 29 30 31 32 33 34 35 36 37 38 39 40
#>
#>
#> $blocks$X2
#> $blocks$X2[[1]]
#> [1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14
#>
#> $blocks$X2[[2]]
#> [1] 3 15 16 17 18 19 20 21 22 23 24 25 26 27 28
#>
#> $blocks$X2[[3]]
#> [1] 3 29 30 31 32 33 34 35 36 37 38 39 40
#>
#>
#>
#> $keep
#> $keep$X1
#> integer(0)
#>
#> $keep$X2
#> X2.xreg3
#> 3
#>
#>
#> $specific.spec
#> $specific.spec$X1
#> integer(0)
#>
#> $specific.spec$X2
#> [1] "X2.xreg3"
#>
#>