Sequential estimation of a regression D-vine for the purpose of quantile prediction as described in Kraus and Czado (2017).
Arguments
- formula
an object of class "formula"; same as lm().
- data
data frame (or object coercible by as.data.frame()) containing the variables in the model.
- family_set
see family_set argument of rvinecopulib::bicop().
- selcrit
selection criterion based on conditional log-likelihood. "loglik" (default) imposes no correction; other choices are "aic" and "bic".
- order
the order of covariates in the D-vine, provided as a vector of variable names (after calling vinereg:::expand_factors(model.frame(formula, data))); selected automatically if order = NA (default).
- par_1d
list of options passed to kde1d::kde1d(), must be one value for each margin, e.g. list(xmin = c(0, 0, NaN)) if the response and first covariate have non-negative support.
- weights
optional vector of weights for each observation.
- cores
integer; the number of cores to use for computations.
- ...
further arguments passed to rvinecopulib::bicop().
- uscale
if TRUE, vinereg assumes that marginal distributions have been taken care of in a preliminary step.
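As a hedged illustration of uscale = TRUE, the sketch below maps every column to the unit interval with a rank transform before fitting, so that no margins have to be estimated. The rank-based transform and the names to_pseudo_obs and u_dat are assumptions made for this example and are not part of vinereg; the model call only runs if the package is installed.

```r
# Assumed helper (not part of vinereg): rank-based pseudo-observations in (0, 1).
to_pseudo_obs <- function(v) rank(v, ties.method = "average") / (length(v) + 1)

set.seed(1)
x <- matrix(rnorm(200), 100, 2)
y <- as.numeric(x %*% c(1, -2)) + rnorm(100, sd = 0.5)
dat <- data.frame(y = y, x = x)

# Every column now lies strictly between 0 and 1.
u_dat <- as.data.frame(lapply(dat, to_pseudo_obs))

# Fit on the uniform scale; guarded so the sketch also runs without vinereg.
if (requireNamespace("vinereg", quietly = TRUE)) {
  fit_u <- vinereg::vinereg(y ~ ., data = u_dat, uscale = TRUE)
  print(fit_u)
}
```

Only numeric columns are used here, since the transform above is not meaningful for factors.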
Value
An object of class vinereg. It is a list containing the elements
- formula
the formula used for the fit.
- selcrit
criterion used for variable selection.
- model_frame
the data used to fit the regression model.
- margins
list of marginal models fitted by kde1d::kde1d().
- vine
an rvinecopulib::vinecop_dist() object containing the fitted D-vine.
- stats
fit statistics such as conditional log-likelihood/AIC/BIC and p-values for each variable's contribution.
- order
order of the covariates chosen by the variable selection algorithm.
- selected_vars
indices of selected variables.
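The information criteria reported in stats appear to follow the usual penalised form, caic = -2 * cll + 2 * edf and cbic = -2 * cll + log(nobs) * edf. This relation is inferred from the printed output in the Examples section rather than taken from the package source, and the helper name cond_ic is hypothetical.

```r
# Hypothetical helper: conditional AIC/BIC from the conditional log-likelihood,
# the effective degrees of freedom, and the number of observations.
cond_ic <- function(cll, edf, nobs) {
  c(caic = -2 * cll + 2 * edf,
    cbic = -2 * cll + log(nobs) * edf)
}

# Values printed for the first fitted model in the Examples section.
ic <- cond_ic(cll = 62.29, edf = 6, nobs = 100)
round(ic, 2)
```

With these inputs the rounded values agree with the caic and cbic printed for that fit.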
Use predict.vinereg() to predict conditional quantiles. summary.vinereg() shows the contribution of each selected variable with the associated p-value derived from a likelihood ratio test.
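The p-values in the summary table are consistent with a chi-squared likelihood ratio test: twice a variable's conditional log-likelihood gain is compared with a chi-squared distribution with edf degrees of freedom. This reading is an assumption checked against the printed summary in the Examples section, not a statement about the package internals, and lr_p_value is a hypothetical name.

```r
# Hypothetical helper: likelihood ratio p-value for one variable's contribution.
lr_p_value <- function(cll, edf) {
  pchisq(2 * cll, df = edf, lower.tail = FALSE)
}

# Rows z.2 and z.1 of the summary table in the Examples section.
p_z2 <- lr_p_value(cll = 1.394265, edf = 1)
p_z1 <- lr_p_value(cll = 2.020211, edf = 1)
signif(c(z.2 = p_z2, z.1 = p_z1), 3)
```

Under this reading, z.1 falls just below the 5% level and z.2 just above it, which is in line with the printed p_value column.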
Details
If discrete variables are declared as ordered() or factor(), they are handled as described in Panagiotelis et al. (2012). This differs from previous versions, where the data were jittered before fitting.
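A minimal sketch of declaring a discrete covariate before fitting. A count column stored as plain integers is presumably treated like any other numeric variable, so it is converted with ordered() first. The simulated data and variable names are illustrative only, and the fit itself runs only if vinereg is installed.

```r
set.seed(2)
n <- 100
dat <- data.frame(
  x = rnorm(n),
  z = rbinom(n, 2, 0.5)  # discrete covariate, initially stored as integers
)
dat$y <- dat$x + 0.5 * dat$z + rnorm(n, sd = 0.3)

# Declare the discrete covariate explicitly.
dat$z <- ordered(dat$z)

# Guarded so the sketch also runs without vinereg.
if (requireNamespace("vinereg", quietly = TRUE)) {
  fit_d <- vinereg::vinereg(y ~ x + z, data = dat)
  print(fit_d)
}
```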
References
Kraus, D., & Czado, C. (2017). D-vine copula based quantile regression. Computational Statistics & Data Analysis, 110, 1-18.
Panagiotelis, A., Czado, C., & Joe, H. (2012). Pair copula constructions for multivariate discrete data. Journal of the American Statistical Association, 107(499), 1063-1072.
Examples
# simulate data
x <- matrix(rnorm(200), 100, 2)
y <- x %*% c(1, -2)
dat <- data.frame(y = y, x = x, z = as.factor(rbinom(100, 2, 0.5)))
# fit vine regression model
(fit <- vinereg(y ~ ., dat))
#> D-vine regression model: y | x.2, x.1, z.2, z.1
#> nobs = 100, edf = 6, cll = 62.29, caic = -112.58, cbic = -96.95
# inspect model
summary(fit)
#> var edf cll caic cbic p_value
#> 1 y 0 -217.724919 435.4498379 435.4498379 NA
#> 2 x.2 1 65.533489 -129.0669787 -126.4618085 2.393911e-30
#> 3 x.1 3 211.069184 -416.1383674 -408.3228569 3.544140e-91
#> 4 z.2 1 1.394265 -0.7885304 1.8166398 9.494126e-02
#> 5 z.1 1 2.020211 -2.0404210 0.5647492 4.442274e-02
plot_effects(fit)
#> `geom_smooth()` using method = 'loess' and formula = 'y ~ x'
#> Warning: pseudoinverse used at 0.995
#> Warning: neighborhood radius 1.005
#> Warning: reciprocal condition number 0
#> Warning: There are other near singularities as well. 1.01
# model predictions
mu_hat <- predict(fit, newdata = dat, alpha = NA) # mean
med_hat <- predict(fit, newdata = dat, alpha = 0.5) # median
# observed vs predicted
plot(cbind(y, mu_hat))
## fixed variable order (no selection)
(fit <- vinereg(y ~ ., dat, order = c("x.2", "x.1", "z.1")))
#> D-vine regression model: y | x.2, x.1, z.1
#> nobs = 100, edf = 4, cll = 58.88, caic = -109.76, cbic = -99.33