Hi all,
quick update on what happened in vimpute on the feature branch (feature/vimpute-mi-bootstrap).
Multiple Imputation (m > 1)
vimpute now supports proper multiple imputation. Set m = 5 and you get back a vimmi object (our own lightweight S3 class, inspired by mice's mids but without the dependency). It stores the original data once and only the imputed values per variable per imputation – memory efficient. There's complete(result, 1) to extract a single completed dataset, complete(result, "long") for long format, with(result, lm(y ~ x)) for fitting models across imputations, and as.mids.vimmi() if you want to convert to mice for pooling. When m = 1 (default), everything works exactly as before – no breaking changes.
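To make the storage idea concrete, here is a minimal base-R sketch. It is not the vimmi implementation: make_compact_mi() and complete_compact() are hypothetical names, it only handles numeric columns, and the "imputations" are just random draws from the observed values as a placeholder for a real model.

```r
# Hypothetical sketch of compact multiple-imputation storage (not the vimmi code).
# The original data is stored once; per variable we keep only an
# (n_missing x m) matrix of imputed values.
make_compact_mi <- function(data, m = 5) {
  where <- is.na(data)                        # logical matrix of missing cells
  imp <- lapply(names(data), function(v) {
    miss <- where[, v]
    obs <- data[[v]][!miss]
    n_miss <- sum(miss)
    # Placeholder imputation: resample observed values (assumption for the demo).
    draws <- obs[sample.int(length(obs), n_miss * m, replace = TRUE)]
    matrix(draws, nrow = n_miss, ncol = m)
  })
  names(imp) <- names(data)
  list(data = data, where = where, imp = imp, m = m)
}

# Rebuild the i-th completed dataset by filling the missing cells.
complete_compact <- function(x, i = 1) {
  out <- x$data
  for (v in names(out)) {
    miss <- x$where[, v]
    if (any(miss)) out[[v]][miss] <- x$imp[[v]][, i]
  }
  out
}

set.seed(1)
df <- data.frame(x = c(1, NA, 3, 4), y = c(NA, 2, 2.5, NA))
mi <- make_compact_mi(df, m = 3)
d1 <- complete_compact(mi, 1)
```

With 2 missing cells in y and m = 3, only a 2 x 3 matrix is stored for y instead of three full copies of the data.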
Bootstrap & Uncertainty
Two new layers for proper between-imputation variability:
- boot = TRUE with the robustboot parameter: 5 bootstrap strategies (standard, stratified, quantile, residual, psi/Tukey bisquare). Refits the model on bootstrapped training data to capture model uncertainty.
- uncert parameter: normalerror (add N(0, sigma)), resid (add sampled residual), pmm (predictive mean matching), midastouch (covariate-distance-weighted PMM à la Siddique & Belin 2008).
These two are orthogonal – you can use either or both. Existing pmm = TRUE still works and takes precedence.
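A rough sketch of how the two layers combine for one imputation draw, under simplifying assumptions: impute_once() is a hypothetical helper, lm() stands in for whatever learner is configured, only the standard bootstrap is shown, and the pmm branch uses a single nearest donor rather than sampling among several.

```r
# Hypothetical helper: one imputation draw of y given x (lm() as a stand-in model).
impute_once <- function(data, boot = TRUE,
                        uncert = c("none", "normalerror", "pmm")) {
  uncert <- match.arg(uncert)
  miss <- is.na(data$y)
  obs <- data[!miss, , drop = FALSE]
  train <- obs
  # Layer 1: model uncertainty via a standard (case-resampling) bootstrap.
  if (boot) {
    train <- obs[sample.int(nrow(obs), replace = TRUE), , drop = FALSE]
  }
  fit <- lm(y ~ x, data = train)
  pred <- predict(fit, newdata = data[miss, , drop = FALSE])
  # Layer 2: uncertainty added on top of the point predictions.
  if (uncert == "normalerror") {
    pred <- pred + rnorm(length(pred), mean = 0, sd = summary(fit)$sigma)
  } else if (uncert == "pmm") {
    pred_obs <- predict(fit, newdata = obs)
    # Simplified PMM: take the observed y whose prediction is closest.
    donors <- vapply(pred, function(p) unname(which.min(abs(pred_obs - p))),
                     integer(1))
    pred <- obs$y[donors]
  }
  out <- data
  out$y[miss] <- pred
  out
}

set.seed(42)
d <- data.frame(x = rnorm(50))
d$y <- 1 + 2 * d$x + rnorm(50, sd = 0.5)
d$y[sample.int(50, 10)] <- NA
draws <- lapply(1:5, function(i) impute_once(d, boot = TRUE, uncert = "pmm"))
```

Because each draw refits on a different bootstrap sample, the five completed datasets differ from each other, which is where the between-imputation variability comes from.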
method = "gam"
New! Uses mgcv::gam() under the hood, wrapped as custom mlr3 learners. Numeric predictors are automatically wrapped in s() terms, factors stay linear. If you pass a formula via the existing formula parameter, that's used instead (full control over smooth terms). Works for both regression and classification (binary via binomial, multiclass via One-vs-Rest).
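The automatic formula construction could look roughly like this. build_gam_formula() is a hypothetical name for illustration and the actual learner code may differ, e.g. in how it treats integer columns with few distinct values.

```r
# Hypothetical helper: wrap numeric predictors in s(), keep the rest linear.
build_gam_formula <- function(data, target) {
  predictors <- setdiff(names(data), target)
  terms <- vapply(predictors, function(v) {
    if (is.numeric(data[[v]])) sprintf("s(%s)", v) else v
  }, character(1))
  as.formula(paste(target, "~", paste(terms, collapse = " + ")))
}

df <- data.frame(
  y = rnorm(10),
  age = runif(10),
  income = runif(10),
  region = factor(rep(c("a", "b"), 5))
)
f <- build_gam_formula(df, "y")
print(f)
# y ~ s(age) + s(income) + region
```

The resulting formula can be passed to mgcv::gam(f, data = df); a user-supplied formula would simply replace this step.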
method = "robgam"
Also new. Robust GAM with two strategies, controlled via learner_params = list(robgam = list(robust_method = "simple")):
- "simple" (default): fit GAM, identify outliers by residual quantile (alpha = 0.75), refit on clean subset
- "irw": iterative reweighting with Tukey bisquare weights until convergence
Same classification support as gam (binary + multiclass). The alpha parameter and robust_method go through learner_params, so no new top-level parameters.
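To illustrate the two robust strategies, here is a small sketch that uses lm() instead of mgcv::gam() to stay dependency-free. fit_simple() and fit_irw() are hypothetical names, and two details are assumptions rather than the actual implementation: that alpha = 0.75 means "keep observations whose absolute residual is at or below the 75% quantile", and that residuals are scaled by the MAD before computing bisquare weights with tuning constant 4.685.

```r
# Tukey bisquare weights: 1 at zero, falling to 0 for |u| >= c.
bisquare <- function(u, c = 4.685) ifelse(abs(u) < c, (1 - (u / c)^2)^2, 0)

# "simple": fit, drop observations with large absolute residuals, refit.
fit_simple <- function(formula, data, alpha = 0.75) {
  fit <- lm(formula, data = data)
  r <- abs(residuals(fit))
  keep <- r <= quantile(r, alpha)
  lm(formula, data = data[keep, , drop = FALSE])
}

# "irw": iteratively reweighted fit until the coefficients stop changing.
fit_irw <- function(formula, data, max_iter = 50, tol = 1e-6) {
  data$.w <- rep(1, nrow(data))      # weights live in the data so lm() finds them
  beta_old <- NULL
  for (i in seq_len(max_iter)) {
    fit <- lm(formula, data = data, weights = .w)
    r <- residuals(fit)
    data$.w <- bisquare(r / mad(r))  # MAD as robust residual scale
    beta <- coef(fit)
    if (!is.null(beta_old) && max(abs(beta - beta_old)) < tol) break
    beta_old <- beta
  }
  fit
}

set.seed(7)
d <- data.frame(x = seq(0, 10, length.out = 100))
d$y <- 2 + 0.5 * d$x + rnorm(100, sd = 0.3)
d$y[96:100] <- d$y[96:100] + 20      # gross outliers at high x

slope_ols    <- coef(lm(y ~ x, data = d))["x"]
slope_simple <- coef(fit_simple(y ~ x, d))["x"]
slope_irw    <- coef(fit_irw(y ~ x, d))["x"]
```

On this toy data the plain least-squares slope is pulled away from the true value of 0.5 by the five outliers, while both robust fits stay close to it.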
What's next?
The branch is ready for review and for integration into origin/main. In the near future I will soft-deprecate imputeRobustChain, since vimpute with method = "robust" + sequential = TRUE + boot + uncert now covers the same ground (and more). imputeRobust stays as-is for backward compatibility.