Improve ROMM within addNoise by RegSDC methodology #271

@olangsrud

It seems that ROMM within addNoise is implemented in a way that does not preserve sample means. Below I suggest how to fix this and considerably speed up the calculations by using the methodology from a recent paper. See https://github.com/olangsrud/RegSDC (hopefully soon on CRAN). Below, I refer to the functions in that package.

y <- testdata[sample(NROW(testdata), 100), c("expend", "income", "savings")]
addNoise(y, method = "ROMM")$xm

# An almost identical method (see the paper's discussion of the sequential phenomenon for the minor differences) is

RegSDCromm(y, lambda = 0.001, ensureIntercept = FALSE)

# This can be viewed as a high-speed version of the current implementation in addNoise.
# Sample means are preserved by the default method, where ensureIntercept = TRUE.
# Other values of lambda may be used. 

RegSDCromm(y, lambda = 0.001)

# This is equivalent to calling a more general function 

RegSDCgen(y, lambda = 0.001, makeunique = TRUE)

# The parameter makeunique is of minor importance, but it must be TRUE if exact distributional behaviour
# is important (e.g. when sampling from RegSDCromm several times). Otherwise, setting makeunique to FALSE can be OK.

# Feel free to import/wrap functions from RegSDC within sdcMicro.
# However, this line

RegSDCgen(y, lambda = 0.001, makeunique = FALSE)

# can be implemented without using RegSDC by 

lambda <- 0.001
y <- as.matrix(y)
Mean <- function(x) t(matrix(colMeans(x), ncol(x), nrow(x)))
qr1 <- qr(y - Mean(y))
qr1Q <- qr.Q(qr1)
z <- qr1Q + lambda * matrix(rnorm(length(qr1Q)), nrow(y))
qr2 <- qr(z - Mean(z))
Mean(y) + qr.Q(qr2) %*% qr.R(qr1)

# Here Mean can be replaced in several ways. The difference from the result using RegSDCgen is at the 
# level of numerical precision (use set.seed to see).
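
To check the claim about preserved sample means, the base-R steps above can be wrapped in a function and tested on simulated data. This is only a sketch under stated assumptions: `romm_sketch` is a hypothetical name (it is not part of sdcMicro or RegSDC), the simulated matrix stands in for `testdata`, `sweep` is used as one possible replacement for `Mean`, and the data are assumed to have full column rank so that `qr` does not pivot. Because the columns of the second Q factor are orthonormal and sum to zero, both the column means and the covariance matrix should match the original up to numerical precision.

```r
# Hypothetical helper wrapping the base-R steps from this issue;
# not part of sdcMicro or RegSDC.
romm_sketch <- function(y, lambda = 0.001) {
  y <- as.matrix(y)
  # Subtract column means (one possible replacement for Mean() above)
  center <- function(x) sweep(x, 2, colMeans(x))
  qr1 <- qr(center(y))
  q1 <- qr.Q(qr1)
  # Perturb the orthonormal factor with scaled Gaussian noise
  z <- q1 + lambda * matrix(rnorm(length(q1)), nrow(y))
  # Re-orthogonalise the centered, perturbed factor
  q2 <- qr.Q(qr(center(z)))
  # Recombine with the original R factor and add the original means back
  sweep(q2 %*% qr.R(qr1), 2, colMeans(y), "+")
}

set.seed(1)
# Simulated stand-in data (assumption); any full-rank numeric matrix would do
y <- matrix(rnorm(300), 100, 3) %*% matrix(c(2, 1, 0, 0, 3, 1, 0, 0, 1), 3)
y_pert <- romm_sketch(y)

# Both comparisons should hold up to numerical tolerance
all.equal(unname(colMeans(y_pert)), unname(colMeans(y)))
all.equal(unname(cov(y_pert)), unname(cov(y)))
```

The same two comparisons can be run on the output of `addNoise(y, method = "ROMM")$xm` to see whether the current implementation preserves the means.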
