To replace NAs in a large data table quickly, use the := operator from the data.table package, which updates columns by reference instead of copying the table.
First, create a large data table with 100 columns:
library(data.table)
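create_dt <- function(nrow = 5, ncol = 5, propNA = 0.5) {
  # Fill a vector with random numbers, then blank out a proportion of them
  v <- runif(nrow * ncol)
  v[sample(seq_len(nrow * ncol), propNA * nrow * ncol)] <- NA
  # Reshape into ncol columns and convert to a data.table
  data.table(matrix(v, ncol = ncol))
}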
set.seed(123)
dt = create_dt(1e5, 100, 0.1)
dim(dt)
[1] 100000 100
To replace the NAs with zeros, use the following function:
f_NA = function(DT) {
  # Loop over the columns by name; := assigns by reference, so DT is modified in place
  for (i in names(DT))
    DT[is.na(get(i)), (i) := 0]
}
Then call the function on the data table created above:
f_NA(dt)
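This replaces every NA in the table with 0. Because := works by reference, dt itself is updated and there is no need to assign the result back.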
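
As a variation on the same idea, the loop can also be written with data.table's set() function, which likewise assigns by reference and skips the overhead of calling [.data.table once per column. The following is only a minimal sketch: the name f_NA_set and the three-row table small are made up for illustration and are not part of the answer above.

```r
library(data.table)

# Hypothetical toy table, used only to illustrate the idea
small <- data.table(a = c(1, NA, 3), b = c(NA, 5, NA))

# Illustrative variant of f_NA that uses set() instead of `:=`
f_NA_set <- function(DT) {
  for (j in seq_along(DT)) {
    # Rows of column j that are NA; set() overwrites them in place
    set(DT, i = which(is.na(DT[[j]])), j = j, value = 0)
  }
  invisible(DT)
}

f_NA_set(small)

# Confirm that no NA values remain
stopifnot(!anyNA(small))
```

If all columns are numeric, recent versions of data.table also provide setnafill(), which may be worth trying for the same task.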