
Here's a little piece of code I wrote to report variables with missing values from a data frame. I'm trying to think of a more elegant way to do this, one that perhaps returns a data.frame, but I'm stuck:

for (Var in names(airquality)) {
    missing <- sum(is.na(airquality[,Var]))
    if (missing > 0) {
        print(c(Var,missing))
    }
}

Edit: I'm dealing with data.frames with dozens to hundreds of variables, so it's key that we only report variables with missing values.

1 Answer


You can use the sapply function to count the missing values in each column:

sapply(airquality, function(x) sum(is.na(x)))

  Ozone Solar.R    Wind    Temp   Month     Day 
     37       7       0       0       0       0 
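
Since the question asks for only the variables that have missing values, the named vector returned by sapply can be filtered with a logical condition. A minimal sketch, where na_count is just an illustrative name and not part of the original answer:

```r
# Count NAs per column (same call as above), stored under an illustrative name
na_count <- sapply(airquality, function(x) sum(is.na(x)))

# Keep only the columns that have at least one missing value
na_count[na_count > 0]
```

With the counts shown above, this should leave only Ozone and Solar.R.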

You can also sum over the columns of the logical matrix created by is.na(), using either apply or colSums:

apply(is.na(airquality),2,sum)

  Ozone Solar.R    Wind    Temp   Month     Day 
     37       7       0       0       0       0 

colSums(is.na(airquality))

  Ozone Solar.R    Wind    Temp   Month     Day 
     37       7       0       0       0       0 
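
To get what the question actually asks for, a data.frame that lists only the variables with missing values, the colSums result can be filtered and then wrapped in data.frame(). This is a sketch under assumptions: the names na_counts and missing_report and the pct_missing column are illustrative additions, not part of the original answer.

```r
# Count NAs per column using colSums on the logical matrix from is.na()
na_counts <- colSums(is.na(airquality))

# Drop the columns that have no missing values
na_counts <- na_counts[na_counts > 0]

# One row per variable with missing values; pct_missing is an optional extra
missing_report <- data.frame(
  variable    = names(na_counts),
  n_missing   = as.integer(na_counts),
  pct_missing = round(100 * as.numeric(na_counts) / nrow(airquality), 1),
  row.names   = NULL,
  stringsAsFactors = FALSE
)

missing_report
```

Because the filtering happens before the data.frame is built, the report stays short even when the input has hundreds of columns, and it has zero rows when nothing is missing.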
