
I have a data frame and want the mean of each column. The problem is that some columns have a run of zeroes at the bottom, and these need to be ignored.

I can ignore the zeroes and look at one column with:

mean(df$colname[df$colname > 0])

But I want a vector containing every column's mean, computed with sapply. Is there a reliable way to ignore the zeroes inside the sapply call?

1 Answer


You can pass sapply an anonymous function that drops the zeroes from each column before averaging:

sapply(df, function(x) mean(x[x != 0], na.rm = TRUE))
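
To see what this returns, here is a minimal sketch on a made-up data frame. The column names and values are hypothetical, chosen only so the result is easy to check by hand.

```r
# Hypothetical example data: the trailing zeroes stand in for padding
df <- data.frame(a = c(2, 4, 0, 0), b = c(1, 2, 3, 0))

# For each column, keep only the non-zero entries, then take their mean
col_means <- sapply(df, function(x) mean(x[x != 0], na.rm = TRUE))

print(col_means)
```

The result is a named numeric vector with one entry per column. A column that contains nothing but zeroes would come back as NaN, because the mean of an empty vector is NaN.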

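Or, using dplyr (summarise_all still works, although dplyr 1.0 and later mark it as superseded in favour of across):

library(dplyr)
df %>% summarise_all(~ mean(.[. != 0], na.rm = TRUE))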

A more efficient approach is to set all 0 values to NA and use colMeans. Note that this overwrites the zeroes in df itself, and that colMeans requires every column to be numeric:

df[df == 0] <- NA

colMeans(df, na.rm = TRUE)
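
If the original zeroes need to stay in df, the same idea can be applied to a copy. This is a sketch under the assumption that every column is numeric; the data frame and the name tmp are hypothetical.

```r
# Hypothetical example data: the trailing zeroes stand in for padding
df <- data.frame(a = c(2, 4, 0, 0), b = c(1, 2, 3, 0))

# Work on a copy so that df keeps its zeroes
tmp <- df
tmp[tmp == 0] <- NA

# colMeans skips the NA entries when na.rm = TRUE
means <- colMeans(tmp, na.rm = TRUE)

print(means)
```

Here df is left unchanged and only tmp has its zeroes replaced by NA.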

