I have a data frame and want the mean of each column. The issue is that some columns have a run of zeroes at the bottom, and these need to be ignored.

I can ignore the zeroes and look at one column with:

mean(df$colname[df$colname > 0])

But I want a vector of every column's mean, computed with sapply. Is there a reliable way to ignore the zeroes and get these values inside the sapply call?

1 Answer

You can simply use:

sapply(df, function(x) mean(x[x != 0], na.rm = TRUE))
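As a quick check, here is a minimal sketch on a made-up data frame (the columns a, b and z are hypothetical, not from the question). It also shows an edge case worth knowing about: a column that contains only zeroes has nothing left to average, so its mean comes back as NaN.

```r
# Made-up data for illustration; trailing zeroes are the values to ignore.
df <- data.frame(
  a = c(2, 4, 6, 0, 0),
  b = c(1, 3, 0, 0, 0),
  z = c(0, 0, 0, 0, 0)  # a column containing only zeroes
)

# Drop the zeroes from each column before averaging.
col_means <- sapply(df, function(x) mean(x[x != 0], na.rm = TRUE))

# Named numeric vector: one mean per column, NaN for the all-zero column.
print(col_means)
```

If NaN is not acceptable for all-zero columns, you would need to decide on a replacement value yourself, for example with is.nan() on the result.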

Or using dplyr:

library(dplyr)
df %>% summarise_all(~mean(.[. != 0], na.rm = TRUE))

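A more efficient approach is to set all 0 values to NA and use colMeans. Note that this overwrites df (work on a copy if you still need the zeroes) and that colMeans requires every column to be numeric: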

df[df == 0] <- NA

colMeans(df, na.rm = TRUE)
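If the original zeroes must be preserved, one option is to recode a copy instead. The sketch below assumes a small made-up, all-numeric data frame (the names df_na, m_sapply and m_colmeans are mine) and checks that the colMeans result agrees with the sapply version.

```r
# Made-up, all-numeric data for illustration.
df <- data.frame(
  a = c(2, 4, 6, 0, 0),
  b = c(1, 3, 0, 0, 0),
  c = c(5, 5, 5, 5, 0)
)

# Reference result: drop zeroes column by column.
m_sapply <- sapply(df, function(x) mean(x[x != 0], na.rm = TRUE))

# Recode zeroes to NA on a copy so that df itself is left unchanged.
df_na <- df
df_na[df_na == 0] <- NA
m_colmeans <- colMeans(df_na, na.rm = TRUE)

# Both approaches should give the same named vector of means.
print(m_colmeans)
print(all.equal(m_sapply, m_colmeans))
```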

