
Is there a way to instruct dplyr to use summarise_each with na.rm=TRUE? I would like to take the mean of variables with summarise_each("mean"), but I don't know how to tell it to ignore missing values.

1 Answer


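summarise_each is deprecated, so use summarise_all instead and set na.rm = TRUE inside a formula such as ~ mean(., na.rm = TRUE). Note that in dplyr 1.0.0 and later, summarise_all is itself superseded by summarise() combined with across().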

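For example, to compute the minimum, maximum, and standard deviation of every column per species while ignoring missing values: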

library(dplyr)

by_species <- iris %>% group_by(Species)

by_species %>%
  summarise_all(list(
    minimum = ~ min(., na.rm = TRUE),
    maximum = ~ max(., na.rm = TRUE),
    s_dev   = ~ sd(., na.rm = TRUE)
  ))

# A tibble: 3 x 13
  Species Sepal.Length_mi~ Sepal.Width_min~ Petal.Length_mi~ Petal.Width_min~ Sepal.Length_ma~
  <fct>              <dbl>            <dbl>            <dbl>            <dbl>            <dbl>
1 setosa               4.3              2.3              1                0.1              5.8
2 versic~              4.9              2                3                1                7
3 virgin~              4.9              2.2              4.5              1.4              7.9
# ... with 7 more variables: Sepal.Width_maximum <dbl>, Petal.Length_maximum <dbl>,
#   Petal.Width_maximum <dbl>, Sepal.Length_s_dev <dbl>, Sepal.Width_s_dev <dbl>,
#   Petal.Length_s_dev <dbl>, Petal.Width_s_dev <dbl>
