
I'm having trouble with a data frame and couldn't resolve the issue myself.
The data frame has arbitrary properties as columns, and each row represents one data set.

The question is:
How can I get rid of the columns where the value is NA for ALL rows?

1 Answer


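To remove the columns of a data frame in which every value is NA, you can use the select_if function from the dplyr package. select_if keeps the columns for which a predicate function returns TRUE. (It has been superseded since dplyr 1.0.0 but still works.) Take the following data frame as an example: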

df <- data.frame(x = 1:10, y = c(1, 2, NA, 4, 5, NA, 7, 8, 4, NA), z = rep(NA, 10))

df

    x  y  z
1   1  1 NA
2   2  2 NA
3   3 NA NA
4   4  4 NA
5   5  5 NA
6   6 NA NA
7   7  7 NA
8   8  8 NA
9   9  4 NA
10 10 NA NA

To remove column ‘z’ (all values NA):

library(dplyr)

# TRUE if the column has at least one non-NA value, i.e. the column should be kept
not_all_na <- function(x) any(!is.na(x))

df %>% select_if(not_all_na)

    x  y
1   1  1
2   2  2
3   3 NA
4   4  4
5   5  5
6   6 NA
7   7  7
8   8  8
9   9  4
10 10 NA
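With dplyr 1.0.0 or later, the same selection can be written with select() and where(), which is the form the dplyr documentation now recommends over select_if. The following is a minimal sketch that assumes such a dplyr version is installed; the name cleaned is only illustrative.

```r
library(dplyr)

# Same example data as above
df <- data.frame(x = 1:10,
                 y = c(1, 2, NA, 4, 5, NA, 7, 8, 4, NA),
                 z = rep(NA, 10))

# where() applies the predicate to every column and keeps
# the columns for which it returns TRUE
cleaned <- df %>% select(where(function(x) !all(is.na(x))))

names(cleaned)
```

Column y is kept because it still contains some non-NA values; only z, which is NA in every row, is dropped.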

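You can also do this in base R with the lapply function. Adding drop = FALSE makes sure the result stays a data frame even when only one column is left:

df[, which(unlist(lapply(df, function(x) !all(is.na(x))))), drop = FALSE]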

    x  y
1   1  1
2   2  2
3   3 NA
4   4  4
5   5  5
6   6 NA
7   7  7
8   8  8
9   9  4
10 10 NA
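If you prefer not to load any package, two shorter base R variants are possible. This is a sketch rather than the only way to do it; the names keep, cleaned and cleaned_filter are made up for illustration.

```r
# Same example data as above
df <- data.frame(x = 1:10,
                 y = c(1, 2, NA, 4, 5, NA, 7, 8, 4, NA),
                 z = rep(NA, 10))

# Variant 1: count the non-NA values per column and keep columns with at least one
keep <- colSums(!is.na(df)) > 0
cleaned <- df[, keep, drop = FALSE]  # drop = FALSE keeps the result a data frame

# Variant 2: Filter() keeps the columns for which the predicate returns TRUE
cleaned_filter <- Filter(function(x) !all(is.na(x)), df)

names(cleaned)
names(cleaned_filter)
```

Both variants drop only the columns that are NA in every row. Columns with some NA values, like y, are kept.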
