
I want to know how to omit NA values in a data frame, but only in the columns I am interested in.

For example,

DF <- data.frame(x = c(1, 2, 3), y = c(0, 10, NA), z=c(NA, 33, 22))

but I only want to omit the rows where y is NA, so the result should be:

  x  y  z
1 1  0 NA
2 2 10 33

na.omit seems to delete every row that contains any NA.

Can somebody help me with this simple question?

Now suppose I change the question to:

DF <- data.frame(x = c(1, 2, 3, NA), y = c(1, 0, 10, NA), z = c(43, NA, 33, NA))

If I want to omit only the rows where x is NA or z is NA, where do I put the | in the function call?

1 Answer


To omit rows that have NA in a specific column, you can use any of the following methods:

Using the is.na() function:

> DF <- data.frame(x = c(1, 2, 3), y = c(0, 10, NA), z = c(NA, 33, 22))
> DF
  x  y  z
1 1  0 NA
2 2 10 33
3 3 NA 22
> DF[!is.na(DF$y), ]
  x  y  z
1 1  0 NA
2 2 10 33
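
To make the is.na() approach easier to follow, here is a minimal sketch that splits it into two steps. The names na_in_y and kept are only illustrative and are not part of the original answer.

```r
DF <- data.frame(x = c(1, 2, 3), y = c(0, 10, NA), z = c(NA, 33, 22))

# is.na() returns one logical value per element of the column;
# only the third element of y is missing
na_in_y <- is.na(DF$y)

# Negate the mask to keep the rows where y is present. Leaving the slot
# after the comma empty keeps every column, so the NA in z stays.
kept <- DF[!na_in_y, ]
```

Because the mask is built from y alone, missing values in the other columns have no effect on which rows are kept.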

Using the drop_na() function from the tidyr package:

> library(tidyr)
> DF %>% drop_na(y)
  x  y  z
1 1  0 NA
2 2 10 33

Using complete.cases():

> DF[complete.cases(DF[, "y"]), ]
  x  y  z
1 1  0 NA
2 2 10 33
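
For the follow-up case with two columns, the same ideas should carry over. The sketch below is one possible way to do it in base R, using the second DF from the question. The names res_or and res_cc are only illustrative.

```r
DF <- data.frame(x = c(1, 2, 3, NA), y = c(1, 0, 10, NA), z = c(43, NA, 33, NA))

# Drop a row when x OR z is NA: the | goes between the two is.na() tests,
# and the combined condition is negated as a whole
res_or <- DF[!(is.na(DF$x) | is.na(DF$z)), ]

# Same filter with complete.cases(), restricted to the two columns of interest
res_cc <- DF[complete.cases(DF[, c("x", "z")]), ]
```

Both versions keep only the rows where x and z are both present. With tidyr loaded, drop_na(DF, x, z) should give the same rows, since drop_na() accepts several column names.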

 
