Intellipaat Back

Explore Courses Blog Tutorials Interview Questions
0 votes
6 views
in R Programming by (7.3k points)
edited by

There are a lot of posts about replacing NA values. I am aware that one could replace NAs in the following table/frame with the following:

x[is.na(x)]<-0

But, what if I want to restrict it to only certain columns? Let's show you an example.

First, let's start with a dataset.

set.seed(1234)

x <- data.frame(a=sample(c(1,2,NA), 10, replace=T),

                b=sample(c(1,2,NA), 10, replace=T), 

                c=sample(c(1:5,NA), 10, replace=T))

Which gives:

    a  b  c

1   1 NA  2

2   2  2  2

3   2  1  1

4   2 NA  1

5  NA  1  2

6   2 NA  5

7   1  1  4

8   1  1 NA

9   2  1  5

10  2  1  1

Ok, so I only want to restrict the replacement to columns 'a' and 'b'. My attempt was:

x[is.na(x), 1:2]<-0

and:

x[is.na(x[1:2])]<-0

Which does not work.

My data.table attempt, where y<-data.table(x), was obviously never going to work:

y[is.na(y[,list(a,b)]), ]

I want to pass columns inside the is.na argument but that obviously wouldn't work.

I would like to do this in a data.frame and a data.table. My end goal is to recode the 1:2 to 0:1 in 'a' and 'b' while keeping 'c' the way it is since it is not a logical variable. I have a bunch of columns so I don't want to do it one by one. And, I'd just like to know how to do this.

Do you have any suggestions?

1 Answer

0 votes
by
edited by

To replace NA values in a data table for selected columns, you can use the following syntax:

For data.frame:

set.seed(1234)

x <- data.frame(a=sample(c(1,2,NA), 10, replace=TRUE),

                b=sample(c(1,2,NA), 10, replace=TRUE), 

                c=sample(c(1:5,NA), 10, replace=TRUE))

x

    a  b  c

1   2  2  5

2   2  2  2

3   1 NA  5

4  NA  2  2

5   1  2 NA

6   1  2  3

7   2  2  4

8   2  1  4

9  NA NA  3

10  2 NA  1

x[c("a", "b")][is.na(x[c("a", "b")])] <- 0

x

   a b  c

1  2 2  5

2  2 2  2

3  1 0  5

4  0 2  2

5  1 2 NA

6  1 2  3

7  2 2  4

8  2 1  4

9  0 0  3

10 2 0  1

For data.table:

y<-data.table(x)

> for (col in c("a", "b")) y[is.na(get(col)), (col) := 0]

> y

    a b  c

 1: 2 2  5

 2: 2 2  2

 3: 1 0  5

 4: 0 2  2

 5: 1 2 NA

 6: 1 2  3

 7: 2 2  4

 8: 2 1  4

 9: 0 0  3

10: 2 0  1

Related questions

0 votes
1 answer
0 votes
1 answer
0 votes
1 answer

31k questions

32.8k answers

501 comments

693 users

Browse Categories

...