
I am having trouble with leading and trailing whitespace in a data.frame. For example, I want to look at a specific row in a data.frame based on a certain condition:

> myDummy[myDummy$country == c("Austria"),c(1,2,3:7,19)]
[1] codeHelper     country        dummyLI        dummyLMI       dummyUMI
[6] dummyHInonOECD dummyHIOECD    dummyOECD
<0 rows> (or 0-length row.names)

I was wondering why I didn't get the expected output, since the country Austria obviously exists in my data.frame. After looking through my code history and trying to figure out what went wrong, I tried:

> myDummy[myDummy$country == c("Austria "),c(1,2,3:7,19)]
   codeHelper  country dummyLI dummyLMI dummyUMI dummyHInonOECD dummyHIOECD
18        AUT Austria        0        0        0              0           1
   dummyOECD
18         1

The only thing I changed in the command is the additional whitespace after Austria.

Further annoying problems obviously arise, for example when I want to merge two data frames on the country column. One data.frame uses "Austria " while the other has "Austria", so the matching doesn't work.

Is there a nice way to 'show' the whitespace on my screen so that I am aware of the problem?

And can I remove the leading and trailing whitespace in R?

So far I have used a simple Perl script to remove the whitespace, but it would be nice if I could do it inside R.

1 Answer

To trim leading and trailing whitespace, you can use the str_trim() function from the stringr package as follows:

install.packages("stringr", dependencies=TRUE)
library(stringr)

To create a data frame whose values contain leading and trailing whitespace:

anim <- c(" Hi "," i  ","Am","sam ","from ","Abs ")
sex  <- c(1,2,2,1,2,1)
wt   <- c(0.8,1.2,1.0,2.0,1.8,1.4)
data <- data.frame(anim,sex,wt)
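
The question also asks how to make the whitespace visible. A data frame prints strings without quotes by default, so padding is easy to miss. The following is a minimal base R sketch of a few ways to expose it; the variable name has_ws is only illustrative, and the example vector is redefined so the snippet runs on its own.

```r
# Same example values as above, redefined so this snippet is self-contained
anim <- c(" Hi ", " i  ", "Am", "sam ", "from ", "Abs ")

# Printing with quotes shows exactly where each string starts and ends
print(anim, quote = TRUE)

# Flag values that begin or end with whitespace
has_ws <- grepl("^\\s+|\\s+$", anim)
anim[has_ws]

# Character counts also reveal hidden padding ("Am" is 2, " Hi " is 4)
nchar(anim)
```

For a whole data frame, print(data, quote = TRUE) should show the same quoting for every string column.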

To trim the whitespace:

data$anim <- str_trim(data$anim)
data

Output:

  anim sex  wt
1   Hi   1 0.8
2    i   2 1.2
3   Am   2 1.0
4  sam   1 2.0
5 from   2 1.8
6  Abs   1 1.4
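
If you would rather not install a package, base R (3.2.0 and later) provides trimws(), which could be used the same way. The sketch below applies it to the merge problem from the question; the two small data frames a and b are made up for illustration and only mimic the "Austria " versus "Austria" mismatch.

```r
# Hypothetical frames: 'a' carries a trailing space in one country name
a <- data.frame(country = c("Austria ", "Belgium"),
                dummyOECD = c(1, 1),
                stringsAsFactors = FALSE)
b <- data.frame(country = c("Austria", "Belgium"),
                codeHelper = c("AUT", "BEL"),
                stringsAsFactors = FALSE)

# Before trimming, "Austria " does not match "Austria"
before <- merge(a, b, by = "country")

# Trim every character column of 'a' with base R's trimws()
is_chr <- vapply(a, is.character, logical(1))
a[is_chr] <- lapply(a[is_chr], trimws)

# After trimming, both countries match
after <- merge(a, b, by = "country")
nrow(before)
nrow(after)
```

trimws() also accepts which = "left" or which = "right" if only one side should be stripped.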
