I have a data frame as follows:

+-----+-------+
|  V1 |  V2   |
+-----+-------+
|  1  | a,b,c |
|  2  | a,c   |
|  3  | b,d   |
|  4  | e,f   |
|  .  | .     |
+-----+-------+

Each value in V2 is a string of letters separated by commas. I would like to split V2 on each comma and insert the resulting pieces as new rows, repeating the matching V1 value for each piece. For instance, the desired output is:

+----+----+
| V1 | V2 |
+----+----+
|  1 |  a |
|  1 |  b |
|  1 |  c |
|  2 |  a |
|  2 |  c |
|  3 |  b |
|  3 |  d |
|  4 |  e |
|  4 |  f |
+----+----+

I tried using strsplit() to split V2 first and then casting the resulting list into a data frame, but it didn't work. Any help will be appreciated.

1 Answer

To split delimited strings in a column and insert them as new rows, you can use the strsplit() function as follows:

df <- data.frame(V1 = c(1, 2, 3, 4), V2 = c("a,b,c", "a,c", "b,d", "e,f"), stringsAsFactors = FALSE)

> df
  V1    V2
1  1 a,b,c
2  2   a,c
3  3   b,d
4  4   e,f

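strsplit() returns a list with one character vector per row, and those vectors have different lengths, so the list cannot be cast to a data frame directly. Instead, repeat each V1 value once for every piece in its row and flatten the list with unlist():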

s <- strsplit(df$V2, split = ",")

data.frame(V1 = rep(df$V1, sapply(s, length)), V2 = unlist(s))

Output:

  V1 V2
1  1  a
2  1  b
3  1  c
4  2  a
5  2  c
6  3  b
7  3  d
8  4  e
9  4  f
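
If you need this more than once, the same idea can be wrapped in a small function. The following is only a sketch in base R: the name split_rows() and its arguments are made up for illustration and are not part of the original answer. It assumes the separator is a literal string (not a regular expression), and it also trims spaces around the pieces, so an input such as "a, c" is handled as well.

```r
# Hypothetical helper (not from the original answer): split one delimited
# column of a data frame into one row per value, using base R only.
split_rows <- function(df, col, sep = ",") {
  # strsplit() returns a list with one character vector per input row
  parts <- strsplit(as.character(df[[col]]), split = sep, fixed = TRUE)
  # lengths() counts the pieces per row, i.e. how often to repeat that row
  idx <- rep(seq_len(nrow(df)), lengths(parts))
  out <- df[idx, , drop = FALSE]
  # trimws() drops stray spaces around values such as "a, c"
  out[[col]] <- trimws(unlist(parts))
  rownames(out) <- NULL
  out
}

df <- data.frame(V1 = c(1, 2, 3, 4),
                 V2 = c("a,b,c", "a, c", "b,d", "e,f"),
                 stringsAsFactors = FALSE)
result <- split_rows(df, "V2")
print(result)
```

Because the function indexes whole rows rather than rebuilding the data frame column by column, any other columns in df are carried along unchanged. Note that a row whose value is an empty string produces no pieces and is therefore dropped from the result.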
