
Suppose I have a vector of strings: string = c("G1:E001", "G2:E002", "G3:E003"). I would like to get a vector that contains only the part of each string after the colon ":", i.e. substring = c("E001", "E002", "E003"). Is there a convenient way to do this in R? Using substr? Thanks!

1 Answer


To extract a substring from a string according to a pattern, you can use the following functions:

string = c("G1:E001", "G2:E002", "G3:E003")

 substring(string, 4)

[1] "E001" "E002" "E003"

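This extracts everything from the fourth character onward, which works here only because the colon is the third character of every element. If the length of the prefix can vary, find the position of the colon with regexpr and start one character after it: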

 substring(string, regexpr(":", string) + 1)

[1] "E001" "E002" "E003"
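The regexpr approach above can be wrapped in a small function that also handles elements with no colon. This is only a sketch: the name after_delim and its delim argument are hypothetical, not part of base R, and returning NA for elements without the delimiter is an assumption about the desired behaviour.

```r
# Hypothetical helper: return the text after the first occurrence of a
# literal delimiter in each element of x (NA where the delimiter is absent).
after_delim <- function(x, delim = ":") {
  pos <- as.integer(regexpr(delim, x, fixed = TRUE))  # -1 where not found
  out <- substring(x, pos + nchar(delim))
  out[pos == -1L] <- NA_character_
  out
}

string <- c("G1:E001", "G22:E002", "G3:E003", "no_colon")
result <- after_delim(string)
result
# [1] "E001" "E002" "E003" NA
```

Because the position is looked up per element, the prefix "G22" is handled even though it is longer than the others.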

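You can also use the separate function from the tidyr package to split the column at the colon, and then extract the second part as a vector with pull from dplyr:

library(tidyr)
library(dplyr)

df <- data.frame(string)

df %>%
   separate(string, into = c("pre", "post"), sep = ":") %>%
   pull(post)

[1] "E001" "E002" "E003"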
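If loading extra packages is not desirable, the same split-and-select idea can be sketched in base R with strsplit. The variable names parts and post are chosen here for illustration, and the sketch assumes every element contains at most one colon.

```r
string <- c("G1:E001", "G2:E002", "G3:E003")

# Split each element at the literal colon; the result is a list of
# character vectors of the form c(prefix, suffix).
parts <- strsplit(string, ":", fixed = TRUE)

# Take the second piece of each split (NA if there is no second piece).
post <- vapply(parts, function(p) p[2], character(1))
post
# [1] "E001" "E002" "E003"
```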

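You can also use the str_extract function from the stringr package. Note that the pattern below matches an "E" followed by one or more digits, so it relies on every suffix having that form rather than on the position of the colon:

library(stringr)

str_extract(string = string, pattern = "E[0-9]+")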

[1] "E001" "E002" "E003"
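A pattern tied to the colon rather than to the shape of the suffix may be more robust. The sketch below uses base R's regexpr with a Perl-style lookbehind, together with regmatches, so it needs no extra packages. Be aware that regmatches silently drops elements that do not match, which may or may not be what you want.

```r
string <- c("G1:E001", "G2:E002", "G3:E003")

# "(?<=:).*" matches everything that follows a colon; lookbehind requires
# perl = TRUE in base R.
m <- regexpr("(?<=:).*", string, perl = TRUE)
after_colon <- regmatches(string, m)
after_colon
# [1] "E001" "E002" "E003"
```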
