
I am trying to use grep to test whether any strings from one vector are present in another vector, and to output the values that are present (the matching patterns).

I have a data frame like this:

FirstName Letter
Alex      A1
Alex      A6
Alex      A7
Bob       A1
Chris     A9
Chris     A6

I have a vector of string patterns to be found in the "Letter" column, for example: c("A1", "A9", "A6").

I would like to check whether any of the strings in the pattern vector are present in the "Letter" column. If they are, I would like to output the unique matching values.

The problem is, I don't know how to use grep with multiple patterns. I tried:

matches <- unique(
    grep("A1| A9 | A6", myfile$Letter, value=TRUE, fixed=TRUE)
)

But it gives me 0 matches, which is not correct. Any suggestions?

1 Answer


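To match multiple patterns with grep, join them with | into a single regular expression. Your attempt returns nothing for two reasons: fixed=TRUE makes grep treat the whole string "A1| A9 | A6" as literal text instead of a regular expression, and the spaces around | are part of the pattern, so even as a regex it would look for " A9 " and " A6" with the spaces included. Drop fixed=TRUE and the spaces: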

df <- data.frame(FirstName = c("Alex", "Alex", "Alex", "Bob", "Chris", "Chris"),
                 Letter = c("A1", "A6", "A7", "A1", "A9", "A6"))
patterns <- c("A1", "A9", "A6")

# | separates the alternatives; no spaces, and no fixed=TRUE, so it is read as a regex
matches <- unique(grep("A1|A9|A6", df$Letter, value = TRUE))
matches
[1] "A1" "A6" "A9"
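One thing to watch for: an unanchored regular expression matches substrings, so "A1" would also match a value such as "A10". The sketch below is only an illustration; the vector letters_seen and its "A10" entry are hypothetical and not part of the original data. It compares the loose regex with an anchored regex and with an exact comparison using %in%.

```r
# Hypothetical values; "A10" is added only to show the substring pitfall.
letters_seen <- c("A1", "A6", "A7", "A10", "A9", "A6")
patterns <- c("A1", "A9", "A6")

# Unanchored regex "A1|A9|A6": "A1" also matches inside "A10".
loose <- unique(grep(paste(patterns, collapse = "|"), letters_seen, value = TRUE))

# Anchored regex "^(A1|A9|A6)$": the whole string has to match one alternative.
anchored <- paste0("^(", paste(patterns, collapse = "|"), ")$")
strict <- unique(grep(anchored, letters_seen, value = TRUE))

# Exact comparison with %in%; no regular expression involved.
exact <- unique(letters_seen[letters_seen %in% patterns])

print(loose)
print(strict)
print(exact)
```

If the patterns are plain strings that must match the whole value, %in% is probably the simplest choice; the regex form is useful when the patterns really are regular expressions.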

You can also use the filter function from the dplyr package as follows:

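library(dplyr)

# Build the regex "A1|A9|A6" from the pattern vector and keep the matching rows
result <- filter(df, grepl(paste(patterns, collapse = "|"), Letter))
result
  FirstName Letter
1      Alex     A1
2      Alex     A6
3       Bob     A1
4     Chris     A9
5     Chris     A6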
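If you would rather not load dplyr, the same row selection can be sketched in base R with logical indexing. This is a minimal sketch that reuses the df and patterns from above; note that base subsetting keeps the original row names, so the row labels differ from the dplyr output.

```r
df <- data.frame(FirstName = c("Alex", "Alex", "Alex", "Bob", "Chris", "Chris"),
                 Letter = c("A1", "A6", "A7", "A1", "A9", "A6"))
patterns <- c("A1", "A9", "A6")

# grepl returns one TRUE/FALSE per row; use it to index the rows of df.
keep <- grepl(paste(patterns, collapse = "|"), df$Letter)
result <- df[keep, ]

print(result)
```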
