
I have the following data frame:

data <- data.frame(
  text = c(
    "PARACETAMOL/CODEINE",
    "PSEUDOEPH/PARACET/CODEINE",
    "PARACETAMOL/CODEINE/DOXYLAMINE",
    "CODEINE & ASPIRIN",
    "CODEINE/PARACETAMOL"
  ),
  stringsAsFactors = FALSE
)

I want to return the position of CODEINE within each string, i.e.:

text                             position
PARACETAMOL/CODEINE                     2
PSEUDOEPH/PARACET/CODEINE               3
PARACETAMOL/CODEINE/DOXYLAMINE          2
CODEINE & ASPIRIN                       1
CODEINE/PARACETAMOL                     1

I would prefer a dplyr solution that can run over hundreds of rows.

Any help would be greatly appreciated.

1 Answer


One simple approach is to split each string into words with a regex and then find the index of the word equal to "CODEINE". Note that purrr::map_int() expects exactly one value per row, so this assumes "CODEINE" appears exactly once in every string.

library(dplyr)

data %>%
  mutate(text1 = stringr::str_extract_all(text, "\\w+"),
         position = purrr::map_int(text1, ~which(.x == "CODEINE"))) %>%
  select(-text1)

#                            text position
#1            PARACETAMOL/CODEINE        2
#2      PSEUDOEPH/PARACET/CODEINE        3
#3 PARACETAMOL/CODEINE/DOXYLAMINE        2
#4              CODEINE & ASPIRIN        1
#5            CODEINE/PARACETAMOL        1
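
If some rows might not contain "CODEINE" at all, or might contain it more than once, the which() call above would return zero or several values and map_int() would stop with an error. The following is a minimal sketch of a more tolerant variant, not part of the original answer: it swaps which() for base R's match(), which returns the first matching index or NA. The helper name word_position and the three sample rows are made up for illustration.

```r
library(dplyr)

# Hypothetical helper (name is an assumption): 1-based index of the first
# word equal to `target` in each string, or NA when the word is absent.
word_position <- function(text, target) {
  # Split every string into runs of word characters (list of character vectors)
  words <- stringr::str_extract_all(text, "\\w+")
  # match() gives the first matching index, or NA_integer_ if there is none
  vapply(words, function(w) match(target, w), integer(1))
}

# Illustrative rows only: one normal case, one without CODEINE, one with it twice
sample_data <- data.frame(
  text = c(
    "PARACETAMOL/CODEINE",
    "ASPIRIN/CAFFEINE",
    "CODEINE/IBUPROFEN/CODEINE"
  ),
  stringsAsFactors = FALSE
)

result <- sample_data %>%
  mutate(position = word_position(text, "CODEINE"))
```

Under these assumptions, rows without the word get NA instead of raising an error, and rows with repeated occurrences report only the first position. Whether that is the right behaviour depends on how such rows should be treated in your data.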
