See the code below:
library(tidyverse)
df1 %>%
mutate(common_words = map2(strsplit(X2, ", "),
strsplit(X3, ", "),
intersect),
count = lengths(common_words))
# X1 X2 X3
#1 1 100020003001, 100020003002, 100020003003 100020003001, 100020003002, 100020003004
#2 2 100020003001, 100020004002, 100020004003 100020003001, 100020004007, 100020004009
#3 3 100050006003, 100050006001, 100050006001 100050006011, 100050006013, 100050006021
# common_words count
#1 100020003001, 100020003002 2
#2 100020003001 1
#3 0
Here, we are splitting the 'X2', 'X3' columns by the ‘,’,
Then, getting the intersect of corresponding list elements with map2 and to ‘count ’ the number of elements in the list, we are using lengths.
Or we can also use base R
df1$common_words <- Map(intersect, strsplit(df1$X2, ", "), strsplit(df1$X3, ", "))
df1$count <- lengths(df1$common_words)
data
df1 <- structure(list(X1 = 1:3, X2 = c("100020003001, 100020003002, 100020003003",
"100020003001, 100020004002, 100020004003", "100050006003,
100050006001, 100050006001"
), X3 = c("100020003001, 100020003002, 100020003004", "100020003001,
100020004007, 100020004009",
"100050006011, 100050006013, 100050006021")), class = "data.frame",
row.names = c("c1", "c2", "c3"))
If you wish to learn more about how to use python for data science, then go through data science python programming course by Intellipaat for more insights.