I am trying to compare two columns, sar_details_sent_norm_trigrams_ (in df_sar) and caap_details_sent_norm_trigrams_ (in df_caap), using pandas. Each data frame has other columns as well, but these are the two I am comparing.
I essentially want to keep the records where the values in the two columns are the same. I've tried a couple of approaches, but I keep getting the following error message:
TypeError: unhashable type: 'set'
So I need to either understand why I am getting this error and fix it, or try another approach. Any advice would be greatly appreciated.
Thanks.
Code snippet:
# Set of unique trigrams per row (SAR)
df_sar['sar_details_sent_norm_trigrams_unique'] = df_sar['sar_details_sent_norm_trigrams_'].apply(lambda x: {trigram for sent in x for trigram in sent})
# Set of unique trigrams per row (CAAP)
df_caap['caap_details_sent_norm_trigrams_unique'] = df_caap['caap_details_sent_norm_trigrams_'].apply(lambda x: {trigram for sent in x for trigram in sent})
# Attempt 1: keep CAAP rows whose trigram set also appears in SAR
df_caap[df_caap.caap_details_sent_norm_trigrams_unique.isin(df_sar.sar_details_sent_norm_trigrams_unique)]
# Attempt 2: intersect the two columns as sets of sets
set(df_caap.caap_details_sent_norm_trigrams_unique).intersection(set(df_sar.sar_details_sent_norm_trigrams_unique))
Traceback:
TypeError                                 Traceback (most recent call last)
 in ()
     21 set(df1.columns).intersection(set(df2.columns))
     22
---> 23 set(df_caap.caap_details_sent_norm_trigrams_unique).intersection(set(df_sar.sar_details_sent_norm_trigrams_unique))
TypeError: unhashable type: 'set'
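For reference, here is a minimal sketch of what seems to be going on and one possible workaround. A Python set is mutable and therefore not hashable, so it cannot be an element of another set, which is what Attempt 2 asks for. Storing each row's trigrams as a frozenset (immutable and hashable) might avoid the error. The sample rows and the helper name unique_trigrams are made up for illustration and are not my real data.

```python
import pandas as pd

# Hypothetical sample data: each cell is a list of sentences,
# and each sentence is a list of trigram strings.
df_sar = pd.DataFrame({
    "sar_details_sent_norm_trigrams_": [
        [["a b c", "b c d"]],
        [["x y z"]],
    ]
})
df_caap = pd.DataFrame({
    "caap_details_sent_norm_trigrams_": [
        [["b c d", "a b c"]],  # same trigrams as the first SAR row, different order
        [["q r s"]],
    ]
})


def unique_trigrams(sentences):
    """Flatten the sentences into a frozenset, which is hashable (a plain set is not)."""
    return frozenset(trigram for sent in sentences for trigram in sent)


df_sar["sar_details_sent_norm_trigrams_unique"] = (
    df_sar["sar_details_sent_norm_trigrams_"].apply(unique_trigrams)
)
df_caap["caap_details_sent_norm_trigrams_unique"] = (
    df_caap["caap_details_sent_norm_trigrams_"].apply(unique_trigrams)
)

# Attempt 2, now with hashable values: trigram sets present in both columns.
common = set(df_caap["caap_details_sent_norm_trigrams_unique"]) & set(
    df_sar["sar_details_sent_norm_trigrams_unique"]
)

# Attempt 1 rewritten as an explicit membership test per row.
mask = df_caap["caap_details_sent_norm_trigrams_unique"].apply(lambda s: s in common)
matches = df_caap[mask]
```

In this sketch, matches holds only the CAAP rows whose whole trigram set also occurs in some SAR row. The comparison is exact equality of the sets, so it ignores trigram order and duplicates within a row.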