I am trying to compare two columns, sar_details_sent_norm_trigrams_ and caap_details_sent_norm_trigrams_, in a Pandas data frame. There are other columns as well, but these are the two I am comparing.

Essentially, I want to keep the records where the text values in the two columns are the same. I've tried a couple of approaches, but I keep getting the following error message:

TypeError: unhashable type: 'set'

So I either need to work out why I am getting this error and fix it, or try another approach. Any advice would be greatly appreciated.


Code snippet:

# Flatten the per-sentence trigram lists into one set of unique trigrams per row (SAR)
df_sar['sar_details_sent_norm_trigrams_unique'] = df_sar['sar_details_sent_norm_trigrams_'].apply(lambda x: {trigram for sent in x for trigram in sent})

# Flatten the per-sentence trigram lists into one set of unique trigrams per row (CAAP)
df_caap['caap_details_sent_norm_trigrams_unique'] = df_caap['caap_details_sent_norm_trigrams_'].apply(lambda x: {trigram for sent in x for trigram in sent})

#Attempt 1: 


#Attempt 2:


TypeError Traceback (most recent call last) in ()
     21 set(df1.columns).intersection(set(df2.columns))
     22
---> 23 set(df_caap.caap_details_sent_norm_trigrams_unique).intersection(set(df_sar.sar_details_sent_norm_trigrams_unique))

TypeError: unhashable type: 'set'

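A set is mutable, so Python does not allow it to be hashed: if a set's contents changed after it was stored in a hash table, its hash would no longer match the slot it was stored under, which would break the hash table's invariant. In your code every cell of the ..._unique columns is itself a set, so set(df_caap.caap_details_sent_norm_trigrams_unique) tries to hash each of those sets and fails. I suggest using frozenset instead, which is immutable and hashable, so it can be an element of another set or a key in a dictionary.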
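As a minimal sketch of that suggestion, the snippet below builds the unique-trigram columns with frozenset and then runs the same intersection as on line 23 of the traceback. The toy data, the helper name unique_trigrams, and the final filtering step are assumptions for illustration; only the column names come from the question. Note that the intersection matches rows whose whole trigram sets are equal, not rows that merely share some trigrams.

```python
import pandas as pd

# Toy data (an assumption for illustration): each cell holds a list of
# sentences, and each sentence is a list of trigram strings.
df_sar = pd.DataFrame({
    "sar_details_sent_norm_trigrams_": [
        [["a b c", "b c d"], ["a b c"]],
        [["x y z"]],
    ]
})
df_caap = pd.DataFrame({
    "caap_details_sent_norm_trigrams_": [
        [["b c d"], ["a b c"]],
        [["q r s"]],
    ]
})


def unique_trigrams(sentences):
    """Flatten the sentences into an immutable, hashable set of trigrams."""
    return frozenset(trigram for sent in sentences for trigram in sent)


df_sar["sar_details_sent_norm_trigrams_unique"] = (
    df_sar["sar_details_sent_norm_trigrams_"].apply(unique_trigrams)
)
df_caap["caap_details_sent_norm_trigrams_unique"] = (
    df_caap["caap_details_sent_norm_trigrams_"].apply(unique_trigrams)
)

# frozensets are hashable, so each column can now be turned into a set.
common = set(df_caap["caap_details_sent_norm_trigrams_unique"]).intersection(
    set(df_sar["sar_details_sent_norm_trigrams_unique"])
)

# Keep the CAAP records whose trigram set also appears in the SAR column.
mask = df_caap["caap_details_sent_norm_trigrams_unique"].apply(
    lambda s: s in common
)
matching_caap = df_caap[mask]
```

With this toy data, only the first CAAP row survives, because its trigram set equals the first SAR row's set even though the sentences are ordered differently. The same mask can be built for df_sar if you want the matching records from that side too.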

