I have a dataframe that looks like the following:
ip_address malware_type
ip_1 malware_1
ip_2 malware_2
ip_1 malware_1
ip_1 malware_1
ip_1 malware_2
ip_2 malware_2
ip_2 malware_3
.
.
.
I want to drop duplicate rows based on the 'ip_address' column. However, when the dropping occurs, I want to keep only the 'malware_type' value that is the most frequent for each IP. So the resulting dataframe should look like this:
ip_address malware_type
ip_1 malware_1
ip_2 malware_2
.
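
To make the desired behaviour concrete, here is a minimal sketch of one possible approach, assuming the data is a pandas DataFrame with exactly the two columns shown. The sample rows are only the seven visible above, and the helper name `most_frequent` is made up for illustration. It groups by 'ip_address' and keeps the most common 'malware_type' in each group. If two values are tied for an IP, which one is returned should be treated as arbitrary.

```python
import pandas as pd

# Sample data: only the seven rows visible in the question.
df = pd.DataFrame({
    "ip_address": ["ip_1", "ip_2", "ip_1", "ip_1", "ip_1", "ip_2", "ip_2"],
    "malware_type": ["malware_1", "malware_2", "malware_1", "malware_1",
                     "malware_2", "malware_2", "malware_3"],
})


def most_frequent(values: pd.Series) -> str:
    """Return the value that occurs most often in `values` (hypothetical helper)."""
    # value_counts() counts each distinct value; idxmax() returns the label
    # with the highest count.
    return values.value_counts().idxmax()


# One row per IP; sort=False keeps the IPs in order of first appearance.
result = (
    df.groupby("ip_address", sort=False)["malware_type"]
      .agg(most_frequent)
      .reset_index()
)

print(result)
```

With the sample rows, `result` has one row for ip_1 with malware_1 and one row for ip_2 with malware_2.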