I have 2 dataframes, one of which has supplemental information for some (but not all) of the rows in the other.
import pandas as pd
from pandas import DataFrame as df  # assumed alias; df is used as the constructor below

names = df({'names':['bob','frank','james','tim','ricardo','mike','mark','joan','joe'],
            'position':['dev','dev','dev','sys','sys','sys','sup','sup','sup']})
info = df({'names':['joe','mark','tim','frank'],
           'classification':['thief','thief','good','thief']})
I would like to add the classification column from the info dataframe to the names dataframe. However, when I do combined = pd.merge(names, info), the resulting dataframe is only 4 rows long: every row that has no supplemental info is dropped.
Ideally, the missing classification values would be set to unknown, giving a dataframe where some people are thieves, some are good, and the rest are unknown.
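To make the desired result concrete, here is a minimal sketch of what I am after. It assumes a left join on the names column is the right tool (that is my guess, not something I have confirmed), and it fills the gaps with the string unknown afterwards.

```python
import pandas as pd

names = pd.DataFrame({
    'names': ['bob', 'frank', 'james', 'tim', 'ricardo', 'mike', 'mark', 'joan', 'joe'],
    'position': ['dev', 'dev', 'dev', 'sys', 'sys', 'sys', 'sup', 'sup', 'sup'],
})
info = pd.DataFrame({
    'names': ['joe', 'mark', 'tim', 'frank'],
    'classification': ['thief', 'thief', 'good', 'thief'],
})

# Assumption: how='left' keeps every row of names and only attaches
# matching rows from info; unmatched rows get NaN in classification.
combined = pd.merge(names, info, on='names', how='left')

# Replace the NaN placeholders with the label I actually want.
combined['classification'] = combined['classification'].fillna('unknown')

print(combined)
```

With this data, all 9 rows of names should survive, and anyone not listed in info ends up as unknown.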
EDIT: One of the first answers I received suggested using an outer merge (how="outer"), which seems to do some weird things. Here is a code sample:
names = df({'names':['bob','frank','bob','bob','bob''james','tim','ricardo','mike','mark','joan','joe'],
'position':['dev','dev','dev','dev','dev','dev''sys','sys','sys','sup','sup','sup']})
info = df({'names':['joe','mark','tim','frank','joe','bill'],
'classification':['thief','thief','good','thief','good','thief']})
what = pd.merge(names, info, how="outer")
what.fillna("unknown")
The strange thing is that the output contains a row where the name is "bobjames" and another where the position is "devsys". Also, even though bill does not appear in the names dataframe, he shows up in the resulting dataframe. So what I really need is a way to say: look up each name in the other dataframe and, if there is a match, tack on its columns.
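For reference, a rough sketch of the "lookup" behaviour described above, using a smaller made-up subset of the data. Two things in it are assumptions rather than facts about my real data: when a name appears more than once in info (like joe), the first classification is kept, and the lookup is done with Series.map rather than merge. The first line also shows that Python concatenates adjacent string literals, which would explain "bobjames" and "devsys" if a comma is missing in the lists.

```python
import pandas as pd

# Adjacent string literals are concatenated by Python, so a missing
# comma between 'bob' and 'james' silently yields one element.
assert ['bob' 'james'] == ['bobjames']

# Hypothetical reduced data set for illustration.
names = pd.DataFrame({
    'names': ['bob', 'frank', 'bob', 'james', 'joe'],
    'position': ['dev', 'dev', 'dev', 'dev', 'sup'],
})
info = pd.DataFrame({
    'names': ['joe', 'mark', 'tim', 'frank', 'joe', 'bill'],
    'classification': ['thief', 'thief', 'good', 'thief', 'good', 'thief'],
})

# Assumption: keep the first classification for duplicated names so the
# lookup table has exactly one entry per name.
lookup = (
    info.drop_duplicates(subset='names', keep='first')
        .set_index('names')['classification']
)

# Look each name up; names missing from the table become NaN, then 'unknown'.
result = names.copy()
result['classification'] = result['names'].map(lookup).fillna('unknown')

print(result)
```

Because the lookup only ever reads from info, names that exist only there (such as bill) never appear in the result, and the row count of names is unchanged.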