
Assume I have a pandas DataFrame with two columns, A and B. I'd like to modify this DataFrame (or create a copy) so that B is always NaN whenever A is 0. How would I achieve that?

I tried the following

df['A'==0]['B'] = np.nan

and

df['A'==0]['B'].values.fill(np.nan)

without success.

1 Answer


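You should use .loc, which in pandas accepts a boolean mask for the rows and a label for the column. Your attempts fail because 'A'==0 compares the string 'A' with the integer 0, which evaluates to a plain False rather than a per-row mask.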

For example:

df.loc[df.A==0, 'B'] = np.nan

The df.A==0 expression in the above code creates a boolean Series that selects the rows, and 'B' selects the column. You can also use this to transform a subset of a column, e.g.:

df.loc[df.A==0, 'B'] = df.loc[df.A==0, 'B'] / 2
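A minimal, self-contained sketch of the two assignments above. The sample values are made up for illustration, and B is stored as floats so that it can hold NaN:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data (not from the question)
df = pd.DataFrame({"A": [0, 1, 0, 2], "B": [10.0, 20.0, 30.0, 40.0]})

# Boolean Series: True for the rows where A is 0
mask = df["A"] == 0

# Transform only the selected rows of B (halve them)
df.loc[mask, "B"] = df.loc[mask, "B"] / 2
after_halving = df["B"].tolist()

# Set B to NaN on the same rows; this changes df in place
df.loc[mask, "B"] = np.nan

print(df)
```

Building the mask once and reusing it keeps the row selection identical on both sides of the assignment.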

Note that this assignment modifies df in place and does not return a new DataFrame. If you want to keep the original unchanged, make a copy first and modify the copy.

For example:

df1 = df.copy()
df1.loc[df1.A==0, 'B'] = df1.loc[df1.A==0, 'B'] / 2
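As an alternative sketch for the "create a copy" case (this is not the approach from the answer above): Series.mask replaces values with NaN wherever a condition is True, and DataFrame.assign returns a new DataFrame, so df itself is left untouched. The sample data and the name df1 are assumptions for illustration:

```python
import pandas as pd

# Hypothetical sample data (not from the question)
df = pd.DataFrame({"A": [0, 1, 0, 2], "B": [10.0, 20.0, 30.0, 40.0]})

# mask() yields a new Series with NaN where A is 0;
# assign() returns a new DataFrame with that Series as column B
df1 = df.assign(B=df["B"].mask(df["A"] == 0))

print(df1)   # B is NaN in the rows where A is 0
print(df)    # the original is unchanged
```

This avoids an explicit copy() call, at the cost of being slightly less obvious than the .loc assignment.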

Hope this answer helps.
