
I have a uniform distribution in a pandas dataframe column with a few NaN values I'd like to replace.

Since the data is uniformly distributed, I decided that I would like to fill the null values with random uniform samples drawn from a range of the column's min and max values. I used the following code to get the random uniform sample:

df_copy['ep'] = df_copy['ep'].fillna(value=np.random.uniform(3, 331))

Of course, pd.DataFrame.fillna() with a scalar replaces every NaN with the same value, because np.random.uniform(3, 331) is evaluated once before fillna() is called. I would like each NaN to get a different value. I assume a for loop could do the job, but I am unsure how to write one that handles only the NaN values. Thanks for the help!

1 Answer


No loop is needed. The approach below works on a Series, and therefore on a single DataFrame column such as df_copy['ep']:

Sample Data:

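import numpy as np
import pandas as pd

# float dtype so the Series can hold NaN without an upcast
series = pd.Series(range(100), dtype=float)
series.loc[2] = np.nan
series.loc[10:15] = np.nan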

Solution:

# mask() returns a new Series, so assign the result back
series = series.mask(series.isnull(), np.random.uniform(3, 331, size=series.shape))
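
The same idea can be applied directly to the column from the question. The following is only a sketch under a few assumptions: the frame `df_copy` is made-up sample data, the names `rng`, `missing`, `low` and `high` are illustrative, and the bounds are computed from the column rather than hard-coded as 3 and 331. Unlike the mask() call, it draws one sample per missing entry instead of one per row.

```python
import numpy as np
import pandas as pd

# Made-up sample data using the question's column name 'ep' (assumption).
df_copy = pd.DataFrame({"ep": np.arange(3, 103, dtype=float)})
df_copy.loc[[2, 10, 11, 12], "ep"] = np.nan

# Seeded generator so the example is reproducible; drop the seed in real use.
rng = np.random.default_rng(seed=0)

# Boolean mask of the rows that need a value.
missing = df_copy["ep"].isna()

# min() and max() skip NaN by default, so these are the observed bounds.
low, high = df_copy["ep"].min(), df_copy["ep"].max()

# Draw exactly one uniform sample per missing entry and assign by mask.
df_copy.loc[missing, "ep"] = rng.uniform(low, high, size=int(missing.sum()))
```

Each NaN gets its own independent draw, and the existing values are left untouched.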
