0 votes
1 view
in Data Science by (17.6k points)

i tried to create a pandas dataframe like below

import pandas as pd

import numpy as np

pd.set_option('precision', 20)

a = pd.DataFrame([10212764634169927, 10212764634169927, 10212764634169927], columns=['counts'], dtype=np.float64)

a returns as:

             counts

0  10212764634169928.0

1  10212764634169928.0

2  10212764634169928.0

So, my question is, why is the last digit modified?

Thanks in advance!

EDIT: i understand it has to do with the dtype. But why +1 to the last digit specifically? If i were to use 10212764634169926 instead, nothing happens, the results keeps to 10212764634169926. The same is with 10212764634169928, it returns 10212764634169928

1 Answer

0 votes
by (38.4k points)

The issue is related to float numbers and not with pandas. If you try the following:

The below code will give you an idea about how float numbers are stored in memory through the exponential notation.

float(10212764634169927)

1.0212764634169928e+16

For showing the demo of float32 format that would return  more difference, following test is done on the given values.

a.astype('float64')

                counts

0  10212764634169928.0

1  10212764634169928.0

2  10212764634169928.0

a.astype('float32')

                counts

0  10212764362473472.0

1  10212764362473472.0

2  10212764362473472.0

If you wish to learn more about how to use python for data science, then go through data science python programming course by Intellipaat for more insights.

Welcome to Intellipaat Community. Get your technical queries answered by top developers !


Categories

...