
I tried to create a pandas DataFrame as shown below:

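import pandas as pd
import numpy as np

# The full option name is used because the short form 'precision' is ambiguous in recent pandas versions.
pd.set_option('display.precision', 20)

a = pd.DataFrame([10212764634169927, 10212764634169927, 10212764634169927], columns=['counts'], dtype=np.float64)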

Displaying a gives:

                counts
0  10212764634169928.0
1  10212764634169928.0
2  10212764634169928.0

So, my question is, why is the last digit modified?

Thanks in advance!

EDIT: I understand it has to do with the dtype. But why is exactly 1 added to the last digit? If I use 10212764634169926 instead, nothing changes and the result stays 10212764634169926. The same goes for 10212764634169928, which comes back as 10212764634169928.

1 Answer


The issue comes from floating-point representation, not from pandas. Converting the same integer to a plain Python float (a 64-bit double) shows the identical change, printed here in exponential notation:

float(10212764634169927)

1.0212764634169928e+16
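As for why the last digit goes up by exactly one: a float64 has 53 significand bits, so between 2**53 and 2**54 only even integers can be represented. An odd integer in that range sits exactly halfway between two representable values, and the default IEEE 754 rule (round half to even) picks the neighbour whose significand is even. The following is a minimal sketch using only built-in Python; the helper name round_trip is made up for illustration.

```python
# Integers around the value from the question.
candidates = [
    10212764634169926,
    10212764634169927,
    10212764634169928,
    10212764634169929,
]

# All of them lie in [2**53, 2**54), where consecutive float64 values are 2 apart.
assert all(2**53 <= n < 2**54 for n in candidates)


def round_trip(n):
    """Convert an int to a float64 and back, to reveal the value actually stored."""
    return int(float(n))


results = {n: round_trip(n) for n in candidates}
for n, stored in results.items():
    print(n, "->", stored)

# 10212764634169926 -> 10212764634169926  (even, stored exactly)
# 10212764634169927 -> 10212764634169928  (tie, rounds to the even significand)
# 10212764634169928 -> 10212764634169928  (even, stored exactly)
# 10212764634169929 -> 10212764634169928  (tie, rounds down for the same reason)
```

10212764634169926 and 10212764634169928 are even, so they are stored exactly, which matches what the question observed.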

The effect is much larger with float32, which has fewer significand bits. Casting the same column to each dtype shows the difference:

a.astype('float64')

                counts
0  10212764634169928.0
1  10212764634169928.0
2  10212764634169928.0

a.astype('float32')

                counts
0  10212764362473472.0
1  10212764362473472.0
2  10212764362473472.0
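The size of the float32 error can be checked by hand. A float32 keeps 24 significand bits, so for values between 2**53 and 2**54 neighbouring float32 values should be 2**(53 - 23) = 2**30 apart. The sketch below emulates the float32 cast with the standard-library struct module so that it runs without NumPy; the helper name to_float32 is hypothetical, and the assumption is that this round trip stores the same value as the astype('float32') call above.

```python
import struct


def to_float32(x):
    """Round a Python float (a float64) to the nearest float32, returned as a float."""
    return struct.unpack("f", struct.pack("f", x))[0]


n = 10212764634169927
as_f64 = float(n)            # nearest float64
as_f32 = to_float32(as_f64)  # nearest float32

# Spacing of float32 values in [2**53, 2**54).
f32_step = 2 ** 30

print(int(as_f64))               # 10212764634169928
print(int(as_f32))               # 10212764362473472
print(int(as_f32) % f32_step)    # 0, the stored value is a multiple of 2**30
print(n - int(as_f32))           # 271696455, the error introduced by float32
```

If the exact integer must be preserved, an integer dtype such as int64 avoids the rounding entirely, since the value fits well within the 64-bit integer range.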
