# Creating a pandas DataFrame with dtype float64 changes the last digit of its entries (fairly large numbers)

I tried to create a pandas DataFrame as shown below:

import pandas as pd

import numpy as np

pd.set_option('display.precision', 20)

a = pd.DataFrame([10212764634169927, 10212764634169927, 10212764634169927], columns=['counts'], dtype=np.float64)

Displaying a gives:

counts

0  10212764634169928.0

1  10212764634169928.0

2  10212764634169928.0

So, my question is, why is the last digit modified?

Thanks in advance!

EDIT: I understand it has to do with the dtype. But why is 1 added to the last digit specifically? If I use 10212764634169926 instead, nothing happens and the result stays 10212764634169926. The same goes for 10212764634169928, which is returned as 10212764634169928.

## 1 Answer

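The issue comes from floating-point representation, not from pandas. A float64 has a 53-bit significand, so above 2^53 (about 9.007e15) it can represent only even integers. An odd value such as 10212764634169927 is therefore rounded to a representable neighbor, while the even values 10212764634169926 and 10212764634169928 are stored exactly. Converting the integer to a plain Python float shows the same rounding, printed in exponential notation: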

float(10212764634169927)

1.0212764634169928e+16
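To see why the last digit moves for this value but not for its even neighbors, the rounding can be checked with plain Python, no pandas needed. The following is a minimal sketch; the helper name to_float64_and_back is made up for illustration. It relies on the fact that between 2**53 and 2**54 a float64 can hold only even integers, and that an odd integer, sitting exactly halfway between two candidates, goes to the one whose significand is even (round-half-to-even).

```python
# The value from the question lies between 2**53 and 2**54,
# where adjacent float64 values are exactly 2 apart.
n = 10212764634169927
assert 2**53 < n < 2**54


def to_float64_and_back(value: int) -> int:
    """Round an int to the nearest float64, then convert the result back to int.

    Hypothetical helper for illustration only.
    """
    return int(float(value))


# Even integers survive unchanged; odd ones are ties and go to the
# neighbor whose significand (value // 2 here) is even.
for candidate in range(n - 1, n + 5):
    print(candidate, "->", to_float64_and_back(candidate))

# 10212764634169926 -> 10212764634169926
# 10212764634169927 -> 10212764634169928
# 10212764634169928 -> 10212764634169928
# 10212764634169929 -> 10212764634169928
# 10212764634169930 -> 10212764634169930
# 10212764634169931 -> 10212764634169932
```

So the +1 is not a fixed offset: 10212764634169929 rounds down by one, because the tie is resolved toward the even significand rather than always upward.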

For comparison, float32 has even less precision, so the same values end up much further from the original. Casting the DataFrame to each type shows the difference:

a.astype('float64')

counts

0  10212764634169928.0

1  10212764634169928.0

2  10212764634169928.0

a.astype('float32')

counts

0  10212764362473472.0

1  10212764362473472.0

2  10212764362473472.0
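The size of the error in each case follows from the gap between adjacent representable values, which NumPy exposes through np.spacing. The sketch below prints that gap for both float types and then, assuming the counts are always whole numbers that fit in 64 bits, shows one possible way to keep them exact by using an integer dtype instead. The variable name exact is illustrative.

```python
import numpy as np
import pandas as pd

n = 10212764634169927

# Distance to the next representable value near n for each float type.
gap64 = float(np.spacing(np.float64(n)))
gap32 = float(np.spacing(np.float32(n)))
print(gap64)  # 2.0
print(gap32 == 2**30)  # True: float32 values near n are 1073741824 apart

# Assumption: the data are integers within the int64 range, so an
# integer dtype can store them without any rounding.
exact = pd.DataFrame([n, n, n], columns=["counts"], dtype=np.int64)
print(int(exact["counts"].iloc[0]) == n)  # True
```

If fractional values are also needed at this magnitude, float64 cannot represent them, and a different representation (for example Python's decimal module in an object column) would have to be considered.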

