I'm doing some statistical work using Python's pandas, and I have the following code to print out a description of the data (mean, median, count, and so on):

import pandas

data = pandas.read_csv(input_file)  # input_file: path to the CSV file
print(data.describe())

However, my data is huge (around 4 million rows) and each row holds very little data. So inevitably the count is huge and the mean is tiny, and pandas prints them in scientific notation.

I just want to print these numbers in full for usability and readability; for example, it should be 4393476 rather than 4.393476e+06. I have googled around, and the most I could find is "Display a float with two decimal places in Python" and some similar posts. But that only works if I already have the numbers in a variable, which is not my case. The numbers are produced by the describe() function, so I don't know in advance what numbers I will get.

Can anyone please help me? Thanks in advance.

1 Answer

Suppose you have a DataFrame like the one below. I checked the docs, and you should probably use the pandas.set_option API for this:

In [13]: df
Out[13]:
              a             b             c
0  4.405544e+08  1.425305e+08  6.387200e+08
1  8.792502e+08  7.135909e+08  4.652605e+07
2  5.074937e+08  3.008761e+08  1.781351e+08
3  1.188494e+07  7.926714e+08  9.485948e+08
4  6.071372e+08  3.236949e+08  4.464244e+08
5  1.744240e+08  4.062852e+08  4.456160e+08
6  7.622656e+07  9.790510e+08  7.587101e+08
7  8.762620e+08  1.298574e+08  4.487193e+08
8  6.262644e+08  4.648143e+08  5.947500e+08
9  5.951188e+08  9.744804e+08  8.572475e+08

In [14]: pd.set_option('float_format', '{:f}'.format)

In [15]: df
Out[15]:
                 a                b                c
0 440554429.333866 142530512.999182 638719977.824965
1 879250168.522411 713590875.479215  46526045.819487
2 507493741.709532 300876106.387427 178135140.583541
3  11884941.851962 792671390.499431 948594814.816647
4 607137206.305609 323694879.619369 446424361.522071
5 174424035.448168 406285189.907148 445616045.754137
6  76226556.685384 979050957.963583 758710090.127867
7 876261954.607558 129857447.076183 448719292.453509
8 626264394.999419 464814260.796770 594750038.747595
9 595118819.308896 974480400.272515 857247528.610996

In [16]: df.describe()
Out[16]:
                     a                b                c
count        10.000000        10.000000        10.000000
mean  479461624.877280 522785202.100082 536344333.626082
std   306428177.277935 320806568.078629 284507176.411675
min    11884941.851962 129857447.076183  46526045.819487
25%   240956633.919592 306580799.695412 445818124.696121
50%   551306280.509214 435549725.351959 521734665.600552
75%   621482597.825966 772901261.744377 728712562.052142
max   879250168.522411 979050957.963583 948594814.816647
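The session above can be condensed into a short script. This is only a minimal sketch: the sample values are hypothetical stand-ins for the CSV in the question, and it uses the full option name display.float_format rather than the short form shown in the session. It also resets the option afterwards, which the session above does not do.

```python
import pandas as pd

# Hypothetical sample data standing in for the large CSV from the question.
df = pd.DataFrame({"a": [4393476.0, 12345678.9], "b": [0.5, 1.5]})

# Format every float pandas displays as fixed-point (six decimals by default).
pd.set_option("display.float_format", "{:f}".format)

# to_string() renders the same text that print(df.describe()) would show.
described = df.describe().to_string()
print(described)

# Restore the default so later output is unaffected.
pd.reset_option("display.float_format")
```

Because set_option changes a global display setting, calling reset_option when you are done is a reasonable habit in longer scripts or notebooks.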

In [7]: df
Out[7]:
              a             b             c
0  4.405544e+08  1.425305e+08  6.387200e+08
1  8.792502e+08  7.135909e+08  4.652605e+07
2  5.074937e+08  3.008761e+08  1.781351e+08
3  1.188494e+07  7.926714e+08  9.485948e+08
4  6.071372e+08  3.236949e+08  4.464244e+08
5  1.744240e+08  4.062852e+08  4.456160e+08
6  7.622656e+07  9.790510e+08  7.587101e+08
7  8.762620e+08  1.298574e+08  4.487193e+08
8  6.262644e+08  4.648143e+08  5.947500e+08
9  5.951188e+08  9.744804e+08  8.572475e+08

In [8]: df.describe()
Out[8]:
                  a             b             c
count  1.000000e+01  1.000000e+01  1.000000e+01
mean   4.794616e+08  5.227852e+08  5.363443e+08
std    3.064282e+08  3.208066e+08  2.845072e+08
min    1.188494e+07  1.298574e+08  4.652605e+07
25%    2.409566e+08  3.065808e+08  4.458181e+08
50%    5.513063e+08  4.355497e+08  5.217347e+08
75%    6.214826e+08  7.729013e+08  7.287126e+08
max    8.792502e+08  9.790510e+08  9.485948e+08

You need to adjust the pandas.options.display.float_format attribute. Note that my code uses import pandas as pd. A quick fix is something like this:

In [29]: pd.options.display.float_format = "{:.2f}".format

In [10]: df
Out[10]:
             a            b            c
0 440554429.33 142530513.00 638719977.82
1 879250168.52 713590875.48  46526045.82
2 507493741.71 300876106.39 178135140.58
3  11884941.85 792671390.50 948594814.82
4 607137206.31 323694879.62 446424361.52
5 174424035.45 406285189.91 445616045.75
6  76226556.69 979050957.96 758710090.13
7 876261954.61 129857447.08 448719292.45
8 626264395.00 464814260.80 594750038.75
9 595118819.31 974480400.27 857247528.61

In [11]: df.describe()
Out[11]:
                 a            b            c
count        10.00        10.00        10.00
mean  479461624.88 522785202.10 536344333.63
std   306428177.28 320806568.08 284507176.41
min    11884941.85 129857447.08  46526045.82
25%   240956633.92 306580799.70 445818124.70
50%   551306280.51 435549725.35 521734665.60
75%   621482597.83 772901261.74 728712562.05
max   879250168.52 979050957.96 948594814.82
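If you would rather not change the display setting for the whole session, one option is to scope it with pd.option_context. This is a variation on the approach above, not part of the original answer, and the three sample values are simply taken from column a of the example DataFrame.

```python
import pandas as pd

# Three values borrowed from column "a" of the example DataFrame.
df = pd.DataFrame({"a": [440554429.333866, 879250168.522411, 507493741.709532]})

# The two-decimal format applies only inside the with-block.
with pd.option_context("display.float_format", "{:.2f}".format):
    scoped = df.describe().to_string()

print(scoped)
```

Once the block exits, pandas returns to whatever float format was in effect before, so other output in the same session is unchanged.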
