I have a pandas.DataFrame that I wish to export to a CSV file. However, pandas writes some of the values as floats instead of ints, and I could not find how to change this behavior.

Building a data frame:

df = pandas.DataFrame(columns=['a','b','c','d'], index=['x','y','z'], dtype=int)
x = pandas.Series([10,10,10], index=['a','b','d'], dtype=int)
y = pandas.Series([1,5,2,3], index=['a','b','c','d'], dtype=int)
z = pandas.Series([1,2,3,4], index=['a','b','c','d'], dtype=int)
df.loc['x']=x; df.loc['y']=y; df.loc['z']=z

View it:

>>> df
    a   b    c   d
x  10  10  NaN  10
y   1   5    2   3
z   1   2    3   4

Export it:

>>> df.to_csv('test.csv', sep='\t', na_rep='0')
>>> for line in open('test.csv'): print(line.strip('\n'))
        a       b       c       d
x       10.0    10.0    0       10.0
y       1       5       2       3
z       1       2       3       4

Why do the tens have a trailing .0?
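The trailing .0 most likely comes from the missing value in row x. As a minimal sketch, assuming that assigning `x` to `df.loc['x']` aligns the Series to the frame's four columns much like `reindex` does (the name `aligned` is only illustrative), the dtype change can be reproduced in isolation:

```python
import pandas as pd

# Row x from the question: it has no entry for column 'c'.
x = pd.Series([10, 10, 10], index=['a', 'b', 'd'], dtype=int)

# Aligning it to all four columns creates a missing value in 'c'.
aligned = x.reindex(['a', 'b', 'c', 'd'])

print(x.dtype)           # an integer dtype
print(aligned.dtype)     # float64, because NaN cannot be stored in an integer dtype
print(aligned.tolist())  # the tens are now floats, with NaN in position 'c'
```

Rows y and z have a value for every column, so nothing forces them to become floats.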

Sure, I could just stick this function into my pipeline to reconvert the whole CSV file, but it seems unnecessary:

def lines_as_integer(path):
    # Yield the header line unchanged, then every data row with its values cast to int
    with open(path) as handle:
        yield next(handle)
        for line in handle:
            fields = line.split()
            label = fields[0]
            values = [int(float(v)) for v in fields[1:]]
            yield label + '\t' + '\t'.join(map(str, values)) + '\n'

with open(path_table_int, 'w') as handle:
    handle.writelines(lines_as_integer(path_table_float))

1 Answer


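The missing value in row x is the cause: NaN is a floating-point value, so the data stored alongside it ends up as floats too. Fill the missing values and cast the frame to int before exporting: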

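import pandas

# data holds the table values; it is not defined in this answer
df = pandas.DataFrame(data, columns=['a','b','c','d'], index=['x','y','z'])
# Replace the missing values so that every cell can be an integer
df = df.fillna(0)
# Cast all columns to int
df = df.astype(int)
df.to_csv('test.csv', sep='\t')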
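The answer leaves `data` undefined, so here is a self-contained sketch of the same approach. The contents of `data` are an assumption based on the table in the question (with `None` for the missing cell), and the in-memory `buffer` stands in for 'test.csv' only so the result can be inspected without touching the disk:

```python
import io
import pandas as pd

# Assumed values, copied from the table in the question; None marks the missing cell.
data = [[10, 10, None, 10],
        [1, 5, 2, 3],
        [1, 2, 3, 4]]
df = pd.DataFrame(data, columns=['a', 'b', 'c', 'd'], index=['x', 'y', 'z'])

# Fill the gap first, then cast every column to an integer dtype.
df = df.fillna(0).astype(int)

# Write tab-separated output to a buffer instead of a file.
buffer = io.StringIO()
df.to_csv(buffer, sep='\t')
csv_text = buffer.getvalue()
print(csv_text)
```

The order matters: calling `astype(int)` while a NaN is still present raises an error, which is why `fillna(0)` comes first. Note that this replaces the missing value with a real 0 in the frame, rather than only in the written file as `na_rep='0'` does.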
