Let's say I have a .dat file with millions of rows and 12 columns. For some calculations, I need to divide columns 2, 3, and 4 by column 1. Do I need to delete the other, unwanted columns before loading the .dat file?

Take data.dat as an example .dat file.

Since I'm new to Python, I would appreciate an explanation of how to open the file, read it, and do the calculation.

Here is the code I have so far:

from sys import argv

import pandas as pd

script, filename = argv

txt = open(filename)

print "Here's your file %r:" % filename

print txt.read()

def your_func(row):

    return row['x-momentum'] / row['mass']

columns_to_keep = ['mass', 'x-momentum']

dataframe = pd.read_csv('~/Pictures', delimiter="," , usecols=columns_to_keep)

dataframe['new_column'] = dataframe.apply(your_func, axis=1)

The error I get:

Traceback (most recent call last):

  File "flash.py", line 18, in <module>

    dataframe = pd.read_csv('~/Pictures', delimiter="," , usecols=columns_to_keep)

  File "/home/trina/anaconda2/lib/python2.7/site-packages/pandas/io/parsers.py", line 529, in parser_f

    return _read(filepath_or_buffer, kwds)

  File "/home/trina/anaconda2/lib/python2.7/site-packages/pandas/io/parsers.py", line 295, in _read

    parser = TextFileReader(filepath_or_buffer, **kwds)

  File "/home/trina/anaconda2/lib/python2.7/site-packages/pandas/io/parsers.py", line 612, in __init__

    self._make_engine(self.engine)

  File "/home/trina/anaconda2/lib/python2.7/site-packages/pandas/io/parsers.py", line 747, in _make_engine

    self._engine = CParserWrapper(self.f, **self.options)

  File "/home/trina/anaconda2/lib/python2.7/site-packages/pandas/io/parsers.py", line 1119, in __init__

    self._reader = _parser.TextReader(src, **kwds)

  File "pandas/parser.pyx", line 518, in pandas.parser.TextReader.__cinit__ (pandas/parser.c:5030)

ValueError: No columns to parse from file

1 Answer

I had a look at your "flash.dat" file. Before processing it, you need to do some cleanup.

The code below converts it to a CSV file:

import csv

# read flash.dat to a list of lists

datContent = [i.strip().split() for i in open("./flash.dat").readlines()]

# write it as a new CSV file

with open("./flash.csv", "wb") as f:

    writer = csv.writer(f)

    writer.writerows(datContent)
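The "wb" mode above is the Python 2 idiom. In Python 3, csv.writer expects a file opened in text mode with newline="". As a rough sketch only, the same conversion could be written for Python 3 as below. The helper name dat_to_csv is made up for illustration, and unlike the code above it processes one line at a time (so a file with millions of rows is not held in memory) and skips blank lines.

```python
import csv


def dat_to_csv(dat_path, csv_path):
    """Convert a whitespace-delimited text file to CSV, one line at a time.

    Hypothetical helper for illustration; it assumes fields are separated
    by whitespace and contain no embedded spaces.
    """
    # newline="" lets the csv module control line endings (Python 3).
    with open(dat_path) as src, open(csv_path, "w", newline="") as dst:
        writer = csv.writer(dst)
        for line in src:
            fields = line.split()  # split on any run of whitespace
            if fields:  # skip blank lines
                writer.writerow(fields)
```

It would be called as dat_to_csv("./flash.dat", "./flash.csv").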

Now use pandas to compute the new column:

import pandas as pd

def your_func(row):

    return row['x-momentum'] / row['mass']

columns_to_keep = ['#time', 'x-momentum', 'mass']

dataframe = pd.read_csv("./flash.csv", usecols=columns_to_keep)

dataframe['new_column'] = dataframe.apply(your_func, axis=1)

print(dataframe)
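On the original question: the unwanted columns do not have to be deleted beforehand, because usecols tells read_csv which columns to load. If the .dat file is cleanly whitespace-delimited, pandas may also be able to read it directly with sep=r"\s+", without the CSV step. The sketch below assumes that layout and uses a small made-up in-memory sample; the y-momentum column and all of the values are hypothetical. It divides several columns by mass in one vectorized call, which is generally faster than apply(..., axis=1) on millions of rows.

```python
import io

import pandas as pd

# Made-up sample standing in for a whitespace-delimited .dat file.
# The column names and values here are assumptions for illustration.
sample = io.StringIO(
    "#time mass x-momentum y-momentum\n"
    "0.0 2.0 4.0 6.0\n"
    "0.1 4.0 2.0 8.0\n"
)

# Load only the columns needed; the others are never parsed into the frame.
columns_to_keep = ["mass", "x-momentum", "y-momentum"]
dataframe = pd.read_csv(sample, sep=r"\s+", usecols=columns_to_keep)

# Divide each momentum column by mass, row by row, in one vectorized call.
ratios = dataframe[["x-momentum", "y-momentum"]].div(dataframe["mass"], axis=0)

# Attach the results as new columns named e.g. "x-momentum/mass".
dataframe = dataframe.join(ratios.add_suffix("/mass"))

print(dataframe)
```

For a real file, the io.StringIO sample would be replaced by the file path, for example "./flash.dat".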
