0 votes
2 views
in Data Science by (18.4k points)

I have a data file that looks like this:

data.txt

user,activity,timestamp,x-axis,y-axis,z-axis
0,33,Jogging,49105962326000,-0.6946376999999999,12.680544,0.50395286;
1,33,Jogging,49106062271000,5.012288,11.264028,0.95342433;
2,33,Jogging,49106112167000,4.903325,10.882658000000001,-0.08172209;
3,33,Jogging,49106222305000,-0.61291564,18.496431,3.0237172;

As you can see, every data row ends with a semicolon, so when I read the file into pandas, the last column (z-axis) is inferred as type object because each of its values ends with a semicolon.

import pandas as pd

df = pd.read_csv('data.txt')
df

   user activity       timestamp    x-axis     y-axis       z-axis
0    33  Jogging  49105962326000 -0.694638  12.680544  0.50395286;
1    33  Jogging  49106062271000  5.012288  11.264028  0.95342433;
2    33  Jogging  49106112167000  4.903325  10.882658 -0.08172209;
3    33  Jogging  49106222305000 -0.612916  18.496431   3.0237172;

How do I make pandas ignore the semicolon?

1 Answer

0 votes
by (36.8k points)

The problem with the text file is that its content is mixed: the data rows end with a semicolon as a termination character, but the header line does not.

If you change the first line by adding a semicolon to the end of it, reading the file is quite simple:

pd.read_csv("data.txt", lineterminator=";")
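If you would rather not edit the file, another option is to read it as it is and strip the semicolon from the last column afterwards. This is only a minimal sketch, not part of the original answer: io.StringIO stands in for data.txt (an assumption made so the example runs on its own), and the rows are copied from the question.

```python
import io

import pandas as pd

# Sample rows copied from the question; io.StringIO stands in for data.txt.
raw = """user,activity,timestamp,x-axis,y-axis,z-axis
0,33,Jogging,49105962326000,-0.6946376999999999,12.680544,0.50395286;
1,33,Jogging,49106062271000,5.012288,11.264028,0.95342433;
2,33,Jogging,49106112167000,4.903325,10.882658000000001,-0.08172209;
3,33,Jogging,49106222305000,-0.61291564,18.496431,3.0237172;
"""

# The header has one name fewer than the data rows have fields,
# so pandas uses the first field of each row as the index.
df = pd.read_csv(io.StringIO(raw))

# The trailing ';' makes z-axis a string column: strip it, then convert.
df["z-axis"] = df["z-axis"].str.rstrip(";").astype(float)

print(df.dtypes)
```

With a real file, replace io.StringIO(raw) with the path 'data.txt'. The cleanup step only touches the z-axis column, so the other columns keep the types pandas inferred for them.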
