in Machine Learning by (19k points)

I have the code below and want to remove the column 'timestamp' from the data loaded from the file u.data, but I can't. It raises the error:

"ValueError: labels ['timestamp'] not contained in axis"

How can I correct it?

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
plt.rc("font", size=14)
from sklearn.linear_model import LinearRegression
from sklearn.linear_model import Ridge
from sklearn.cross_validation import KFold
from sklearn.cross_validation import train_test_split

data = pd.read_table('u.data')
data.columns=['userID', 'itemID','rating', 'timestamp']
data.drop('timestamp', axis=1)

N = len(data)
print data.shape
print list(data.columns)
print data.head(10)

1 Answer

by (33.2k points)

The main cause of this problem is the formatting of the u.data file. When you insert headers, they must be separated in exactly the same way as the values in each row of data. For example, if spaces separate the values in a row, you should not use tabs between the header names. Otherwise the parser cannot split the header into the same columns as the data, so no column named 'timestamp' exists to drop.
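To make the mismatch concrete, here is a minimal sketch. The three-line sample is hypothetical (it is not the real u.data), and it assumes a space-separated header over tab-separated rows. It is written for Python 3 and a recent pandas, where the failed drop surfaces as a KeyError rather than the ValueError shown in the question.

```python
import io
import pandas as pd

# Hypothetical sample: the header uses spaces, the rows use tabs.
raw = (
    "userID itemID rating timestamp\n"
    "196\t242\t3\t881250949\n"
    "186\t302\t3\t891717742\n"
)

# Parsed with a tab separator, the header line is never split,
# so it is read as a single column name.
bad = pd.read_csv(io.StringIO(raw), sep="\t")
bad_columns = list(bad.columns)
print(bad_columns)

# Dropping 'timestamp' fails because no column has exactly that name.
try:
    bad.drop("timestamp", axis=1)
    drop_failed = False
except (KeyError, ValueError) as exc:
    drop_failed = True
    print(type(exc).__name__, exc)

# Once the header uses the same separator as the rows, the drop works.
good = pd.read_csv(io.StringIO(raw.replace(" ", "\t")), sep="\t")
good = good.drop("timestamp", axis=1)
print(list(good.columns))
```

Printing list(data.columns) right after loading, as the question already does, is a quick way to check whether the header was split the way you expected.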

You should modify your data file: add the headers to u.data and separate them with exactly the same whitespace that is used between the items of a row. I suggest editing the file in Sublime Text, because Notepad and Notepad++ sometimes do not work for this.
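If you would rather not edit the file, a different approach (not the one described above) is to pass the column names to pandas when reading. The sketch below assumes u.data is tab-separated and has no header row. The two sample rows are hypothetical stand-ins so that the snippet runs on its own. With the real file you would pass 'u.data' in place of the StringIO object.

```python
import io
import pandas as pd

# Hypothetical stand-in for u.data: tab-separated, no header row (assumption).
sample = "196\t242\t3\t881250949\n186\t302\t3\t891717742\n"

columns = ["userID", "itemID", "rating", "timestamp"]

# header=None tells pandas the first line is data; names supplies the headers.
data = pd.read_csv(io.StringIO(sample), sep="\t", header=None, names=columns)

# drop returns a new DataFrame, so assign the result back.
data = data.drop("timestamp", axis=1)

print(data.shape)
print(list(data.columns))
```

Note that drop does not change the DataFrame in place by default, so calling it without assigning the result leaves 'timestamp' in data.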

Hope this answer helps.
