
I am reading a file in Python using pandas and then converting it to a NumPy array. The file has 11303402 rows x 10 columns. I need to split the data for cross-validation, so I sliced it into an 11303402 x 9 array of examples and an 11303402 x 1 array of labels. The code is as follows:

import numpy as np

import pandas as pd

tdata = pd.read_csv('train.csv')

tdata.columns = ['Arrival_Time', 'Creation_Time', 'x', 'y', 'z', 'User', 'Model', 'Device', 'sensor', 'gt']

User_Data = np.array(tdata) 

features = User_Data[:,0:9] 

labels = User_Data[:,9:10]

The error comes in the following code:

classes=np.unique(labels) 

idx=labels==classes[0] 

Yt=labels[idx] 

Xt=features[idx,:]

On the line:

Xt=features[idx,:]

it raises the error 'too many indices for array'.

The shapes of all 3 data sets are:

print(np.shape(tdata))  # (11303402, 10)

print(np.shape(features))  # (11303402, 9)

print(np.shape(labels))  # (11303402, 1)

If anyone knows the problem, please help.

1 Answer


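The problem is that labels has shape (11303402, 1), so idx=labels==classes[0] is a 2-D boolean mask of the same shape. A 2-D mask already indexes both axes of features, so the extra : in features[idx,:] is one index too many. Selecting the first column of idx gives a 1-D mask over the rows, which removes the error: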

Xt=features[idx[:,0],:]
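
To see the behaviour on something small, here is a minimal sketch. It uses a synthetic 6-row array as a stand-in for train.csv (the values and the random seed are assumptions for illustration only); the variable names follow the question.

```python
import numpy as np

# Synthetic stand-in for the real data (assumption): 6 rows,
# 9 feature columns plus 1 label column.
rng = np.random.default_rng(0)
label_column = np.array([[0], [1], [0], [2], [1], [0]])
User_Data = np.hstack([rng.random((6, 9)), label_column])

features = User_Data[:, 0:9]   # shape (6, 9)
labels = User_Data[:, 9:10]    # shape (6, 1): the 9:10 slice keeps the column axis

classes = np.unique(labels)
idx = labels == classes[0]     # boolean mask with shape (6, 1), not (6,)

# Reproduce the failure: the 2-D mask plus ":" asks for more axes
# than the 2-D features array has.
try:
    features[idx, :]
    failed = False
except IndexError:
    failed = True

# Fix from the answer: use the first column of the mask, which is 1-D.
Xt = features[idx[:, 0], :]    # rows whose label equals classes[0]
Yt = labels[idx]               # 1-D array of the matching labels

# Equivalent alternative: flatten the mask before indexing.
Xt_alt = features[idx.ravel(), :]

print(failed, Xt.shape, Yt.shape)
```

Another option, if the column axis on labels is not needed, is to slice the labels as User_Data[:, 9] so that labels and idx are 1-D from the start and features[idx, :] works unchanged.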
