+2 votes
1 view
in Machine Learning by (33.1k points)

I am trying to encode some data to feed into a Machine Learning model using the following code:

import numpy as np
import pandas as pd
import matplotlib.pyplot as py

Dataset = pd.read_csv('filename.csv', sep = ',')
X = Dataset.iloc[:,:-1].values
Y = Dataset.iloc[:,18].values

from sklearn.preprocessing import LabelEncoder, OneHotEncoder
labelencoder_X = LabelEncoder()
X[:, 0] = labelencoder_X.fit_transform(X[:, 0])
onehotencoder = OneHotEncoder(categorical_features = [0])
X = onehotencoder.fit_transform(X).toarray()

However, I am getting an error that reads:

runfile('C:/Users/name/Desktop/Machine Learning/Data preprocessing      template.py', wdir='C:/Users/taylorr2/Desktop/Machine Learning')
Traceback (most recent call last):
  File "<ipython-input-141-a5d1cd02c2df>", line 1, in <module>
    runfile('C:/Users/name/Desktop/Machine Learning/Data preprocessing  template.py', wdir='C:/Users/taylorr2/Desktop/Machine Learning')
IndexError: single positional indexer is out-of-bounds

I read a question on here regarding the same error and have tried the following:

import numpy as np
import pandas as pd
import matplotlib.pyplot as py

Dataset = pd.read_csv('filename.csv', sep = ',')
table = Dataset.find(id='AlerId')
rows = table.find_all('tr')[1:]
data = [[cell.text for cell in row.find_all('td')] for row in rows]
Dataset1 = pd.DataFrame(data=data, columns=columns)
X = Dataset1.iloc[:,:-1].values
Y = Dataset1.iloc[:,18].values

from sklearn.preprocessing import LabelEncoder, OneHotEncoder
labelencoder_X = LabelEncoder()
X[:, 0] = labelencoder_X.fit_transform(X[:, 0])
onehotencoder = OneHotEncoder(categorical_features = [0])
X = onehotencoder.fit_transform(X).toarray()

However, I think this has just confused me even more.

Any suggestions?

3 Answers

+6 votes
by (33.1k points)

The code you shared suggests a misunderstanding of how iloc works. In Dataset.iloc[rows, columns], the value before the comma selects rows by position and the value after the comma selects columns by position. A bare colon (:) means "all of them".
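To make the row/column split concrete, here is a minimal sketch. The small DataFrame and its column names are made up for illustration, since the contents of filename.csv are not shown in the question.

```python
import pandas as pd

# Hypothetical 3-row, 4-column frame standing in for the CSV in the question.
df = pd.DataFrame({
    "country": ["France", "Spain", "Germany"],
    "age": [44, 27, 30],
    "salary": [72000, 48000, 54000],
    "purchased": ["No", "Yes", "No"],
})

# iloc[rows, columns]: both selectors are integer positions starting at 0.
all_rows_first_col = df.iloc[:, 0]   # every row, column at position 0
first_row_all_cols = df.iloc[0, :]   # row at position 0, every column
features = df.iloc[:, :-1]           # every row, all columns except the last
```

So the `:` before the comma in your code means "all rows", and the 18 after the comma is a column position.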

Your dataset most likely has fewer than 19 columns, but iloc[:, 18] asks for the column at position 18, which is the 19th column because positions start at 0. That is why pandas raises the IndexError.
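The error can be reproduced on a small frame. This is only a sketch: the 18-column DataFrame below is an assumption chosen to trigger the same message, not your actual data.

```python
import pandas as pd

# Hypothetical frame with 18 columns, i.e. valid positions 0 through 17.
df = pd.DataFrame([[0] * 18], columns=[f"col{i}" for i in range(18)])

n_columns = df.shape[1]          # number of columns
last_valid = n_columns - 1       # highest position iloc will accept

try:
    df.iloc[:, 18]               # position 18 would be a 19th column
    error_message = None
except IndexError as exc:
    error_message = str(exc)     # the same message as in the traceback

print(n_columns, last_valid, error_message)
```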

Change the index so that it matches the number of columns in your dataset.

The line that causes the error is:

Y = Dataset.iloc[:,18].values
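One possible fix, sketched on made-up data: check the shape first, then select the target by counting from the end. This assumes the target is the last column, which seems likely because X already takes everything except the last column, but only you can confirm that for your file.

```python
import pandas as pd

# Hypothetical stand-in for pd.read_csv('filename.csv'); column names are invented.
Dataset = pd.DataFrame({
    "country": ["France", "Spain", "Germany"],
    "age": [44, 27, 30],
    "salary": [72000, 48000, 54000],
    "purchased": ["No", "Yes", "No"],
})

# Check how many columns there really are before hard-coding a position.
n_rows, n_columns = Dataset.shape

X = Dataset.iloc[:, :-1].values   # all columns except the last
Y = Dataset.iloc[:, -1].values    # the last column, whatever its position is
```

Using -1 instead of a fixed number such as 18 keeps the code working if columns are added to or removed from the CSV.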

Hope this answer helps.


by (19.7k points)
Thanks for the answer!
by (19.9k points)
This worked for me. Thank you so much.
by (19k points)
Thanks for this well-detailed answer!
by (47.2k points)
Indexing is out of bounds here most probably because there are fewer than 19 columns in your Dataset.
+1 vote
by (44.3k points)

According to your code, you are not using the Y variable anywhere. 

Also, as @anurag said, index 18 refers to the 19th column because counting starts from 0. Since you are not using that variable, you can just comment out the line and the code should work.

by (29.3k points)
This works for me thanks
+3 votes
by (108k points)

As @Anurag and @kodee said, this error is caused by:

Y = Dataset.iloc[:,18].values

Indexing is out of bounds here most probably because there are fewer than 19 columns in your Dataset, so the column at position 18 does not exist. The rest of the code you provided doesn't use Y at all, so you can just comment out this line for now.

by (29.8k points)
I agree with you!