
I saw that sklearn lets us use some predefined datasets. For instance, mydataset = datasets.load_digits() gives us an array of the data and an array of the corresponding labels. However, I need to load my own dataset so that I can use it with sklearn. How, and in which format, should I load my data? My file has the following format.







1 Answer


You can use NumPy's genfromtxt function to read the data from the file:

import numpy as np

mydata = np.genfromtxt(filename, delimiter=",")

However, if your file has text columns, using genfromtxt is trickier, since you need to specify the data types.
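As a rough sketch of what that can look like, the snippet below reads a small in-memory CSV with a text label column. The column names (sepal_length, sepal_width, Label) and the sample rows are made up for illustration and are not from the question. Passing dtype=None asks genfromtxt to infer a type for each column instead of listing the types by hand.

```python
import io
import numpy as np

# Hypothetical CSV contents: two numeric feature columns and one text label column.
csv_text = """sepal_length,sepal_width,Label
5.1,3.5,setosa
7.0,3.2,versicolor
6.3,3.3,virginica
"""

# dtype=None infers a type per column; names=True reads field names from the header row.
# encoding="utf-8" makes text columns come back as str rather than bytes.
records = np.genfromtxt(
    io.StringIO(csv_text),
    delimiter=",",
    dtype=None,
    names=True,
    encoding="utf-8",
)

# The result is a structured array, so columns are selected by name.
data = np.column_stack([records["sepal_length"], records["sepal_width"]])
target = records["Label"]
```

Here data is a plain 2-D float array with one row per sample, and target is a 1-D array of label strings. With a real file, the StringIO object would be replaced by the file path.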

It is much simpler with the excellent pandas library:

import pandas as pd

mydata = pd.read_csv(filename)

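# assumes your CSV has a header row and the label column is named "Label"
target = mydata["Label"]

# select all but the last column as data (assumes "Label" is the last column);
# .iloc replaces the old .ix indexer, which has been removed from pandas
data = mydata.iloc[:, :-1]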
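If the label column is not guaranteed to be the last one, dropping it by name is a bit more robust than slicing by position. The following is a minimal sketch under that assumption. The feature names f1 and f2 and the sample rows are invented for the example, and the file is simulated with an in-memory string.

```python
import io
import pandas as pd

# Hypothetical CSV contents with a header row; "Label" holds the class of each sample.
csv_text = """f1,f2,Label
0.1,1.5,cat
0.4,1.1,dog
0.9,0.7,cat
"""

mydata = pd.read_csv(io.StringIO(csv_text))

# Separate the labels from the features by column name rather than by position.
target = mydata["Label"]
data = mydata.drop(columns="Label")

# Convert to NumPy arrays: X has shape (n_samples, n_features), y has shape (n_samples,).
X = data.to_numpy()
y = target.to_numpy()
```

X and y have the same layout as the data and target arrays of the built-in datasets, which is the shape scikit-learn estimators expect for fitting.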

