
I want to create a dataset in the same format as the CIFAR-10 dataset to use with TensorFlow. It should have images and labels. Basically, I'd like to take the CIFAR-10 code, swap in different images and labels, and run it. I haven't found any information online on how to do this, and I am completely new to machine learning.

1 Answer


The following code converts a single image into a CIFAR-10-style binary record:

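from PIL import Image
import numpy as np

# Load the image and force 3-channel RGB (handles grayscale or RGBA input)
im = Image.open('images.jpeg').convert('RGB')
im = np.array(im)  # shape: (height, width, 3), dtype uint8

# Flatten each colour channel in row-major order
r = im[:, :, 0].flatten()
g = im[:, :, 1].flatten()
b = im[:, :, 2].flatten()

label = [1]  # class index of this image (must fit in one byte)

# One record: the label byte, then all red, all green, and all blue values
out = np.array(list(label) + list(r) + list(g) + list(b), np.uint8)
out.tofile("out.bin")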

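This code converts an image into a binary file laid out like a CIFAR-10 record: one label byte, followed by all the red values, then all the green values, then all the blue values. For multiple images, keep appending one such record per image to the same file. For a 427x427 image you should get a file size of 427*427*3 + 1 = 546988 bytes, assuming your picture is RGB with values in the range 0-255. Keep in mind that the original CIFAR-10 images are 32x32, so each of its records is 32*32*3 + 1 = 3073 bytes; to run the CIFAR-10 code unchanged you would most likely need to resize your images to 32x32 first, or else adjust the image size used in that code. Then check that it runs in TensorFlow.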
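For several images, the record layout described above can be sketched roughly as follows. This is only a minimal sketch, not the official CIFAR-10 tooling: the helper names `pack_records` and `unpack_records` are made up for illustration, and it assumes your images are already loaded as a uint8 array of shape (N, 32, 32, 3), for example after resizing each picture to 32x32 with PIL.

```python
import numpy as np

IMAGE_SIZE = 32  # assumption: images were already resized to 32x32
RECORD_BYTES = 1 + 3 * IMAGE_SIZE * IMAGE_SIZE  # 1 label byte + 3072 pixel bytes


def pack_records(images, labels):
    """Pack (N, 32, 32, 3) uint8 RGB images and N labels into one row per record.

    Hypothetical helper, written for illustration only.
    """
    images = np.asarray(images, dtype=np.uint8)
    labels = np.asarray(labels, dtype=np.uint8)
    n = images.shape[0]
    if images.shape != (n, IMAGE_SIZE, IMAGE_SIZE, 3):
        raise ValueError("expected images of shape (N, 32, 32, 3)")
    if labels.shape != (n,):
        raise ValueError("expected exactly one label per image")
    # Move the channel axis first so each row holds all red values,
    # then all green values, then all blue values.
    planes = images.transpose(0, 3, 1, 2).reshape(n, -1)
    # Prepend the label byte to every row.
    return np.concatenate([labels[:, None], planes], axis=1)


def unpack_records(records):
    """Inverse of pack_records: return (images, labels) from flat record bytes."""
    records = np.asarray(records, dtype=np.uint8).reshape(-1, RECORD_BYTES)
    labels = records[:, 0]
    images = records[:, 1:].reshape(-1, 3, IMAGE_SIZE, IMAGE_SIZE)
    return images.transpose(0, 2, 3, 1), labels
```

To write the file you could call `pack_records(images, labels).tofile("my_batch.bin")` (the file name is just an example), and to check it, read it back with `unpack_records(np.fromfile("my_batch.bin", dtype=np.uint8))`. The file size should then be N * 3073 bytes.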


Hope this answer helps you!
