I'm looking at implementing a Caffe CNN which accepts two input images and a label (later perhaps other data) and was wondering if anyone was aware of the correct syntax in the prototxt file for doing this? Is it simply an IMAGE_DATA layer with additional tops? Or should I use separate IMAGE_DATA layers for each?

You should use the HDF5_DATA layer:

HDF5 is a key-value store in which each key is a string and each value is a multi-dimensional array. To use the HDF5_DATA layer, add a key for each top you want to use, and set the value for that key to store the image you want to use. Writing those HDF5 files from Python is easy:

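import h5py
import numpy as np

filelist = []
for i in range(100):
    # get_some_image / get_another_image stand in for your own image loaders
    image1 = get_some_image(i)
    image2 = get_another_image(i)
    filename = '/tmp/my_hdf5%d.h5' % i
    with h5py.File(filename, 'w') as f:
        # reorder axes, assuming images load as (height, width, channels)
        f['data1'] = np.transpose(image1, (2, 0, 1))
        f['data2'] = np.transpose(image2, (2, 0, 1))
    # remember the file so it ends up in the list file below
    filelist.append(filename)

# write one HDF5 file path per line
with open('/tmp/filelist.txt', 'w') as f:
    for filename in filelist:
        f.write(filename + '\n')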
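The np.transpose(..., (2, 0, 1)) calls are there because Caffe blobs store channels before height and width, while most image loaders return (height, width, channels). Here is a small sketch of what that reordering does, using a synthetic array in place of a real image; the helper name to_caffe_layout is made up for illustration:

```python
import numpy as np


def to_caffe_layout(image):
    """Reorder an (H, W, C) image array to (C, H, W)."""
    return np.transpose(image, (2, 0, 1))


# Synthetic stand-in for a loaded image: 4 rows, 6 columns, 3 channels.
image = np.arange(4 * 6 * 3).reshape(4, 6, 3)
chw = to_caffe_layout(image)

print(chw.shape)  # (3, 4, 6)

# Pixel (row=1, col=2) of channel 0 is the same value in both layouts.
assert chw[0, 1, 2] == image[1, 2, 0]
```

As far as I recall, the HDF5 layer treats the first axis of each dataset as the sample index, so if you store one image per dataset you may need a leading axis of size 1, for example chw[np.newaxis]. Check this against your Caffe version.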

In the prototxt, set the source of the HDF5_DATA layer's parameters to '/tmp/filelist.txt', and set the tops to "data1" and "data2" so that they match the keys written to the HDF5 files.
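For reference, a minimal sketch of such a layer definition. It assumes the older layers { ... } prototxt syntax that goes with the HDF5_DATA type name; the layer name and batch_size are placeholders, so pick your own:

```
layers {
  name: "pair_data"        # placeholder name
  type: HDF5_DATA
  top: "data1"             # must match the HDF5 key 'data1'
  top: "data2"             # must match the HDF5 key 'data2'
  hdf5_data_param {
    source: "/tmp/filelist.txt"
    batch_size: 10         # placeholder value
  }
}
```

Newer Caffe versions write this as layer { ... } with type: "HDF5Data", but the tops and hdf5_data_param work the same way. A label would be one more key in each HDF5 file plus one more top with the same name.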

