In Caffe, I am trying to implement a Fully Convolutional Network (FCN) for semantic segmentation. Is there a specific strategy for setting the values of the following hyper-parameters in 'solver.prototxt':

  • test_iter
  • test_interval
  • iter_size
  • max_iter

Does it depend on the number of images you have for your training set? If so, how?

1 Answer

To set these values in a meaningful manner, you need a few more pieces of information about your data:

1. Training set size: the total number of training examples in your dataset. Let's denote this by T.

2. Training batch size: the number of training examples processed together in a single batch. This is usually set by the input data layer in 'train_val.prototxt'; for example, a train batch size of 256 (a sketch of such a layer follows this list). Let's denote this quantity by tb.

3. Validation set size: the total number of examples you set aside for validating your model. Let's denote this by V.

4. Validation batch size: the batch_size value set for the TEST phase, for example 50. Let's call this vb.
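
As a rough illustration, here is a minimal sketch of where tb and vb are set in a 'train_val.prototxt'. The layer names, LMDB paths, and batch sizes below are hypothetical placeholders, not values from the question:

    # Hypothetical data layers in train_val.prototxt.
    # batch_size in the TRAIN phase is tb; in the TEST phase it is vb.
    layer {
      name: "data"
      type: "Data"
      top: "data"
      top: "label"
      include { phase: TRAIN }
      data_param {
        source: "train_lmdb"   # placeholder path
        batch_size: 256        # tb: training batch size
        backend: LMDB
      }
    }
    layer {
      name: "data"
      type: "Data"
      top: "data"
      top: "label"
      include { phase: TEST }
      data_param {
        source: "val_lmdb"     # placeholder path
        batch_size: 50         # vb: validation batch size
        backend: LMDB
      }
    }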

Every once in a while you want an unbiased estimate of your net's performance. To get one, you run the net on the validation set for test_iter iterations. To cover the entire validation set you need test_iter = V/vb. For example, with V = 2,000 validation images and vb = 50, you would set test_iter = 40.

To cover the entire training set (completing an "epoch") you need to run T/tb iterations. Since one usually trains for several epochs,

max_iter = #epochs * T / tb

For example, 50 epochs over T = 25,600 images with tb = 256 gives max_iter = 5,000. If iter_size > 1, Caffe accumulates gradients over iter_size batches before updating the weights, so each solver iteration consumes tb * iter_size examples and the effective batch size is tb * iter_size; scale the formulas accordingly. This is handy for FCNs, which often train with tb = 1 and a larger iter_size because full-image inputs are memory-hungry. Finally, test_interval controls how often (in training iterations) the validation pass runs; a common choice is to validate once per epoch, i.e., test_interval = T/tb.
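
Putting the formulas together, here is a minimal 'solver.prototxt' sketch. All numbers assume a hypothetical setup of T = 25,600 training images, tb = 256, V = 2,000, vb = 50, and 50 epochs; the learning-rate settings are ordinary placeholders, not recommendations:

    # Hypothetical solver.prototxt; assumes T = 25600, tb = 256, V = 2000, vb = 50.
    net: "train_val.prototxt"
    test_iter: 40          # V / vb = 2000 / 50 -> one full validation pass
    test_interval: 100     # T / tb = 25600 / 256 -> validate once per epoch
    iter_size: 1           # effective batch size = tb * iter_size = 256
    max_iter: 5000         # 50 epochs * (T / tb) = 50 * 100
    base_lr: 0.01          # placeholder learning-rate schedule
    lr_policy: "step"
    stepsize: 2000
    momentum: 0.9
    weight_decay: 0.0005
    snapshot: 1000
    snapshot_prefix: "snapshots/fcn"
    solver_mode: GPU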

Hope this answer helps you!
