I have incorporated the InfogainLossLayer as suggested by Shai. I've also added another custom layer that builds the infogain matrix H based on the imbalance in the current batch.
Currently, the matrix is configured as follows:
H(i, j) = 0 if i != j
H(i, i) = 1 - f(i), where f(i) is the frequency of class i in the batch
I'm planning on experimenting with different configurations for the matrix in the future.
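For reference, here is a minimal numpy sketch of how such a per-batch H could be computed (the function name and interface are my own, not part of Caffe; in an actual Caffe Python layer this would go in the layer's forward/reshape logic):

```python
import numpy as np

def build_infogain_matrix(labels, num_classes):
    """Build a per-batch infogain matrix H with
    H[i, i] = 1 - f(i), where f(i) is the frequency of
    class i in the batch, and H[i, j] = 0 for i != j."""
    counts = np.bincount(labels, minlength=num_classes)
    freqs = counts / float(len(labels))
    return np.diag(1.0 - freqs)

# Example: a 10:1 imbalanced batch of 11 samples
labels = np.array([0] * 10 + [1])
H = build_infogain_matrix(labels, num_classes=2)
# Majority class gets a small diagonal weight (1/11),
# minority class a large one (10/11).
```

The resulting H then has to be handed to the InfogainLoss layer; depending on your Caffe version this can be done via a third bottom blob or a stored matrix file, so check which mechanism your build supports.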
I have tested this on a 10:1 imbalance. The results show that the network is now learning something useful (numbers after 30 epochs):
Accuracy: ~70% (down from ~97%);
Precision: ~20% (up from 0%);
Recall: ~60% (up from 0%).
These numbers were reached at around 20 epochs and didn't change significantly after that.
!! The results stated above are merely a proof of concept; they were obtained by training a simple network on a 10:1 imbalanced dataset. !!