Information Gain and Entropy

Question

asked Jun 19, 2019 in AI and Deep Learning by ashely (50.2k points)

I recently read this question regarding information gain and entropy. I think I have a semi-decent grasp on the main idea, but I'm curious, what to do with situations such as follows:

If we have a bag of 7 coins, 1 of which is heavier than the others, and 1 of which is lighter than the others, and we know the heavier coin + the lighter coin is the same as 2 normal coins, what is the information gain associated with picking and weighing two random coins them against each other?

Our goal here is to identify the two odd coins. I've been thinking the problem over for a while, and can't frame it correctly in a decision tree, or any other way for that matter. Any help?

EDIT: I understand the formula for entropy and the formula for information gain. What I don't understand is how to frame this problem in a decision tree format.

EDIT 2: Here is where I'm at so far:

Assuming we pick two coins and they both end up weighing the same, we can assume our new chances of picking H+L to come out to 1/5 * 1/4 = 1/20, easy enough.

Assuming we pick two coins and the left side is heavier. There are three different cases where this can occur:

HM: Which gives us 1/2 chance of picking H and a 1/4 chance of picking L: 1/8 HL: 1/2 chance of picking high, 1/1 chance of picking low: 1/1 ML: 1/2 chance of picking low, 1/4 chance of picking high: 1/8

However, the odds of us picking HM are 1/7 * 5/6 which is 5/42

The odds of us picking HL are 1/7 * 1/6 which is 1/42

And the odds of us picking ML are 1/7 * 5/6 which is 5/42

If we weigh the overall probabilities with these odds, we are given:

(1/8) * (5/42) + (1/1) * (1/42) + (1/8) * (5/42) = 3/56.

The same holds true for option B.

option A = 3/56

option B = 3/56

option C = 1/20

However, option C should be weighted heavier because there is a 5/7 * 4/6 chance to pick two mediums. So I'm assuming from here I weight THOSE odds.

I am pretty sure I've messed up somewhere along the way, but I think I'm on the right path!

EDIT 3: More stuff.

Assuming the scale is unbalanced, the odds are (10/11) that only one of the coins is the H or L coin, and (1/11) that both coins are H/L

Therefore we can conclude:

(10 / 11) * (1/2 * 1/5) and

(1 / 11) * (1/2)

EDIT 4: Going to go ahead and say that it is a total of 4/42 increase.

1 Answer

vinita · Answer 1 · 2019-06-19T11:19:46+0000

Your question is interesting but before going into the depth of information gain let me explain that what basically entropy does:

Entropy controls how a Decision Tree decides to split the data. It actually affects how a Decision Tree draws its boundaries. Information gain (IG) helps us to measure how much “information” a feature gives us about the class.

IG=entropy(parent)-[weights average]*entropy(child)

Now coming to your problem, suppose the values of the coins H (for heavy), L (for light), and M (for medium). Here M is for the other 5 coins.

When you pick 2 coins at random you can get (out of 7 * 6 == 42 possibilities including order) HL, LH (one each), HM, MH, LM, ML (5 each), MM (5 * 4 == 20 cases) -- 2 + 20 + 20=42.

In the weighting you get 3 possible results, call them A (left one is heavier), B (right one is heavier), C (both have equal weight).

HL, HM, and ML, 11 cases, will be A;

LH, MH, and LM, 11 cases, will be B;

MM, 20 cases,will be C.

So A and B aren't really distinguishable (which one is left, which one is right), so we have 22 cases where the weight will be different, 20 where they will be equal -- it's a good sign that the cases giving each result are in pretty close numbers!

You are asked to pick the H and L choice. If you did it at random before the experiment, what would be your chances? 1 in 7 for the random pick of the H; given that succeeds 1 in 6 for the pick of the L -- overall 1 in 42.

In the case of C, you can rule out those two coins and you're left with a mystery H, a mystery L, and three Ms -- so if you picked at random you'd have 1 in 5 to pick H, if successful 1 in 4 to pick L, overall 1 in 20 -- your success chances have slightly more than doubled.

It's trickier to see "what next" for the A (and equivalently B) cases because they're several, as listed above (and, less obviously, not equiprobable...), but obviously you won't pick the known-lighter coin for H (and vice versa) and if you pick one of the 5 unweighted coins for H (or L) only one of the weighted coins is a candidate for the other role (L or H respectively).

If you want know about Artificial Intelligence and Deep Learning then you can watch this video:

Information Gain and Entropy

1 Answer

Related questions

Browse Categories

Browse By Domains

Popular Courses

Popular Tutorials

Popular Resources