2 views

Consider the following training data sets.

+-------+-------+----------+-------------+

| Size  | Color | Shape    | Class/Label |

+=======+=======+==========+=============+

| big   | red | circle   | No |

| small | red   | triangle | No         |

| small | red   | circle | Yes         |

| big   | blue | circle   | No |

| small | blue  | circle | Yes         |

+-------+-------+----------+-------------+

I would like to understand how the algorithm proceeds when it starts with a negative example and when two negative examples come together.

This is not an assignment question by the way.

Examples with other data sets are also welcome! This is to understand the negative part of the algorithm.

+1 vote
by (33.1k points)

G0 = {<?, ?, ?>}

S0 = {<0, 0, 0>}

If you take a negative example, you need to remove them from any hypothesis incompatible with the current observation, then replace any inconsistent hypothesis in G with its minimal specializations that are consistent with the observation but still more general than some member of the hypothesis.

So for your first (negative) example, (big, red, circle), the minimal specializations would make a new hypothesis space

G1 = {<small, ? , ?>, <?, blue, ?>, <?, ?, triangle>}

S1 = S0 = {<0, 0, 0>}

Here S is unchanged. In the next example, (small, red, triangle), which is also negative, you will need another G. Here the second hypothesis in G1 didn’t match the new observation so only the first and third hypotheses in G1 need to be specialized. That would yield

G2 = {<small, blue, ?>, <small, ?, circle>, <?, blue, ?>, <big, ?, triangle>, <?, blue, triangle>}

Here the first and last hypotheses in G2 above are specializations of the middle hypothesis (<?, blue, ?>), we would drop those two:

G2 = {<small, ?, circle>, <?, blue, ?>, <big, ?, triangle>}

S2 = S1 = S0 = {<0, 0, 0>}

For the positive (small, red, circle) observation, you should generalize S and remove anything in G that is inconsistent:

G3 = {<small, ?, circle>}

S3 = {<small, red, circle>}

(big, blue, circle) is the next negative example. But since it is not consistent, then there is nothing to do so

G4 = G3 = {<small, ?, circle>}

S4 = S3 = {<small, red, circle>}

Now you would have the positive example of (small, blue, circle), which requires you to generalize S to make it consistent

G5 = {<small, ?, circle>}

S5 = {<small, ?, circle>}

Since G and S are equal, now you have learned the concept of "small circles".