# A simple explanation of Naive Bayes Classification

Can someone explain the process of Naive Bayes in simple English? How is the training data related to the actual dataset? Please explain with an example.

To understand Naive Bayes classification, we first need to understand Bayes' theorem.

Bayes' theorem is built on conditional probability: the probability that something occurs given that something else has already occurred.

Naive Bayes is a kind of classifier that applies Bayes' theorem.

Naive Bayes predicts a membership probability for each class, that is, the probability that a given record or data point belongs to that class. The class with the highest probability is taken as the prediction. This is known as the Maximum A Posteriori (MAP) decision rule.

The Naive Bayes classifier assumes that the features are independent of one another given the class, so the presence or absence of one feature does not influence the presence or absence of another. This assumption is what makes it "naive".
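The independence assumption and the MAP rule can be written compactly. The notation here is introduced only for illustration and is not from the original answer: C stands for a class and x_1, ..., x_n for the observed features.

```latex
P(C \mid x_1, \dots, x_n) \;\propto\; P(C) \prod_{i=1}^{n} P(x_i \mid C),
\qquad
\hat{C} = \arg\max_{C} \; P(C) \prod_{i=1}^{n} P(x_i \mid C)
```

The product over features is only valid because the features are assumed independent given the class.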

Formula:

P(H|E) = (P(E|H) * P(H)) / P(E)

where:

P(H) is the prior probability of the hypothesis.

P(E) is the probability of the evidence (regardless of the hypothesis).

P(E|H) is the probability of the evidence given that the hypothesis is true.

P(H|E) is the probability of the hypothesis given the evidence.
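As a rough sketch, the formula translates directly into a small Python function. The function name `posterior` and the input numbers are hypothetical, chosen only to show the arithmetic.

```python
def posterior(prior, likelihood, evidence):
    """Bayes' theorem: P(H|E) = P(E|H) * P(H) / P(E)."""
    if evidence <= 0:
        raise ValueError("P(E) must be positive")
    return likelihood * prior / evidence


# Hypothetical inputs: P(H) = 0.5, P(E|H) = 0.8, P(E) = 0.5
p = posterior(prior=0.5, likelihood=0.8, evidence=0.5)
print(p)
```

With these inputs the posterior works out to 0.8 * 0.5 / 0.5 = 0.8.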

Example:

Let us assume we have data on 1000 fruits: some are apples, some are bananas, and the rest are other fruit. Each fruit has been described using three characteristics:

• Round
• Sweet
• Red

Training set:

| Type        | Round | Not Round | Sweet | Not Sweet | Red | Not Red | Total |
|-------------|-------|-----------|-------|-----------|-----|---------|-------|
| Apple       | 400   | 100       | 250   | 250       | 450 | 50      | 500   |
| Banana      | 0     | 300       | 150   | 150       | 300 | 0       | 300   |
| Other fruit | 100   | 100       | 150   | 50        | 150 | 50      | 200   |
| Total       | 500   | 500       | 550   | 450       | 900 | 100     | 1000  |

Step 1: Find the "prior" probability of each class, that is, its share of the 1000 fruits. Each class total is its Round count plus its Not Round count.

P(Apple) = 500 / 1000 = 0.50

P(Banana) = 300 / 1000 = 0.30

P(Other Fruit) = 200 / 1000 = 0.20
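The priors can be checked with a few lines of Python. This is a minimal sketch: the dictionary names are made up for illustration, and each class total is taken as its Round plus Not Round count from the training set.

```python
# (Round, Not Round) counts per class, from the training set.
round_counts = {"Apple": (400, 100), "Banana": (0, 300), "Other": (100, 100)}

# Class size = Round + Not Round; prior = class size / number of fruits.
class_totals = {cls: r + nr for cls, (r, nr) in round_counts.items()}
n_fruits = sum(class_totals.values())
priors = {cls: total / n_fruits for cls, total in class_totals.items()}

print(priors)
```

The priors sum to 1 because every fruit belongs to exactly one class.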

Step 2: Find the probability of each piece of evidence.

P(Round) = 500 / 1000 = 0.50

P(Sweet) = 550 / 1000 = 0.55

P(Red) = 900 / 1000 = 0.90

Step 3: Find the likelihood of each piece of evidence given each class, for example:

P(Round|Apple) = 400 / 500 = 0.8

P(Round|Banana) = 0 / 300 = 0

P(Red|Other Fruit) = 150 / 200 = 0.75

P(Not Red|Other Fruit) = 50 / 200 = 0.25
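The full set of likelihoods can be tabulated the same way. In this sketch the `counts` layout and the names are assumptions made for illustration, and each class size is again taken as Round plus Not Round.

```python
FEATURES = ("Round", "Sweet", "Red")

# Number of fruits in each class that have each feature.
# "total" is the class size (Round + Not Round).
counts = {
    "Apple":  {"Round": 400, "Sweet": 250, "Red": 450, "total": 400 + 100},
    "Banana": {"Round": 0,   "Sweet": 150, "Red": 300, "total": 0 + 300},
    "Other":  {"Round": 100, "Sweet": 150, "Red": 150, "total": 100 + 100},
}

# P(feature | class) = count of class members with the feature / class size.
likelihoods = {
    cls: {f: row[f] / row["total"] for f in FEATURES}
    for cls, row in counts.items()
}

for cls, probs in likelihoods.items():
    print(cls, probs)
```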

Step 4: Put the values into the equation.

P(Apple|Round, Sweet, Red)

= (P(Round|Apple) * P(Sweet|Apple) * P(Red|Apple) * P(Apple)) / P(evidence)

= (0.8 * 0.5 * 0.9 * 0.5) / P(evidence)

= 0.18 / P(evidence)

P(Banana|Round, Sweet, Red) = 0, because P(Round|Banana) = 0.

P(Other Fruit|Round, Sweet, Red)

= (P(Round|Other Fruit) * P(Sweet|Other Fruit) * P(Red|Other Fruit) * P(Other Fruit)) / P(evidence)

= (100/200 * 150/200 * 150/200 * 200/1000) / P(evidence)

= (0.5 * 0.75 * 0.75 * 0.2) / P(evidence)

= 0.05625 / P(evidence)

P(evidence) is the same for every class, so only the numerators need to be compared.

Since 0.18 is the largest of the three numerators, this round, sweet, red fruit is classified as most likely an Apple.
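The comparison can be sketched end to end in Python. This is an illustrative sketch, not a library implementation: the function name `classify` is made up, the probabilities are taken from the cell counts of the training set (class size = Round + Not Round), and the scores are normalised by their sum so that they add up to 1.

```python
from math import prod

# Priors and likelihoods derived from the training-set counts.
priors = {"Apple": 0.5, "Banana": 0.3, "Other": 0.2}
likelihoods = {
    "Apple":  {"Round": 0.8, "Sweet": 0.5,  "Red": 0.9},
    "Banana": {"Round": 0.0, "Sweet": 0.5,  "Red": 1.0},
    "Other":  {"Round": 0.5, "Sweet": 0.75, "Red": 0.75},
}


def classify(observed):
    """Return the MAP class and the normalised posterior of every class."""
    # Numerator of Bayes' theorem under the naive independence assumption.
    scores = {
        cls: priors[cls] * prod(likelihoods[cls][f] for f in observed)
        for cls in priors
    }
    total = sum(scores.values())
    if total == 0:
        raise ValueError("every class has zero probability for this evidence")
    posteriors = {cls: score / total for cls, score in scores.items()}
    return max(posteriors, key=posteriors.get), posteriors


label, posteriors = classify(["Round", "Sweet", "Red"])
print(label, posteriors)
```

Dividing by the sum of the scores does not change which class wins; it only rescales the numerators into probabilities.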
