One of the simplest classification algorithms used in predictive analytics is the Naive Bayes technique. It is based on the principle of conditional probability. Despite the name, the technique is quite robust and performs on par with more advanced classification schemes for many types of data. There are two things to keep in mind when using Naive Bayes:

- It works only with categorical predictors; numerical predictors must be categorized or binned before use.
- It assumes the predictors are independent, and thus cannot detect or account for relationships between the predictors, unlike a decision tree, for example.

In this article we dig into the fundamentals behind Naive Bayes. In a future article we will demonstrate a real-world application of the technique using RapidMiner.

Suppose a factory has two machines, A and B, which can both produce a part, X. Machine A produces 70% of all X's and Machine B produces the remaining 30%. Machine A has a 5% chance of producing a defective part, and Machine B has a 10% chance of producing a defective part. If we randomly pick a part X from the inventory and find that it has a defect, what is the probability that this part was produced by Machine A?

The classification problem here is to determine whether a part X was produced by A or B. The predictor, or "independent" variable, is Defect (Y/N).

**The Naive Rule:**

Because Machine A produces the majority of the parts, we could simply say that a randomly picked part was probably made by A. This gives us a "naive" classification rule based on *simple majority*.
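The naive rule amounts to always predicting the majority class. A minimal sketch, using the production shares from the example:

```python
# Naive rule: ignore the Defect predictor entirely and always predict
# the machine that produces the majority of parts.
priors = {"A": 0.70, "B": 0.30}  # share of parts made by each machine

prediction = max(priors, key=priors.get)
print(prediction)  # prints "A" for every part, defective or not
```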

**Bayes Rule:**

Bayes' rule fine-tunes this logic by using the additional information available from a "predictor": Defect. The following graphic helps you understand the flow.

As you can see, Bayes' rule fine-tunes the simplistic classification based on majority class membership by leveraging the new information in the Defect variable. Note that, compared with the naive rule, the probability that a defective part was made by B increases, in accordance with B's higher defect rate.
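The calculation behind this can be sketched directly from the numbers in the example: the prior probabilities P(A) = 0.70 and P(B) = 0.30, and the defect rates P(defect | A) = 0.05 and P(defect | B) = 0.10.

```python
# Bayes' rule applied to the two-machine example.
p_machine = {"A": 0.70, "B": 0.30}         # P(machine)
p_defect_given = {"A": 0.05, "B": 0.10}    # P(defect | machine)

# Law of total probability: P(defect) = sum over machines of
# P(machine) * P(defect | machine) = 0.70*0.05 + 0.30*0.10 = 0.065
p_defect = sum(p_machine[m] * p_defect_given[m] for m in p_machine)

# Bayes' rule: P(machine | defect) = P(machine) * P(defect | machine) / P(defect)
posterior = {m: p_machine[m] * p_defect_given[m] / p_defect for m in p_machine}

print(posterior)  # A: 0.035/0.065 ~ 0.538, B: 0.030/0.065 ~ 0.462
```

So even though A makes 70% of all parts, the defect information pulls the probability that a defective part came from A down to about 54%, while B's share rises to about 46%.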

Advanced analytics techniques using Bayes' rule are termed Naive Bayes algorithms; they can accept multiple predictors and build a fairly robust classification using probability information from the different independent variables. The key point is that each variable is treated independently and no interaction between them is assumed. We will explore this in detail in the next article.

*If you like such tutorials, consider signing up for visTASC, our FREE online analytics portal that helps you choose the right technique for your analytics problems.*