The Theory Behind Naive Bayes | NLP Python Classification Example

Zahra Ahmad
5 min read · Jan 7, 2022

Introduction

Naive Bayes is a great supervised machine learning approach for tasks such as classification.

Naive Bayes has been shown to deliver strong results, and its best quality is its simplicity: it is easy to train and does not demand heavy computational resources compared to a Support Vector Machine or to embedding-based models such as BERT.

Given its simplicity and efficiency, it is well worth learning and adding to your ML portfolio. Whenever you have a classification task, start with Naive Bayes; if the results are not good enough, then consider something more complicated.
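To make this concrete, here is a minimal sketch of such a baseline using scikit-learn's MultinomialNB with simple bag-of-words counts. The tiny sentiment dataset below is made up purely for illustration:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# A toy dataset (made up for this sketch): short reviews and their labels.
texts = [
    "great movie, loved it",
    "terrible plot, boring",
    "loved the acting",
    "boring and terrible",
]
labels = ["pos", "neg", "pos", "neg"]

# Turn each text into a vector of word counts (bag of words).
vectorizer = CountVectorizer()
X = vectorizer.fit_transform(texts)

# Train the Naive Bayes classifier on the count vectors.
clf = MultinomialNB()
clf.fit(X, labels)

# Classify a new, unseen text.
print(clf.predict(vectorizer.transform(["loved it"])))  # ['pos']
```

Training takes a fraction of a second even on large corpora, which is exactly the efficiency argument made above.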

Some people get confused when reading about the Naive Bayes classifier because what they actually want to learn about is Gaussian Naive Bayes, which is a different variant and is not covered in this article; check [1] for more.

What is Naive Bayes?

Naive Bayes is based on Bayes' theorem, also known as Bayes' rule or Bayes' law. For a class y and a feature vector x, the rule states:

P(y | x) = P(x | y) · P(y) / P(x)

Note that y is the class that we are predicting. Our goal is to estimate P(y = c | x): given a data instance x (a set of features), what is the probability that the class y is equal to c?
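A short worked example of the rule, with made-up numbers: suppose 30% of messages are spam, the word "free" appears in 60% of spam messages and in 10% of non-spam messages, and we want the probability a message is spam given that it contains "free".

```python
# All probabilities below are assumed values for illustration.
p_spam = 0.3             # prior P(y = spam)
p_free_given_spam = 0.6  # likelihood P(x | y = spam)
p_free_given_ham = 0.1   # likelihood P(x | y = ham)

# Evidence P(x) via the law of total probability.
p_free = p_free_given_spam * p_spam + p_free_given_ham * (1 - p_spam)

# Posterior P(y = spam | x) by Bayes' rule.
p_spam_given_free = p_free_given_spam * p_spam / p_free
print(round(p_spam_given_free, 2))  # 0.72
```

Even though "free" is a strong spam signal here, the posterior is tempered by the prior: most messages are not spam to begin with.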
