For the text classification problems the first challenge would be cleaning our data and convert that in a format which can be easily understood by the computer. Consider we have to find a genre of a book or a movie based on it’s content, the first thing we have to do is preparing the training dataset. Here we have two approaches to do that, one is a simple bag of words method and the other one is Doc2Vec. Let’s explore both the methods for predicting the movie genre based on it’s subtitle.