SENTIMENT ANALYSIS ON IMDB

INTRODUCTION:
Sentiment analysis is the process of using natural language processing, text analysis, and statistics to analyze customer sentiment. The best businesses understand the sentiment of their customers: what people are saying, how they are saying it, and what they mean. Customer sentiment can be found in tweets, comments, reviews, or other places where people mention your brand. Sentiment analysis is the domain of understanding these emotions with software, and it is a must-understand for developers and business leaders in a modern workplace.
As with many other fields, advances in deep learning have brought sentiment analysis into the foreground of cutting-edge algorithms. Today we use natural language processing, statistics, and text analysis to extract the sentiment of a text and classify it into positive, negative, or neutral categories.
This project analyzes the sentiment expressed by people in their reviews of movies they have watched. For this we need a dataset, which is available from Kaggle or from http://ai.stanford.edu/~amaas/data/sentiment/.
This dataset contains 50,000 movie reviews that have been pre-labeled with “positive” and “negative” (or 0 and 1) sentiment class labels based on the review content. Negative reviews have a score of 4 or less out of 10, while positive reviews have a score of 7 or more out of 10. Neutral reviews are not included. The 50,000 reviews are divided evenly into a training set and a test set. Besides these, there are additional movie reviews that are unlabeled. We will only be using the raw labeled movie reviews for our analyses.
DATA PREPROCESSING:
The following steps were performed for text preprocessing:
- Remove punctuation
- Tokenize sentences
- Remove stopwords
- Lemmatize words
- Compute TF-IDF vectors
- Train ML models
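To make the first steps concrete, here is a minimal sketch in plain Python. It is not the code used in this project: the stopword list is a tiny hand-made assumption, tokenization is a simple whitespace split, the lemmatization step is left out, and the TF-IDF formula (tf = count / document length, idf = ln(N / df)) is just one common textbook variant. Libraries typically apply smoothing and normalization, so their numbers will differ.

```python
import math
import string
from collections import Counter

# Tiny illustrative stopword list (an assumption; real lists are much longer).
STOPWORDS = {"a", "an", "the", "is", "was", "it", "this", "and", "of", "to"}


def preprocess(text):
    """Lowercase, strip punctuation, tokenize on whitespace, drop stopwords."""
    no_punct = text.lower().translate(str.maketrans("", "", string.punctuation))
    return [tok for tok in no_punct.split() if tok not in STOPWORDS]


def tfidf(docs):
    """Return one {term: weight} dict per tokenized document.

    Uses tf = count / len(doc) and idf = ln(N / df).
    """
    n_docs = len(docs)
    # Document frequency: in how many documents does each term appear?
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors


reviews = [
    "This movie was great, a great story!",
    "The movie was boring.",
]
tokens = [preprocess(r) for r in reviews]
weights = tfidf(tokens)
print(tokens[0])   # ['movie', 'great', 'great', 'story']
print(weights[1])  # "movie" appears in every review, so its weight is 0.0
```

Notice that a word appearing in every review ("movie") gets a weight of zero, while words that set one review apart from the others ("great", "boring") get the highest weights.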
LEMMATIZE WORDS:
Lemmatization reduces a word to its dictionary form (its lemma), for example "went" to "go". Because the lemma depends on the type of the word (verb, noun, and so on), the same word can map to different lemmas, which helps solve the disambiguation problem. The trade-off is that it demands more computational power. (It can be useful if you want to build a dictionary-based NLP system.)
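The idea can be illustrated with a toy lookup table. This is only a sketch: the `LEMMAS` table and the `lemmatize` helper are hypothetical names made up for this example, and the handful of entries are hand-picked. A real project would more likely rely on a library lemmatizer (for instance NLTK's `WordNetLemmatizer`), which this post does not confirm was used.

```python
# Toy lemma table keyed by (word, part of speech); entries are illustrative only.
LEMMAS = {
    ("went", "verb"): "go",
    ("better", "adj"): "good",
    ("movies", "noun"): "movie",
    ("meeting", "verb"): "meet",
    ("meeting", "noun"): "meeting",
}


def lemmatize(word, pos):
    """Look up the dictionary form of a word for a given part of speech.

    Unknown (word, pos) pairs are returned unchanged.
    """
    return LEMMAS.get((word.lower(), pos), word)


print(lemmatize("went", "verb"))     # go
print(lemmatize("meeting", "verb"))  # meet
print(lemmatize("meeting", "noun"))  # meeting
```

The two "meeting" calls show why the part of speech matters: the same surface form resolves to different lemmas depending on how it is used.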
TRAIN ML MODELS:
In this project we have used two approaches:
1) Supervised Learning
2) Unsupervised Learning
The models used in supervised learning are:
1. Logistic Regression
2. Stochastic gradient descent
3. Random Forest Classifier
4. AdaBoost Classifier
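As a rough illustration of the first two entries, the sketch below trains a logistic regression classifier with stochastic gradient descent in plain Python. It is an assumption-laden toy, not the project's code: the two input features (imagined TF-IDF weights for "great" and "boring"), the learning rate, and the number of epochs are all made up for the example. Libraries such as scikit-learn provide ready-made classes for all four models listed above.

```python
import math
import random


def sigmoid(z):
    """Squash a linear score into a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic_sgd(X, y, lr=0.5, epochs=200, seed=0):
    """Fit weights and a bias by updating on one example at a time."""
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    b = 0.0
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)  # visit the examples in a new random order each epoch
        for i in order:
            pred = sigmoid(sum(wj * xj for wj, xj in zip(w, X[i])) + b)
            err = pred - y[i]  # gradient of the log loss w.r.t. the linear score
            w = [wj - lr * err * xj for wj, xj in zip(w, X[i])]
            b -= lr * err
    return w, b


def predict(w, b, x):
    """Return 1 (positive) when the predicted probability is at least 0.5."""
    return 1 if sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b) >= 0.5 else 0


# Hypothetical features per review: [weight of "great", weight of "boring"]
X = [[0.9, 0.0], [0.7, 0.1], [0.0, 0.8], [0.1, 0.9]]
y = [1, 1, 0, 0]  # 1 = positive, 0 = negative

w, b = train_logistic_sgd(X, y)
print([predict(w, b, x) for x in X])
```

After training, the weight for "great" should be positive and the weight for "boring" negative, which is what lets the model separate the two classes.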
The models used in unsupervised learning are lexicon-based:
1. AFINN Lexicon
2. VADER Lexicon
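A lexicon-based approach needs no training labels: each word carries a predefined sentiment value, and a review is scored by combining the values of its words. The sketch below mimics the AFINN style of integer word scores. The mini-lexicon and the helper names `lexicon_score` and `lexicon_label` are hypothetical, and the values are not copied from the real AFINN list. VADER works on a similar principle but adds rules for things like negation and intensifiers, which this sketch does not attempt.

```python
# Hypothetical mini-lexicon in the AFINN style (integer valence per word).
# These entries are illustrative and are not copied from the real AFINN list.
LEXICON = {"great": 3, "good": 2, "love": 3, "boring": -2, "bad": -3, "awful": -3}


def lexicon_score(tokens):
    """Sum the valence of every token found in the lexicon."""
    return sum(LEXICON.get(tok, 0) for tok in tokens)


def lexicon_label(tokens, threshold=0):
    """Label a review positive when its total score exceeds the threshold."""
    return "positive" if lexicon_score(tokens) > threshold else "negative"


review = "a great story but a boring and awful ending".split()
print(lexicon_score(review))  # 3 - 2 - 3 = -2
print(lexicon_label(review))  # negative
```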
COMPARING THE MODELS:
- Logistic Regression

- Stochastic gradient descent

- Random Forest Classifier

- AdaBoost Classifier

- AFINN Lexicon
- VADER Lexicon

CONCLUSION: