Sentiment Analysis: An Assessment of Diverse Methods
Keywords:
Sentiment Analysis; Mobile Phone Reviews; TF-IDF; Multinomial Naïve Bayes; Support Vector Machine; Logistic Regression; Decision Tree; Random Forest Classifier.
Abstract
Digitalization has made consumer reviews an unavoidable part of the online marketplace. Analysing the reviews a product receives is a crucial need: reviews shape the overall perception of a product and give customers insight into an item they may intend to buy. A single product, however, can accumulate a colossal number of reviews, which makes it difficult for a customer to choose. A suitable mechanism that helps both buyers and sellers analyze products can therefore greatly ease this decision problem. To that end, we compared five machine learning classifiers on Amazon phone reviews: Multinomial Naïve Bayes, Support Vector Machine, Logistic Regression, Decision Tree, and Random Forest. We used TF-IDF feature extraction to convert the textual data into numerical form and assessed the models with precision, recall, F1-score, and accuracy. Our evaluation and analysis show that Random Forest gives the best results on the chosen data. We further tuned the hyperparameters of the Random Forest using two techniques, out-of-bag error and 3-fold cross-validation, and observed an improvement in accuracy with the out-of-bag method.
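As an illustration of the two building blocks named above, TF-IDF weighting and the four evaluation metrics, the following is a minimal sketch and not the implementation used in this study. It assumes the textbook weighting tf × log(N / df), whitespace tokenization, and binary labels; the function names `tfidf` and `binary_metrics` and the sample reviews are hypothetical.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return one {term: weight} dict per document using tf * log(N / df).

    Assumption: the plain textbook formula with whitespace tokenization.
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        vectors.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors


def binary_metrics(y_true, y_pred, positive=1):
    """Compute precision, recall, F1, and accuracy for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)
    # Guard against empty denominators by falling back to 0.0.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    accuracy = correct / len(pairs) if pairs else 0.0
    return {"precision": precision, "recall": recall,
            "f1": f1, "accuracy": accuracy}


if __name__ == "__main__":
    # Hypothetical two-review corpus, for illustration only.
    reviews = ["good phone", "bad phone"]
    print(tfidf(reviews))
    print(binary_metrics([1, 1, 0, 0], [1, 0, 0, 1]))
```

In this sketch a term that occurs in every review, such as "phone" above, receives a weight of zero, which is the intended effect of the inverse document frequency factor. Library implementations such as scikit-learn's `TfidfVectorizer` apply a smoothed variant of the formula by default, so their weights will differ from the ones computed here.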