Sentiment-Analysis-of-hotel-reviews

PROBLEM STATEMENT

This project detects the sentiment of a given input text corpus. We train a classification model on a large hotel review dataset, labelling each review according to the rating given by the customer, and then measure the accuracy of the trained model on test data. Finally, we plot graphs to illustrate the results.
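The labelling and accuracy steps can be sketched as follows. This is only an illustration, not the project's actual code: the function names are hypothetical, and the cut-off of 5.0 for treating a rating as a negative review is an assumption, since no threshold is stated here.

```python
def label_from_rating(score, threshold=5.0):
    """Return 1 (negative review) if the rating is below the threshold, else 0.

    The threshold value is an assumption made for this sketch.
    """
    return 1 if score < threshold else 0


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy on empty input")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Made-up ratings and predictions, for illustration only.
ratings = [9.2, 3.8, 7.5, 4.6]
y_true = [label_from_rating(r) for r in ratings]
y_pred = [0, 1, 1, 1]  # stand-in for the output of a trained classifier
print(accuracy(y_true, y_pred))
```

In practice `y_pred` would come from the trained classification model evaluated on the test split.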

Dataset

The hotel review dataset is loaded from Kaggle (https://www.kaggle.com/jiashenliu/515k-hotel-reviews-data-in-europe). It is stored in tabular form, with rows as samples and columns as features: 17 features (attributes) and 515378 samples. We start by loading the raw data. Each textual review is split into a positive part and a negative part; we join the two parts back into a single text so that we start from the raw review alone, with no other information.
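A minimal sketch of the loading and grouping step, using only the standard library. The column names `Positive_Review`, `Negative_Review`, and `Reviewer_Score`, the helper `load_reviews`, and the two sample rows are assumptions made for illustration; check them against the header of the downloaded CSV.

```python
import csv
import io

# Tiny in-memory stand-in for the real CSV (rows are made up).
SAMPLE_CSV = """Positive_Review,Negative_Review,Reviewer_Score
Great location,Room was small,7.5
Friendly staff,Noisy street,9.6
"""


def load_reviews(handle):
    """Read rows from a CSV file handle and join both review parts into one text."""
    rows = []
    for row in csv.DictReader(handle):
        # Concatenate the negative and positive parts into a single raw review.
        text = f"{row['Negative_Review']} {row['Positive_Review']}".strip()
        rows.append({"review": text, "score": float(row["Reviewer_Score"])})
    return rows


reviews = load_reviews(io.StringIO(SAMPLE_CSV))
for r in reviews:
    print(r["score"], r["review"])
```

To run this on the real data, pass an open file instead of the in-memory sample, for example `open("Hotel_Reviews.csv", newline="", encoding="utf-8")`, where the file name is again an assumption.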