Open Journal of Statistics

Volume 12, Issue 3 (June 2022)

ISSN Print: 2161-718X   ISSN Online: 2161-7198

Google-based Impact Factor: 0.53

A Statistical Analysis of Textual E-Commerce Reviews Using Tree-Based Methods

PP. 357-372
DOI: 10.4236/ojs.2022.123023

ABSTRACT

With the growing interest in e-commerce shopping, customer reviews have become one of the most important indicators of customer satisfaction with products, which makes Text Mining increasingly relevant. This study is based on the Women's Clothing E-Commerce Reviews database, which consists of reviews written by real customers. The aim of this paper is to apply a Text Mining approach to this set of customer reviews, classifying each review as either positive or negative. Four tree-based methods were applied to the classification problem: Classification Tree, Random Forest, Gradient Boosting and XGBoost. The dataset was split into training and test sets. The results indicate that Random Forest overfits; XGBoost overfits if the number of trees is too high; Classification Tree is good at detecting negative reviews but poor at detecting positive reviews; and Gradient Boosting shows stable values, with quality measures above 77% on the test dataset. The applied methods agree on which terms are important for classification.
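The comparison described in the abstract rests on quality measures computed separately for the training and test sets. The sketch below illustrates one way such measures could be computed. It assumes accuracy, sensitivity and specificity as the measures (the abstract does not name them), treats a positive review as label 1, and uses made-up toy labels rather than the paper's data. The function names `quality_measures` and `train_test_gap` are hypothetical.

```python
def quality_measures(y_true, y_pred, positive=1):
    """Accuracy, sensitivity and specificity for binary labels.

    Sensitivity is the share of positive reviews detected;
    specificity is the share of negative reviews detected.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
    }


def train_test_gap(train, test):
    """Difference (train minus test) for each measure.

    A large positive gap is one common symptom of overfitting.
    """
    return {name: train[name] - test[name] for name in train}


# Toy labels (assumed for illustration only, not taken from the paper).
y_train_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_train_pred = [1, 1, 1, 1, 0, 0, 0, 0]  # perfect fit on training data
y_test_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_test_pred = [1, 1, 0, 0, 0, 0, 0, 1]   # misses positives on test data

train_m = quality_measures(y_train_true, y_train_pred)
test_m = quality_measures(y_test_true, y_test_pred)
gap = train_test_gap(train_m, test_m)

print("train:", train_m)
print("test: ", test_m)
print("gap:  ", gap)
```

In this toy case the classifier is perfect on the training labels but detects only half of the positive test reviews, which is the kind of pattern the abstract describes as overfitting, combined with weak detection of positive reviews.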

Share and Cite:

Kubrusly, J., Neves, A. and Marques, T. (2022) A Statistical Analysis of Textual E-Commerce Reviews Using Tree-Based Methods. Open Journal of Statistics, 12, 357-372. doi: 10.4236/ojs.2022.123023.

Copyright © 2024 by authors and Scientific Research Publishing Inc.

This work and the related PDF file are licensed under a Creative Commons Attribution 4.0 International License.