Open Access Library Journal

Volume 8, Issue 6 (June 2021)

ISSN Print: 2333-9705   ISSN Online: 2333-9721

Sentiment Analysis on Social Media for Albanian Language

PP. 1-31
DOI: 10.4236/oalib.1107514

ABSTRACT

Recent advances in technology, and in particular the rising prominence of social media platforms, have made it possible to express our emotions through electronic means, which has led to the creation of large collections of unstructured textual documents. These collections can be stored and studied with modern techniques such as Text Mining, Machine Learning and Natural Language Processing to obtain new knowledge from them. Sentiment Analysis is a field of Natural Language Processing that focuses on extracting sentiment from text; as a Text Mining technique, it makes it possible to track the subjective opinion expressed in a text produced by an entity. The purpose of this paper is to test and review different approaches to Sentiment Analysis for messages in the Albanian language found on Twitter. We also compare the results of the different methods, note the challenges that arise, and suggest directions for future research. The research was conducted as follows: the data was pre-processed and then converted from text to vector representations using a range of feature extraction techniques such as Bag-of-Words, TF-IDF, Word2Vec, and GloVe. We study the performance of sentiment classification techniques from three main approaches: traditional machine learning, lexicon-based, and deep learning. For model evaluation, since the models were trained on imbalanced data, we used not only classical evaluation criteria such as Accuracy, Specificity, Precision, and Recall but also more appropriate criteria such as F-measure, Balanced Accuracy, and Matthews Correlation Coefficient (MCC). According to all these criteria, our experiments revealed that an LSTM-based RNN with GloVe as the feature extraction technique provides the best results, with F-score = 87.8%, followed by Logistic Regression.
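
To illustrate why the abstract treats F-measure, Balanced Accuracy, and MCC as more appropriate than Accuracy on imbalanced data, the following minimal sketch computes all of the named criteria from confusion-matrix counts using their standard definitions. It is not the authors' evaluation code: the function name `binary_metrics`, the binary (positive/negative) setting, and the example counts are assumptions made purely for illustration.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Compute standard binary evaluation criteria from confusion-matrix counts.

    Hypothetical helper for illustration; ratios with a zero denominator
    are reported as 0.0 by convention.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0          # sensitivity
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    balanced_accuracy = (recall + specificity) / 2
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f1": f1,
        "balanced_accuracy": balanced_accuracy,
        "mcc": mcc,
    }


# Assumed toy data: 90 negative and 10 positive messages, and a degenerate
# classifier that labels every message as negative.
m = binary_metrics(tp=0, fp=0, tn=90, fn=10)
for name, value in m.items():
    print(f"{name}: {value:.3f}")
```

In this assumed example the degenerate classifier still reaches an Accuracy of 0.9, while its F-measure and MCC are 0 and its Balanced Accuracy is 0.5, which is the kind of gap that motivates reporting the additional criteria.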

Share and Cite:

Vasili, R., Xhina, E., Ninka, I. and Terpo, D. (2021) Sentiment Analysis on Social Media for Albanian Language. Open Access Library Journal, 8, 1-31. doi: 10.4236/oalib.1107514.

Copyright © 2025 by authors and Scientific Research Publishing Inc.

This work and the related PDF file are licensed under a Creative Commons Attribution 4.0 International License.