Analysis of data pre-processing methods for the sentiment analysis of reviews




Data pre-processing, feature selection, sentiment analysis, text classification


This study analyses the effects of data pre-processing methods on sentiment analysis and determines which methods, alone and in combination, are effective for English and for an agglutinative language such as Turkish. We also address the research question “Is there any difference between agglutinative and non-agglutinative languages in terms of pre-processing methods for sentiment analysis?” We find that performance on the English reviews is generally higher than on the Turkish reviews, a gap that we relate to differences between the two languages in vocabulary, writing style, and the agglutinative property of Turkish.




