Arabic sentence-level sentiment analysis

Bibliographic Details
Main Author: Shoukry, Amira Magdy
Format: Thesis
Published: AUC Knowledge Fountain 2013
Description
Summary: Sentiment analysis has recently become one of the growing areas of research in text mining and natural language processing. The increasing availability of online resources and the popularity of rich, fast channels for opinion sharing, such as news, online review sites, and personal blogs, have led several parties, including customers, companies, and governments, to start analyzing and exploring these opinions. The main task of sentiment classification is to classify a sentence (e.g., a review, blog post, comment, or news item) as holding an overall positive, negative, or neutral sentiment. Most current studies on this topic focus on English texts, with very limited resources available for other languages such as Arabic, especially the Egyptian dialect. In this research work, we aim to improve the performance of Egyptian-dialect sentence-level sentiment analysis by proposing a hybrid approach that combines the machine learning approach, using support vector machines, with the semantic orientation approach. Two methodologies were proposed, one for each approach, and then joined to create the hybrid approach. The corpus contains more than 20,000 Egyptian-dialect tweets collected from Twitter, of which 4,800 manually annotated tweets were used (1,600 positive, 1,600 negative, and 1,600 neutral). We performed several experiments to: 1) compare the results of each approach individually on the Egyptian dialect, before and after preprocessing; 2) compare the performance of the hybrid approach against that of each approach separately; and 3) evaluate the effect of handling negation on the performance of the hybrid approach.
The results show significant improvements in accuracy, precision, recall, and F-measure, indicating that our proposed hybrid approach is effective for sentence-level sentiment classification. The results are also promising and encourage further research in this direction.
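
As an illustration only, the semantic orientation side of such an approach, including negation handling, might be sketched as follows. This is a minimal sketch and not the thesis's actual method: the tiny English lexicon, the negator list, and the function names `orientation_score` and `classify` are all hypothetical placeholders standing in for an Egyptian-dialect lexicon and preprocessing pipeline. In a hybrid setup, a score like this could plausibly be supplied as an additional feature to a support vector machine classifier, but the summary does not state how the two approaches were joined.

```python
# Hypothetical toy lexicon (assumption): a real system would use an
# Egyptian-dialect sentiment lexicon rather than these English placeholders.
POSITIVE = {"good", "great", "love"}
NEGATIVE = {"bad", "awful", "hate"}
NEGATORS = {"not", "never"}


def orientation_score(tokens):
    """Sum word polarities; a word directly after a negator has its polarity flipped."""
    score = 0
    negate = False
    for tok in tokens:
        if tok in NEGATORS:
            negate = True
            continue
        # +1 for a positive word, -1 for a negative word, 0 otherwise.
        polarity = (tok in POSITIVE) - (tok in NEGATIVE)
        if negate:
            polarity = -polarity
        score += polarity
        # The negation only applies to the token immediately following it.
        negate = False
    return score


def classify(sentence):
    """Map the orientation score to one of the three sentiment classes."""
    score = orientation_score(sentence.lower().split())
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"
```

For example, `classify("not good")` yields `"negative"` because the negator flips the polarity of the word that follows it, while a sentence with no lexicon words is labeled `"neutral"`.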