Sarcasm identification on Twitter: A machine learning approach
Date
2017
Abstract
In recent years, the remarkable growth of social media and microblogging platforms has provided an essential source of information for identifying people's subjective information, such as opinions, sentiments and attitudes. Sentiment analysis is the process of identifying subjective information in source materials towards an entity. Much online social content contains nonliteral language, such as irony and sarcasm, which may degrade the performance of sentiment classification schemes. In sarcastic text, the literal meaning of the utterance and the intention of the person employing sarcasm can be completely opposite. In this paper, we present a machine learning approach to sarcasm identification. In this scheme, we utilized lexical, pragmatic, dictionary-based and part-of-speech features. We employed two kinds of features to describe lexical information: unigrams and bigrams. In addition, term-frequency, term-presence and TF-IDF based representations are evaluated. To evaluate the predictive performance of the different representation schemes, Naïve Bayes, support vector machines, logistic regression and k-nearest neighbor classifiers are utilized. © Springer International Publishing AG 2017.
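The three representation schemes named in the abstract (term-presence, term-frequency and TF-IDF over unigrams and bigrams) can be illustrated with a minimal sketch. The abstract does not specify tokenization or the exact TF-IDF weighting, so the whitespace tokenizer, the unsmoothed idf = log(N / df) formula, and the helper names `ngrams`, `extract_terms` and `build_representations` are all assumptions made here for illustration, not the authors' implementation.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return the contiguous n-grams of a token list as space-joined strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def extract_terms(text):
    """Lowercase, split on whitespace (an assumption), and emit unigrams plus bigrams."""
    tokens = text.lower().split()
    return ngrams(tokens, 1) + ngrams(tokens, 2)


def build_representations(docs):
    """Build term-presence, term-frequency and TF-IDF vectors for each document."""
    term_lists = [extract_terms(d) for d in docs]
    vocab = sorted({t for terms in term_lists for t in terms})
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(t for terms in term_lists for t in set(terms))
    # Assumed weighting: plain, unsmoothed idf = log(N / df).
    idf = {t: math.log(n_docs / df[t]) for t in vocab}

    presence, tf, tfidf = [], [], []
    for terms in term_lists:
        counts = Counter(terms)
        tf.append([counts[t] for t in vocab])
        presence.append([1 if counts[t] else 0 for t in vocab])
        tfidf.append([counts[t] * idf[t] for t in vocab])
    return vocab, presence, tf, tfidf


if __name__ == "__main__":
    # Hypothetical toy tweets, not taken from the paper's data set.
    vocab, presence, tf, tfidf = build_representations(
        ["great great service", "great weather"]
    )
    print(vocab)
    print(presence[0])
    print(tf[0])
    print(tfidf[0])
```

Each row of `presence`, `tf` or `tfidf` is a feature vector over the shared vocabulary, which is the form a classifier such as Naïve Bayes or k-nearest neighbor would take as input. A term that occurs in every document receives a TF-IDF weight of zero under the assumed formula, which is one way the three schemes can lead to different predictive performance.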
Keywords
Artificial intelligence, Intelligent systems, Nearest neighbor search, Social networking (online), Text processing, K-nearest neighbor classifier, Machine learning approaches, Micro-blogging platforms, Predictive performance, Representation schemes, Sentiment classification, Subjective information, Twitter, Learning systems