Sarcasm identification on Twitter: A machine learning approach

Date

2017

Abstract

In recent years, the remarkable growth of social media and microblogging platforms has provided an essential source of information for identifying people's subjective information, such as opinions, sentiments and attitudes. Sentiment analysis is the process of identifying subjective information toward an entity from source materials. Much of the social content online contains nonliteral language, such as irony and sarcasm, which may degrade the performance of sentiment classification schemes. In sarcastic text, the literal meaning of the utterance and the intention of the person employing sarcasm can be completely opposite. In this paper, we present a machine learning approach to sarcasm identification. In this scheme, we utilized lexical, pragmatic, dictionary-based and part-of-speech features. We employed two kinds of features to describe lexical information: unigrams and bigrams. In addition, term-frequency, term-presence and TF-IDF based representations are evaluated. To evaluate the predictive performance of the different representation schemes, Naïve Bayes, support vector machine, logistic regression and k-nearest neighbor classifiers are utilized. © Springer International Publishing AG 2017.
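The three lexical representations named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the whitespace tokenizer, the helper names (`ngrams`, `extract_terms`, `build_representations`), the toy sentences, and the textbook TF-IDF weighting tf × log(N / df) are all assumptions made here for illustration, and the paper may use a different weighting variant.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return the n-grams of a token list, each joined by a single space."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def extract_terms(text):
    """Lowercase, split on whitespace, and return unigrams followed by bigrams."""
    tokens = text.lower().split()  # assumed tokenizer; real tweets need more care
    return ngrams(tokens, 1) + ngrams(tokens, 2)


def build_representations(docs):
    """Return (vocab, tf, tp, tfidf); each matrix is a list of per-document rows."""
    term_lists = [extract_terms(d) for d in docs]
    vocab = sorted({t for terms in term_lists for t in terms})
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(t for terms in term_lists for t in set(terms))

    tf, tp, tfidf = [], [], []
    for terms in term_lists:
        counts = Counter(terms)
        tf_row = [counts[t] for t in vocab]             # term frequency
        tf.append(tf_row)
        tp.append([1 if c > 0 else 0 for c in tf_row])  # term presence
        # Assumed textbook weighting: tf * log(N / df).
        tfidf.append([c * math.log(n_docs / df[t]) for c, t in zip(tf_row, vocab)])
    return vocab, tf, tp, tfidf


# Toy, made-up inputs (not data from the paper).
docs = ["oh great another monday", "great great day"]
vocab, tf, tp, tfidf = build_representations(docs)
```

Each row of `tf`, `tp`, or `tfidf` could then be passed as a feature vector to any of the classifiers listed in the abstract. A term that occurs in every document, such as "great" here, receives a TF-IDF weight of zero under this weighting.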
