Satire Detection in Turkish News Articles: A Machine Learning Approach

Date

2019

Abstract

With advances in information and communication technologies, an immense amount of information is shared on social media and microblogging platforms. Much online content contains elements of figurative language, such as irony, sarcasm, and satire. The automatic identification of figurative language is a challenging task in natural language processing, where linguistic devices such as metaphor, analogy, ambiguity, irony, sarcasm, and satire are used to express more complex meanings. The predictive performance of sentiment classification schemes may degrade if figurative language within the text is not properly handled. Satire is a form of figurative communication in which ideas or opinions about a person, event, or issue are expressed humorously in order to criticize that entity. Satirical news can be deceptive and harmful. In this paper, we present a machine learning based approach to satire detection in Turkish news articles. In the presented scheme, we use three kinds of features to model lexical information, namely unigrams, bigrams, and trigrams. In addition, term-frequency, term-presence, and TF-IDF based schemes are considered. In the classification phase, Naïve Bayes, support vector machines, logistic regression, and C4.5 algorithms are examined. © 2019, Springer Nature Switzerland AG.
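
The lexical features and the three schemes named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the whitespace tokenizer, the function names `word_ngrams` and `weight_documents`, the toy English sentences, and the specific TF-IDF variant (raw count times log(N / document frequency)) are all assumptions made for illustration, since the abstract does not specify them.

```python
import math
from collections import Counter


def word_ngrams(text, n_min=1, n_max=3):
    """Return word n-grams (unigrams to trigrams by default).

    Assumption: lowercase + whitespace tokenization. Real Turkish text
    would likely need a proper tokenizer and morphological handling.
    """
    tokens = text.lower().split()
    grams = []
    for n in range(n_min, n_max + 1):
        for i in range(len(tokens) - n + 1):
            grams.append(" ".join(tokens[i:i + n]))
    return grams


def weight_documents(docs, scheme="tf"):
    """Map each document to a sparse {ngram: weight} dictionary.

    scheme: "tf"    -> raw term frequency
            "tp"    -> term presence (1 if the n-gram occurs)
            "tfidf" -> tf * log(N / df), one common TF-IDF variant (assumed)
    """
    counts = [Counter(word_ngrams(d)) for d in docs]
    n_docs = len(docs)

    # Document frequency: number of documents containing each n-gram.
    df = Counter()
    for c in counts:
        df.update(c.keys())

    weighted = []
    for c in counts:
        if scheme == "tf":
            vec = dict(c)
        elif scheme == "tp":
            vec = {g: 1 for g in c}
        elif scheme == "tfidf":
            vec = {g: tf * math.log(n_docs / df[g]) for g, tf in c.items()}
        else:
            raise ValueError(f"unknown scheme: {scheme}")
        weighted.append(vec)
    return weighted


# Hypothetical toy documents, used only to exercise the functions.
docs = [
    "the minister opened the bridge",
    "the bridge opened the minister",
]
tf_vectors = weight_documents(docs, "tf")
tp_vectors = weight_documents(docs, "tp")
tfidf_vectors = weight_documents(docs, "tfidf")
```

Vectors of this kind would then be passed to a classifier such as Naïve Bayes, a support vector machine, logistic regression, or a C4.5-style decision tree. Note that under this TF-IDF variant an n-gram occurring in every document receives weight zero, which is one reason the three schemes can yield different results.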