A Term Weighted Neural Language Model and Stacked Bidirectional LSTM Based Framework for Sarcasm Identification

Onan A.; Tocoglu M.A.

A Term Weighted Neural Language Model and Stacked Bidirectional LSTM Based Framework for Sarcasm Identification

Date

2021

Authors

Onan A.

Tocoglu M.A.

Publisher

Institute of Electrical and Electronics Engineers Inc.

Abstract

Sarcasm identification on text documents is one of the most challenging tasks in natural language processing (NLP), has become an essential research direction, due to its prevalence on social media data. The purpose of our research is to present an effective sarcasm identification framework on social media data by pursuing the paradigms of neural language models and deep neural networks. To represent text documents, we introduce inverse gravity moment based term weighted word embedding model with trigrams. In this way, critical words/terms have higher values by keeping the word-ordering information. In our model, we present a three-layer stacked bidirectional long short-term memory architecture to identify sarcastic text documents. For the evaluation task, the presented framework has been evaluated on three-sarcasm identification corpus. In the empirical analysis, three neural language models (i.e., word2vec, fastText and GloVe), two unsupervised term weighting functions (i.e., term-frequency, and TF-IDF) and eight supervised term weighting functions (i.e., odds ratio, relevance frequency, balanced distributional concentration, inverse question frequency-question frequency-inverse category frequency, short text weighting, inverse gravity moment, regularized entropy and inverse false negative-true positive-inverse category frequency) have been evaluated. For sarcasm identification task, the presented model yields promising results with a classification accuracy of 95.30%. © 2013 IEEE.