posted on 2018-02-21, 23:23 authored by Joshi Aditya Madhav Manda
Sarcasm is verbal irony that is intended to mock or ridicule. Existing sentiment analysis
systems show degraded performance on sarcastic text. Hence, computational
sarcasm has received attention from the sentiment analysis community.
Computational sarcasm refers to computational techniques that deal with
sarcastic text. This thesis presents our investigations in computational
sarcasm based on the linguistic notion of incongruity. For example, the
sentence `I love being ignored' is sarcastic because the positive word `love'
is incongruous with the negative phrase `being ignored'. These investigations
are divided into three parts: understanding the phenomenon of sarcasm, sarcasm
detection and sarcasm generation.
To first understand the phenomenon of sarcasm, we consider
two components of sarcasm: implied negative sentiment, and presence of a
target. To understand how implied negative sentiment plays a role in sarcasm
understanding, we present an annotation study which evaluates the quality of a
sarcasm-labeled dataset created by non-native annotators. Following this, in
order to show how the target of sarcasm is important for understanding sarcasm, we
first describe an annotation study which highlights the challenges in
distinguishing between sarcasm and irony (since irony does not have a target
while sarcasm does), and then present a computational approach that extracts
the target of a sarcastic text.
We then present our approaches for sarcasm detection. To
detect sarcasm, we capture incongruity in two ways: `intra-textual incongruity'
where we look at the incongruity within the text to be classified (i.e., target
text), and `context incongruity', where we incorporate information outside
the target text. To detect incongruity within the target text, we present four
approaches: (a) A classifier that captures sentiment incongruity using
sentiment-based features (as in the case of `I love being ignored'), (b) A
classifier that captures semantic incongruity (as in the case of `A woman needs
a man like a fish needs a bicycle') using word embedding-based features, (c) A
topic model that captures sentiment incongruity using sentiment distributions
in the text (in order to discover sarcasm-prevalent topics such as work,
college, etc.), and (d) An approach that captures incongruity in the language
model using sentence completion. The approaches in (a) and (c) incorporate
sentiment incongruity relying on sentiment-bearing words, whereas the approaches in
(b) and (d) tackle other forms of incongruity where sentiment-bearing words may
not be present.
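As an illustration of the sentiment incongruity idea in (a), the following is a minimal sketch and not the thesis implementation. It assumes a tiny hand-made lexicon (the `POSITIVE` and `NEGATIVE` sets are hypothetical) and a hypothetical function `sentiment_incongruity_features` that counts sentiment-bearing words and the number of polarity flips between consecutive sentiment-bearing words. Features of this kind could then be passed to a classifier.

```python
import re

# Hypothetical toy lexicon, for illustration only; a real system would
# use a full sentiment lexicon.
POSITIVE = {"love", "great", "enjoy", "wonderful"}
NEGATIVE = {"ignored", "hate", "terrible", "late"}


def sentiment_incongruity_features(text):
    """Return simple counts that hint at sentiment incongruity in `text`."""
    tokens = re.findall(r"[a-z']+", text.lower())

    # Keep only sentiment-bearing words, as +1 (positive) or -1 (negative).
    polarities = []
    for token in tokens:
        if token in POSITIVE:
            polarities.append(1)
        elif token in NEGATIVE:
            polarities.append(-1)

    # A flip is a positive word followed by a negative one, or vice versa.
    flips = sum(1 for a, b in zip(polarities, polarities[1:]) if a != b)

    return {
        "positive_words": polarities.count(1),
        "negative_words": polarities.count(-1),
        "polarity_flips": flips,
    }


features = sentiment_incongruity_features("I love being ignored")
```

For `I love being ignored', this sketch finds one positive word, one negative word, and one polarity flip. A sentence with no sentiment-bearing words yields all zeros, which is the case that approaches (b) and (d) are meant to address.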
On the other hand, to detect sarcasm using contextual
incongruity, we describe two approaches: (a) A rule-based approach that uses
historical text by an author to detect sarcasm in the text generated by them,
and (b) A statistical approach that uses sequence labeling techniques for
sarcasm detection in dialogue. The approach in (a) attempts to detect sarcasm
that requires author-specific context while that in (b) attempts to detect
sarcasm that requires conversation-specific context. Finally, we present a
technique for sarcasm generation. In this case, we use a template-based
approach to synthesize incongruity and generate a sarcastic response to user
input.
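The author-specific contextual idea in (a) can be illustrated with a small sketch. The abstract does not state the actual rule, so everything here is an assumption: the toy lexicon, the helper `surface_polarity`, and the rule in `contradicts_history`, which flags a text as possibly sarcastic when its surface sentiment is the opposite of the overall sentiment in the same author's earlier texts.

```python
import re

# Hypothetical toy lexicon, for illustration only.
POSITIVE = {"love", "great", "enjoy"}
NEGATIVE = {"hate", "terrible", "awful"}


def surface_polarity(text):
    """Return +1, -1, or 0 from a simple count of lexicon words."""
    tokens = re.findall(r"[a-z']+", text.lower())
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    return (score > 0) - (score < 0)


def contradicts_history(target_text, historical_texts):
    """Assumed rule: flag the target text when its surface sentiment
    opposes the overall sentiment of the author's historical texts."""
    now = surface_polarity(target_text)
    past_total = sum(surface_polarity(h) for h in historical_texts)
    past = (past_total > 0) - (past_total < 0)
    # Only flag when both signals are non-neutral and disagree.
    return now != 0 and past != 0 and now != past


history = ["I hate Mondays", "Mondays are terrible"]
flagged = contradicts_history("I love Mondays", history)
```

In this sketch, `I love Mondays' is flagged because the author's earlier texts are negative, while `I hate Mondays' from the same author would not be. With no historical text, the rule never fires, which reflects the dependence of this kind of approach on author-specific context.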
Our investigations demonstrate how evidence of incongruity
(such as sentiment incongruity, semantic incongruity, etc.) can be modeled
using different learning techniques (such as classifiers, topic models, etc.)
for sarcasm detection and sarcasm generation. In addition, our findings
establish the promise of novel problems like sarcasm target identification and
sarcasm versus irony classification, and provide insights for future research
in sarcasm detection.
Campus location
Australia
Principal supervisor
Mark Carman
Additional supervisor 1
Pushpak Bhattacharyya
Year of Award
2018
Department, School or Centre
Information Technology (Monash University Caulfield)
Additional Institution or Organisation
Indian Institute of Technology Bombay, India (IITB)