This repository hosts the Sarcasm SIGN dataset, a parallel corpus of sarcastic tweets and their non-sarcastic interpretations, written by human experts. The corpus was created as part of our paper, “Sarcasm SIGN: Interpreting Sarcasm with Sentiment Based Monolingual Machine Translation”, which will be presented at ACL 2017. The repository contains two folders: “corpus”, which holds the data files and the instructions given to our human experts; and “preprocess”, which holds code for preprocessing the data and preparing it for an MT system (see the ReadMe in the preprocess folder).
