Movie Review Data includes:
The MPQA Opinion Corpus is a database comprising five subsets of extensively (manually) annotated data. The annotations include:
Customer Reviews Datasets includes two sets of lightly tokenized reviews, for five and nine products respectively. Product features immediately precede a positive/negative rating tag (e.g. [+3]). Additional tags mark features that are absent from the sentence, a possible need for pronoun resolution, and whether the opinionated sentence is a comparison or a suggestion.
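Note to self: a minimal sketch of how such annotations could be read. The line layout (annotations, then `##`, then the sentence) and the single-letter flag `[u]` are my assumptions about the files, and `parse_line` is just a hypothetical helper name, so check against the dataset's readme before relying on it.

```python
import re

# Assumed layout: "feature[+2], other feature[-1][u]##sentence text"
ANNOTATION = re.compile(r"\s*([^\[\],]+?)\s*\[([+-]\d)\]((?:\[\w+\])*)")

def parse_line(line):
    """Split one annotated line into (features, sentence)."""
    head, sep, sentence = line.partition("##")
    if not sep:
        # No separator found: treat the whole line as plain text.
        return [], line.strip()
    features = []
    for name, rating, flags in ANNOTATION.findall(head):
        features.append({
            "feature": name,
            "rating": int(rating),                      # e.g. "+3" -> 3
            "flags": re.findall(r"\[(\w+)\]", flags),   # extra tags, if any
        })
    return features, sentence.strip()

features, sentence = parse_line(
    "picture quality[+3], battery[-2][u]##The pictures are great but it dies fast."
)
print(features)
print(sentence)
```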
Extracting Product Features and Opinions from Reviews (Popescu & Etzioni, 2005) brings some interesting ideas at the intersection of SA and parsing. Rather than identifying windows for product features by counting k words surrounding a feature, they extract the dependencies between features and the surrounding words. This allows them to use a set of domain-independent rules for the extraction of potential opinion phrases:
It is not always possible to determine exactly what a phrase modifier actually modifies (think of ambiguous PP attachment), and that is exactly the kind of ambiguity syntactic parsing struggles with.
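To make the contrast with k-word windows concrete, here is a toy sketch. It is not the paper's system: the dependency edges are written by hand rather than produced by a parser, and `window_context` and `dependency_context` are hypothetical helper names.

```python
def window_context(tokens, feature_index, k):
    """Words within k positions of the feature, ignoring syntax."""
    start = max(0, feature_index - k)
    return tokens[start:feature_index] + tokens[feature_index + 1:feature_index + 1 + k]

def dependency_context(tokens, edges, feature_index):
    """Words directly linked to the feature by a dependency edge."""
    linked = []
    for head, dependent in edges:
        if head == feature_index:
            linked.append(tokens[dependent])
        elif dependent == feature_index:
            linked.append(tokens[head])
    return linked

tokens = ["the", "scanner", "i", "bought", "last", "week", "is", "great"]
feature = 1  # "scanner"

# Hand-written (head, dependent) index pairs, standing in for parser output.
edges = [(7, 1), (1, 0), (1, 3), (3, 2), (3, 5), (5, 4), (7, 6)]

print(window_context(tokens, feature, k=2))
print(dependency_context(tokens, edges, feature))
```

With a window of two words the opinion word "great" is out of reach, while the dependency link between "scanner" and "great" finds it regardless of distance.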
Some thoughts inspired by this stellar paper.
Classifying information based on sentiment and opinion is the aim of the ongoing research on Sentiment Analysis (SA) and Opinion Mining (OM). Treating concepts like “good” and “bad” as first-class objects presents new challenges to the NLP community: on the one hand, heuristics that proved successful for other feats of classification simply cannot be applied to this kind of information. On the other, the challenges of discerning opinion come bundled with those of processing user-generated content.
There are several reasons to investigate the “[…] computational treatment of opinion, sentiment and subjectivity”. Improved search accuracy and recommendation systems are obvious candidates - users want to know how others feel about certain objects or services. Corporations are eager to collect intelligence on how their products are perceived. Classifying information as “hateful” might help governments prevent crime. Summarization can certainly benefit from isolating subjective paragraphs to either present or omit them. Classifying opinions as liberal or conservative allows for more accurate predictions on the outcomes of elections. Intelligent systems could boost their performance by adapting to the mood of the user.
One of the first to undertake this task was Peter D. Turney, as described in his article “Thumbs Up or Thumbs Down? Semantic Orientation Applied to Unsupervised Classification of Reviews”.
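He does it by using Pointwise Mutual Information (PMI) to calculate the Semantic Orientation (SO) of phrases in a review:

$$\mathrm{PMI}(w_1, w_2) = \log_2 \frac{p(w_1 \,\&\, w_2)}{p(w_1)\,p(w_2)} \qquad (1)$$

$$\mathrm{SO}(\mathit{phrase}) = \mathrm{PMI}(\mathit{phrase}, \text{``excellent''}) - \mathrm{PMI}(\mathit{phrase}, \text{``poor''}) \qquad (2)$$

The PMI is estimated by issuing queries to a search engine and gets promoted to PMI-IR - the estimate of SO derived from (1) and (2) is

$$\mathrm{SO}(\mathit{phrase}) = \log_2 \frac{\mathrm{hits}(\mathit{phrase}\ \mathrm{NEAR}\ \text{``excellent''})\ \mathrm{hits}(\text{``poor''})}{\mathrm{hits}(\mathit{phrase}\ \mathrm{NEAR}\ \text{``poor''})\ \mathrm{hits}(\text{``excellent''})}$$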
The review is classified by averaging the SO of its phrases: a positive average means recommended, a negative one not recommended.
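As a rough sketch of the arithmetic, assuming the hit counts have already been fetched from a search engine: the function names are hypothetical, the counts in the example are made up, and the small `smoothing` constant (to avoid dividing by zero) is my assumption about how to handle phrases with no hits.

```python
import math

def semantic_orientation(hits_near_pos, hits_near_neg, hits_pos, hits_neg, smoothing=0.01):
    """Estimate SO(phrase) from hit counts.

    hits_near_pos: hits for the phrase NEAR the positive reference word
    hits_near_neg: hits for the phrase NEAR the negative reference word
    hits_pos, hits_neg: hits for the reference words on their own
    """
    numerator = (hits_near_pos + smoothing) * hits_neg
    denominator = (hits_near_neg + smoothing) * hits_pos
    return math.log2(numerator / denominator)

def classify_review(phrase_orientations):
    """Average the SO of all phrases in a review and threshold at zero."""
    if not phrase_orientations:
        raise ValueError("review has no scored phrases")
    average = sum(phrase_orientations) / len(phrase_orientations)
    return "recommended" if average > 0 else "not recommended"

# Made-up hit counts for two phrases of one review.
scores = [
    semantic_orientation(400, 100, 1000, 1000),
    semantic_orientation(50, 80, 1000, 1000),
]
print(scores, classify_review(scores))
```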
The results prompt interesting reflections on the challenges of sentiment analysis. The classifier does a good job with automobile reviews and a poor one with movie reviews. This might be because the quality of a car seems to be a function of the quality of its parts, yet a good director does not necessarily make for a good movie. Turney points us to examples such as this one:
Well as usual Keanu Reeves is nothing special, but surprisingly, the very talented Laurence Fishbourne is not so good either, I was surprised.
This sentence does not contain any word that is in itself easily associated with negative sentiment. To correctly classify this review we have to rely on our understanding of the semantic dependence of words like “special” on polarity reversers like “nothing”. Turney’s classifier is misled by “very talented” and classifies the review as positive.
It is naive to think that OM can be done by keywords alone: sentiment is a subtle phenomenon. Positive features can be apparent to a human and obscure to a classifier, and vice versa (the feature “still” has been found to be a good indicator of positivity - so you can run and tell that, luddites!). Term presence has been shown to be more significant than term frequency; a certain token in a certain position can affect the overall subjectivity status of the text. Parts of speech such as adjectives are strongly correlated with subjective content; patterns can be useful to disambiguate opinion.
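A toy illustration of the reverser problem, to be clear not Turney's classifier: a three-word lexicon I made up, a naive keyword sum, and a variant that flips the polarity of words falling within a fixed scope after a reverser. The `scope` of two tokens is an arbitrary assumption.

```python
import re

LEXICON = {"special": 1, "talented": 1, "good": 1}  # hypothetical toy lexicon
REVERSERS = {"nothing", "not", "never", "no"}

def tokenize(text):
    return re.findall(r"[a-z']+", text.lower())

def keyword_score(tokens):
    """Sum word polarities, ignoring context."""
    return sum(LEXICON.get(token, 0) for token in tokens)

def reversed_score(tokens, scope=2):
    """Flip the polarity of words within `scope` tokens after a reverser."""
    score = 0
    remaining = 0  # tokens still inside the current reverser's scope
    for token in tokens:
        if token in REVERSERS:
            remaining = scope
            continue
        polarity = LEXICON.get(token, 0)
        score += -polarity if remaining > 0 else polarity
        remaining = max(0, remaining - 1)
    return score

review = ("Well as usual Keanu Reeves is nothing special, but surprisingly, "
          "the very talented Laurence Fishbourne is not so good either, I was surprised.")
tokens = tokenize(review)
print(keyword_score(tokens))   # 3: every keyword counts as positive
print(reversed_score(tokens))  # -1: "nothing special" and "not so good" are flipped
```

A fixed scope is crude (it would also flip words that the reverser does not actually govern), which is one more argument for looking at dependencies rather than distances.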
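For my own reference, the difference between frequency and presence features in a bag-of-words setting, as a minimal sketch with hypothetical helper names:

```python
from collections import Counter

def frequency_features(tokens):
    """Map each term to the number of times it occurs."""
    return dict(Counter(tokens))

def presence_features(tokens):
    """Map each term to 1 if it occurs at all."""
    return {token: 1 for token in set(tokens)}

tokens = "good good good plot bad".split()
print(frequency_features(tokens))
print(presence_features(tokens))
```

With presence features, repeating "good" three times carries no more weight than saying it once.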
I decided to tumblog short notes on my MA thesis. This blog will be a digest of related literature and experiments, mostly in the form of “notes to self”. If your interests intersect with language technology and computer science, please be gracious about the approximations contained within it: I aim solely to track my own progress and to force myself to write in a presentable manner. I will try to keep the (phdcomics|xkcd)-linking to the bare essentials.