Classifying Sentiment in Microblogs: is Brevity an Advantage?
- In Proc. of CIKM, 2010
Cited by 50 (1 self)
Microblogs as a new textual domain offer a unique proposition for sentiment analysis. Their short document length suggests any sentiment they contain is compact and explicit. However, this short length coupled with their noisy nature can pose difficulties for standard machine learning document representations. In this work we examine the hypothesis that it is easier to classify the sentiment in these short form documents than in longer form documents. Surprisingly, we find classifying sentiment in microblogs easier than in blogs and make a number of observations pertaining to the challenge of supervised learning for sentiment analysis in microblogs.
Generalizing Dependency Features for Opinion Mining
Cited by 33 (1 self)
We explore how features based on syntactic dependency relations can be utilized to improve performance on opinion mining. Using a transformation of dependency relation triples, we convert them into “composite back-off features” that generalize better than the regular lexicalized dependency relation features. Experiments comparing our approach with several other approaches that generalize dependency features or n-grams demonstrate the utility of composite back-off features.
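The abstract above does not spell out the transformation, so the following is only a minimal sketch. It assumes that a composite back-off feature is formed by replacing exactly one word of a (relation, head, modifier) triple with its part-of-speech tag. The function name, the feature string layout, and the `pos_tags` lookup are hypothetical and chosen for illustration.

```python
from typing import Dict, List, Tuple

# A dependency triple: (relation, head_word, modifier_word)
Triple = Tuple[str, str, str]


def composite_backoff_features(triple: Triple, pos: Dict[str, str]) -> List[str]:
    """Return two features, each with one word of the triple replaced by its POS tag.

    Assumption: this is one plausible reading of "composite back-off",
    not necessarily the exact scheme used in the paper.
    """
    rel, head, mod = triple
    return [
        f"{rel}_{pos[head]}_{mod}",  # head word backed off to its POS tag
        f"{rel}_{head}_{pos[mod]}",  # modifier word backed off to its POS tag
    ]


# Hypothetical POS lookup for a single example triple
pos_tags = {"camera": "NN", "great": "JJ"}
features = composite_backoff_features(("amod", "camera", "great"), pos_tags)
print(features)
```

Under this assumption, a feature such as `amod_NN_great` fires for any noun modified by "great", which is the kind of generalization over fully lexicalized triples that the abstract describes.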
Latent variable models for semantic orientation of phrases
, 2006
Cited by 18 (0 self)
We propose models for semantic orientations of phrases as well as classification methods based on the models. Although each phrase consists of multiple words, the semantic orientation of the phrase is not a mere sum of the orientations of the component words. Some words can invert the orientation. In order to capture the property of such phrases, we introduce latent variables into the models. Through experiments, we show that the proposed latent variable models work well in the classification of semantic orientations of phrases and achieved nearly 82% classification accuracy.
Learning to shift the polarity of words for sentiment classification
- In Proceedings of the Third International Joint Conference on Natural Language Processing (IJCNLP), 2008
Cited by 12 (0 self)
We propose a machine learning based method of sentiment classification of sentences using word-level polarity. The polarities of words in a sentence are not always the same as that of the sentence, because there can be polarity-shifters such as negation expressions. The proposed method models the polarity-shifters. Our model can be trained in two different ways: word-wise and sentence-wise learning. In sentence-wise learning, the model can be trained so that the prediction of sentence polarities should be accurate. The model can also be combined with features used in previous work such as bag-of-words and n-grams. We empirically show that our method almost always improves the performance of sentiment classification of sentences, especially when we have only a small amount of training data.
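To make the notion of a polarity-shifter concrete, here is a deliberately simplified, rule-based sketch. It is not the learned model the abstract describes. It assumes a tiny hand-written lexicon and negator list (`LEXICON` and `NEGATORS`, both hypothetical) and flips the polarity of the next sentiment word after a negator.

```python
# Hypothetical resources for illustration only
NEGATORS = {"not", "never", "no"}
LEXICON = {"good": 1, "great": 1, "bad": -1, "boring": -1}


def shifted_polarities(tokens):
    """Return (word, polarity) pairs, flipping the first lexicon word after a negator."""
    result = []
    flip = False
    for tok in tokens:
        word = tok.lower()
        if word in NEGATORS:
            flip = True  # stays active until the next lexicon word is seen
            continue
        if word in LEXICON:
            score = LEXICON[word]
            result.append((word, -score if flip else score))
            flip = False
    return result


def sentence_polarity(tokens):
    """Sign of the summed word polarities: 1, -1, or 0."""
    total = sum(score for _, score in shifted_polarities(tokens))
    return (total > 0) - (total < 0)


example = sentence_polarity("this movie is not good".split())
print(example)
```

A fixed rule like this cannot decide which expressions act as shifters or how far their scope extends, which is the part the abstract proposes to learn from data.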
Discovering the Discriminative Views: Measuring Term Weights for Sentiment Analysis
Cited by 7 (0 self)
This paper describes an approach to utilizing term weights for sentiment analysis tasks and shows how various term weighting schemes improve the performance of sentiment analysis systems. Previously, sentiment analysis was mostly studied under data-driven and lexicon-based frameworks. Such work generally exploits textual features for fact-based analysis tasks or lexical indicators from a sentiment lexicon. We propose to model term weighting into a sentiment analysis system utilizing collection statistics, contextual and topic-related characteristics as well as opinion-related properties. Experiments carried out on various datasets show that our approach effectively improves previous methods.
Opinion Mining from Web documents: Extraction and Structurization
, 2007
Cited by 6 (2 self)
This dissertation deals with the task of extracting customer opinions from web documents. This task is the key component of opinion mining, which allows Web users to retrieve and summarize people’s opinions scattered over Web documents. Our aim is to develop a method for extracting opinions that represent evaluations of consumer products in a structured form. In this dissertation, we approach opinion extraction by addressing the following two unexplored issues: how to define the task of opinion extraction and how to extract the structured opinions. Based on a corpus study, we define an opinion unit consisting of a quadruple, that is, the opinion holder, the subject being evaluated (Subject), the part or the attribute in which it is evaluated (Aspect), and the evaluation that expresses a positive or negative assessment (Evaluation). We use this definition as a basis for our opinion extraction task. For the second issue, we divide this task into two subtasks: (a) extracting relations between subjects/aspects and evaluations, and (b) extracting relations between subjects/aspects and aspects. Firstly, we consider the approach to extract these relations using a list of expressions which possibly describe subjects, aspects or evaluations. We propose a semi-automatic method for collecting aspect/evaluation expressions, which uses particular co-occurrence patterns of subjects, aspects and evaluations. Our semi-automatic method can collect these
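The opinion unit defined in the abstract above (holder, Subject, Aspect, Evaluation) maps naturally onto a small record type. The sketch below is an illustration only: the class name, field names, and the example values are hypothetical, and treating the holder and aspect as optional is an assumption not stated in the abstract.

```python
from dataclasses import astuple, dataclass
from typing import Optional


@dataclass(frozen=True)
class OpinionUnit:
    """One extracted opinion as a (holder, subject, aspect, evaluation) quadruple."""
    holder: Optional[str]   # who expresses the opinion (assumed optional)
    subject: str            # the product or entity being evaluated
    aspect: Optional[str]   # the part or attribute being evaluated (assumed optional)
    evaluation: str         # the expression carrying the positive or negative assessment


# Hypothetical example sentence: "I think the battery life of camera X is excellent."
unit = OpinionUnit(
    holder="writer",
    subject="camera X",
    aspect="battery life",
    evaluation="excellent",
)
print(astuple(unit))
```

Keeping the record immutable (`frozen=True`) makes extracted units hashable, so duplicates collected from several documents can be merged with a plain `set`.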
SENTIMENT CLASSIFICATION OF MOVIE REVIEWS USING LINGUISTIC PARSING
Cited by 5 (0 self)
The problem of sentiment analysis requires a deeper understanding of the English language than previously established techniques in the field obtain. The Linguistic Tree Transformation Algorithm is introduced as a method to exploit the syntactical dependencies between words in a sentence and to disambiguate word senses. The algorithm is tested against the established Pang/Lee dataset and a new list of Roger Ebert reviews. A new method of objective sentence removal is also introduced to improve established methods of sentiment analysis against full reviews with no user extraction of objective sentences.
Identifying high-impact sub-structures for convolution kernels in document-level sentiment classification
- In Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), ACL ’12, 2012
Cited by 3 (1 self)
Convolution kernels support the modeling of complex syntactic information in machine learning tasks. However, such models are highly sensitive to the type and size of syntactic structure used. It is therefore an important challenge to automatically identify high impact sub-structures relevant to a given task. In this paper we present a systematic study investigating (combinations of) sequence and convolution kernels using different types of sub-structures in document-level sentiment classification. We show that minimal sub-structures extracted from constituency and dependency trees guided by a polarity lexicon yield a 1.45 point absolute improvement in accuracy over a bag-of-words classifier on a widely used sentiment corpus.
The Haves and the Have-Nots: Leveraging Unlabelled Corpora for Sentiment Analysis
Cited by 3 (1 self)
Expensive feature engineering based on WordNet senses has been shown to be useful for document level sentiment classification. A plausible reason for such a performance improvement is the reduction in data sparsity. However, such a reduction could be achieved with less effort by means of syntagma based word clustering. In this paper, the problem of data sparsity in sentiment analysis, both monolingual and cross-lingual, is addressed by means of clustering. Experiments show that cluster based data sparsity reduction leads to better performance than sense based classification for sentiment analysis at document level. A similar idea is applied to Cross Lingual Sentiment Analysis (CLSA), and it is shown that reduction in data sparsity (after translation or bilingual-mapping) produces accuracy higher than Machine Translation based CLSA and sense based CLSA.