Results 1 - 10 of 27
Movie review mining and summarization
- In Proceedings of the International Conference on Information and Knowledge Management (CIKM), 2006
Cited by 37 (1 self)
With the flourish of the Web, online review is becoming a more and more useful and important information resource for people. As a result, automatic review mining and summarization has become a hot research topic recently. Different from traditional text summarization, review mining and summarization aims at extracting the features on which the reviewers express their opinions and determining whether the opinions are positive or negative. In this paper, we focus on a specific domain – movie review. A multi-knowledge based approach is proposed, which integrates WordNet, statistical analysis and movie knowledge. The experimental results show the effectiveness of the proposed approach in movie review mining and summarization.
Emotions from text: Machine learning for text-based emotion prediction
- In Proceedings of HLT/EMNLP, 2005
Cited by 28 (0 self)
In addition to information, text contains attitudinal, and more specifically, emotional content. This paper explores the text-based emotion prediction problem empirically, using supervised machine learning with the SNoW learning architecture. The goal is to classify the emotional affinity of sentences in the narrative domain of children’s fairy tales, for subsequent usage in appropriate expressive rendering of text-to-speech synthesis. Initial experiments on a preliminary data set of 22 fairy tales show encouraging results over a naïve baseline and BOW approach for classification of emotional versus non-emotional contents, with some dependency on parameter tuning. We also discuss results for a tripartite model which covers emotional valence, as well as feature set alternations. In addition, we present plans for a more cognitively sound sequential model, taking into consideration a larger set of basic emotions.
Which side are you on?: identifying perspectives at the document and sentence levels
- In Proceedings of the Tenth Conference on Computational Natural Language Learning, 2006
Cited by 26 (3 self)
In this paper we investigate a new problem of identifying the perspective from which a document is written. By perspective we mean a point of view, for example, from the perspective of Democrats or Republicans. Can computers learn to identify the perspective of a document? Not every sentence is written strongly from a perspective. Can computers learn to identify which sentences strongly convey a particular perspective? We develop statistical models to capture how perspectives are expressed at the document and sentence levels, and evaluate the proposed models on articles about the Israeli-Palestinian conflict. The results show that the proposed models successfully learn how perspectives are reflected in word usage and can identify the perspective of a document with high accuracy.
Feature Subsumption for Opinion Analysis
- In Proceedings of the 2006 Conference on Empirical Methods in Natural Language Processing (EMNLP-06), 2006
Cited by 18 (1 self)
Lexical features are key to many approaches to sentiment analysis and opinion detection. A variety of representations have been used, including single words, multi-word Ngrams, phrases, and lexicosyntactic patterns. In this paper, we use a subsumption hierarchy to formally define different types of lexical features and their relationship to one another, both in terms of representational coverage and performance. We use the subsumption hierarchy in two ways: (1) as an analytic tool to automatically identify complex features that outperform simpler features, and (2) to reduce a feature set by removing unnecessary features. We show that reducing the feature set improves performance on three opinion classification tasks, especially when combined with traditional feature selection.
Using bilingual knowledge and ensemble techniques for unsupervised Chinese sentiment analysis
- In Proceedings of EMNLP, 2008
Cited by 14 (2 self)
It is a challenging task to identify sentiment polarity of Chinese reviews because the resources for Chinese sentiment analysis are limited. Instead of leveraging only monolingual Chinese knowledge, this study proposes a novel approach to leverage reliable English resources to improve Chinese sentiment analysis. Rather than simply projecting English resources onto Chinese resources, our approach first translates Chinese reviews into English reviews by machine translation services, and then identifies the sentiment polarity of English reviews by directly leveraging English resources. Furthermore, our approach performs sentiment analysis for both Chinese reviews and English reviews, and then uses ensemble methods to combine the individual analysis results. Experimental results on a dataset of 886 Chinese product reviews demonstrate the effectiveness of the proposed approach. The individual analysis of the translated English reviews outperforms the individual analysis of the original Chinese reviews, and the combination of the individual analysis results further improves the performance.
A preliminary investigation into sentiment analysis of informal political discourse
- AAAI Symposium on Computational Approaches to Analysing Weblogs (AAAI-CAAW), 2006
Cited by 12 (2 self)
With the rise of weblogs and the increasing tendency of online publications to turn to message-board style reader feedback venues, informal political discourse is becoming an important feature of the intellectual landscape of the Internet, creating a challenging and worthwhile area for experimentation in techniques for sentiment analysis. We describe preliminary statistical tests on a new dataset of political discussion group postings which indicate that posts made in direct response to other posts in a thread have a strong tendency to represent an opposing political viewpoint to the original post. We conclude that traditional text classification methods will be inadequate to the task of sentiment analysis in this domain, and that progress is to be made by exploiting information about how posters interact with each other.
Lexicon-Based Methods for Sentiment Analysis
Cited by 12 (1 self)
We present a lexicon-based approach to extracting sentiment from text. The Semantic Orientation CALculator (SO-CAL) uses dictionaries of words annotated with their semantic orientation (polarity and strength), and incorporates intensification and negation. SO-CAL is applied to the polarity classification task, the process of assigning a positive or negative label to a text that captures the text’s opinion towards its main subject matter. We show that SO-CAL’s performance is consistent across domains and in completely unseen data. Additionally, we describe the process of dictionary creation, and our use of Mechanical Turk to check dictionaries for consistency and reliability.
Are these documents written from different perspectives? A test of different perspectives based on statistical distribution divergence
- In Proceedings of ACL 2006, 2006
Cited by 7 (0 self)
In this paper we investigate how to automatically determine if two document collections are written from different perspectives. By perspectives we mean a point of view, for example, from the perspective of Democrats or Republicans. We propose a test of different perspectives based on distribution divergence between the statistical models of two collections. Experimental results show that the test can successfully distinguish document collections of different perspectives from other types of collections.
Co-Training for Cross-Lingual Sentiment Classification
Cited by 7 (0 self)
The lack of Chinese sentiment corpora limits the research progress on Chinese sentiment classification. However, there are many freely available English sentiment corpora on the Web. This paper focuses on the problem of cross-lingual sentiment classification, which leverages an available English corpus for Chinese sentiment classification by using the English corpus as training data. Machine translation services are used for eliminating the language gap between the training set and test set, and English features and Chinese features are considered as two independent views of the classification problem. We propose a co-training approach to making use of unlabeled Chinese data. Experimental results show the effectiveness of the proposed approach, which can outperform the standard inductive classifiers and the transductive classifiers.
Examining the Role of Linguistic Knowledge Sources in the Automatic Identification and Classification of Reviews
Cited by 7 (0 self)
This paper examines two problems in document-level sentiment analysis: (1) determining whether a given document is a review or not, and (2) classifying the polarity of a review as positive or negative. We first demonstrate that review identification can be performed with high accuracy using only unigrams as features. We then examine the role of four types of simple linguistic knowledge sources in a polarity classification system.

