Results 1 - 10 of 25
Thumbs up? Sentiment Classification using Machine Learning Techniques
- In Proceedings of EMNLP, 2002
Cited by 377 (4 self)
We consider the problem of classifying documents not by topic, but by overall sentiment, e.g., determining whether a review is positive or negative. Using movie reviews as data, we find that standard machine learning techniques definitively outperform human-produced baselines. However, the three machine learning methods we employed (Naive Bayes, maximum entropy classification, and support vector machines) do not perform as well on sentiment classification as on traditional topic-based categorization. We conclude by examining factors that make the sentiment classification problem more challenging.
Mining the Peanut Gallery: Opinion Extraction and Semantic Classification of Product Reviews
- 2003
Cited by 204 (0 self)
The web contains a wealth of product reviews, but sifting through them is a daunting task. Ideally, an opinion mining tool would process a set of search results for a given item, generating a list of product attributes (quality, features, etc.) and aggregating opinions about each of them (poor, mixed, good). We begin by identifying the unique properties of this problem and develop a method for automatically distinguishing between positive and negative reviews. Our classifier draws on information retrieval techniques for feature extraction and scoring, and the results for various metrics and heuristics vary depending on the testing situation. The best methods work as well as or better than traditional machine learning. When operating on individual sentences collected from web searches, performance is limited due to noise and ambiguity. But in the context of a complete web-based tool and aided by a simple method for grouping sentences into attributes, the results are qualitatively quite useful.
Opinion Mining and Sentiment Analysis
Cited by 149 (3 self)
An important part of our information-gathering behavior has always been to find out what other people think. With the growing availability and popularity of opinion-rich resources such as online review sites and personal blogs, new opportunities and challenges arise as people now can, and do, actively use information technologies to seek out and understand the opinions of others. The sudden eruption of activity in the area of opinion mining and sentiment analysis, which deals with the computational treatment of opinion, sentiment, and subjectivity in text, has thus occurred at least in part as a direct response to the surge of interest in new systems that deal directly with opinions as a first-class object. This survey covers techniques and approaches that promise to directly enable opinion-oriented information-seeking systems. Our focus is on methods that seek to address the new challenges raised by sentiment-aware applications, as compared to those that are already present in more traditional fact-based analysis. We include material on summarization of evaluative text and on broader issues regarding privacy, manipulation, and economic impact that the development of opinion-oriented information-access services gives rise to. To facilitate future work, a discussion of available resources, benchmark datasets, and evaluation campaigns is also provided.
Annotating expressions of opinions and emotions in language
- Language Resources and Evaluation (formerly Computers and the Humanities), 2005
Cited by 90 (13 self)
This paper describes a corpus annotation project to study issues in the manual annotation of opinions, emotions, sentiments, speculations, evaluations and other private states in language. The resulting corpus annotation scheme is described, as well as examples of its use. In addition, the manual annotation process and the results of an inter-annotator agreement study on a 10,000-sentence corpus of articles drawn from the world press are presented.
Learning Subjective Nouns Using Extraction Pattern Bootstrapping
- 2003
Cited by 89 (5 self)
We explore the idea of creating a subjectivity classifier that uses lists of subjective nouns learned by bootstrapping algorithms. The goal of our research is to develop a system that can distinguish subjective sentences from objective sentences. First, we use two bootstrapping algorithms that exploit extraction patterns to learn sets of subjective nouns. Then we train a Naive Bayes classifier using the subjective nouns, discourse features, and subjectivity clues identified in prior research. The bootstrapping algorithms learned over 1000 subjective nouns, and the subjectivity classifier performed well, achieving 77% recall with 81% precision.
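The abstract above reports precision and recall separately. As a reading aid, the two figures can be combined into a single F-measure. The sketch below is not taken from the paper: the function name `f_measure` and the choice of the balanced (beta = 1) harmonic mean are assumptions made for illustration.

```python
def f_measure(precision: float, recall: float, beta: float = 1.0) -> float:
    """Weighted harmonic mean of precision and recall (F-beta).

    Hypothetical helper for illustration; the abstract reports only
    precision and recall, not an F-measure.
    """
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Figures quoted in the abstract: 81% precision, 77% recall.
f1 = f_measure(precision=0.81, recall=0.77)
print(round(f1, 3))  # 0.789
```

With beta = 1 both quantities are weighted equally, so the result sits between the two reported values, slightly closer to the lower one.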
Creating Subjective and Objective Sentence Classifiers from Unannotated Texts
- Intelligent Text Processing (CICLING-05), 2005
Cited by 63 (5 self)
This paper presents the results of developing subjectivity classifiers using only unannotated texts for training. The performance rivals that of previous supervised learning approaches. In addition, we advance the state of the art in objective sentence classification by learning extraction patterns associated with objectivity and creating objective classifiers that achieve substantially higher recall than previous work with comparable precision.
Sentiment classification on customer feedback data: noisy data, large feature vectors, and the role of linguistic analysis
- In COLING, 2005
Cited by 25 (0 self)
We demonstrate that it is possible to perform automatic sentiment classification in the very noisy domain of customer feedback data. We show that by using large feature vectors in combination with feature reduction, we can train linear support vector machines that achieve high classification accuracy on data that present classification challenges even for a human annotator. We also show that, surprisingly, the addition of deep linguistic analysis features to a set of surface level word n-gram features contributes consistently to classification accuracy in this domain.
Towards a robust metric of opinion
- In AAAI Spring Symposium on Exploring Attitude and Affect in Text, 2004
Cited by 19 (0 self)
This paper describes an automated system for detecting polar expressions about a topic of interest. The two elementary components of this approach are a shallow NLP polar language extraction system and a machine learning based topic classifier. These components are composed together by making a simple but accurate collocation assumption: if a topical sentence contains polar language, the system predicts that the polar language is reflective of the topic, and not some other subject matter. We evaluate our system, components and assumption on a corpus of online consumer messages. Based on these components, we discuss how to measure the overall sentiment about a particular topic as expressed in online messages authored by many different people. We propose to use the fundamentals of Bayesian statistics to form an aggregate authorial opinion metric. This metric would propagate uncertainties introduced by the polarity and topic modules to facilitate statistically valid comparisons of opinion across multiple topics.
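The collocation assumption described in the abstract above can be sketched in a few lines. This is only an illustration under stated assumptions: the paper's actual components are a shallow NLP polar language extractor and a machine-learned topic classifier, whereas the toy lexicon `POLAR_WORDS`, the keyword test `is_topical`, and the function `topical_polarity` below are hypothetical stand-ins.

```python
import re
from typing import Dict, List, Set, Tuple

# Toy polarity lexicon (assumption; stands in for the polar language extractor).
POLAR_WORDS: Dict[str, int] = {"great": 1, "love": 1, "terrible": -1, "broken": -1}


def tokens(sentence: str) -> List[str]:
    """Lowercase word tokens; a deliberately crude tokenizer."""
    return re.findall(r"[a-z']+", sentence.lower())


def is_topical(sentence: str, topic_terms: Set[str]) -> bool:
    """Keyword match (assumption; stands in for the learned topic classifier)."""
    return any(tok in topic_terms for tok in tokens(sentence))


def polar_score(sentence: str) -> int:
    """Sum of lexicon polarities for the words in the sentence."""
    return sum(POLAR_WORDS.get(tok, 0) for tok in tokens(sentence))


def topical_polarity(sentences: List[str], topic_terms: Set[str]) -> List[Tuple[str, str]]:
    """Collocation assumption: polar language in a topical sentence
    is attributed to the topic."""
    results = []
    for sentence in sentences:
        if not is_topical(sentence, topic_terms):
            continue  # off-topic sentences are ignored, polar or not
        score = polar_score(sentence)
        if score != 0:
            results.append((sentence, "positive" if score > 0 else "negative"))
    return results


examples = [
    "The battery life on this phone is great.",
    "My commute was terrible today.",
    "The phone arrived on Tuesday.",
]
labeled = topical_polarity(examples, {"phone", "battery"})
print(labeled)
```

Only the first sentence is both topical and polar, so it is the only one labeled. The second is polar but off-topic, and the third is topical but neutral.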
Recognizing strong and weak opinion clauses
- Computational Intelligence, 2006
Cited by 16 (0 self)
There has been a recent swell of interest in the automatic identification and extraction of opinions and emotions in text. In this paper, we present the first experimental results classifying the intensity of opinions and other types of subjectivity and classifying the subjectivity of deeply nested clauses. We use a wide range of features, including new syntactic features developed for opinion recognition. We vary the learning algorithm and the feature organization to explore the effect this has on the classification task. In 10-fold cross-validation experiments using support vector regression, we achieve improvements in mean-squared error over baseline ranging from 49% to 51%. Using boosting, we achieve improvements in accuracy ranging from 23% to 96%.
Retrieving topical sentiments from online document collections
- In Document Recognition and Retrieval XI, 2004
Cited by 14 (1 self)
Retrieving documents by subject matter is the general goal of information retrieval and other content access systems. There are other aspects of textual content, however, which form equally valid selection criteria. One such aspect is that of sentiment or polarity, indicating the user's opinion or emotional relationship with some topic. Recent work in this area has treated polarity effectively as a discrete aspect of text. In this paper we present a lightweight but robust approach to combining topic and polarity, thus enabling content access systems to select content based on a certain opinion about a certain topic.

