Results 1 - 10
of
31
Opinion Mining and Sentiment Analysis
"... An important part of our information-gathering behavior has always been to find out what other people think. With the growing availability and popularity of opinion-rich resources such as online review sites and personal blogs, new opportunities and challenges arise as people now can, and do, active ..."
Abstract
-
Cited by 149 (3 self)
- Add to MetaCart
An important part of our information-gathering behavior has always been to find out what other people think. With the growing availability and popularity of opinion-rich resources such as online review sites and personal blogs, new opportunities and challenges arise as people now can, and do, actively use information technologies to seek out and understand the opinions of others. The sudden eruption of activity in the area of opinion mining and sentiment analysis, which deals with the computational treatment of opinion, sentiment, and subjectivity in text, has thus occurred at least in part as a direct response to the surge of interest in new systems that deal directly with opinions as a first-class object. This survey covers techniques and approaches that promise to directly enable opinion-oriented information-seeking systems. Our focus is on methods that seek to address the new challenges raised by sentiment-aware applications, as compared to those that are already present in more traditional fact-based analysis. We include materialon summarization of evaluative text and on broader issues regarding privacy, manipulation, and economic impact that the development of opinion-oriented information-access services gives rise to. To facilitate future work, a discussion of available resources, benchmark datasets, and evaluation campaigns is also provided. 1
From frequency to meaning : Vector space models of semantics
- Journal of Artificial Intelligence Research
, 2010
"... Computers understand very little of the meaning of human language. This profoundly limits our ability to give instructions to computers, the ability of computers to explain their actions to us, and the ability of computers to analyse and process text. Vector space models (VSMs) of semantics are begi ..."
Abstract
-
Cited by 34 (0 self)
- Add to MetaCart
Computers understand very little of the meaning of human language. This profoundly limits our ability to give instructions to computers, the ability of computers to explain their actions to us, and the ability of computers to analyse and process text. Vector space models (VSMs) of semantics are beginning to address these limits. This paper surveys the use of VSMs for semantic processing of text. We organize the literature on VSMs according to the structure of the matrix in a VSM. There are currently three broad classes of VSMs, based on term–document, word–context, and pair–pattern matrices, yielding three classes of applications. We survey a broad range of applications in these three categories and we take a detailed look at a specific open source project in each category. Our goal in this survey is to show the breadth of applications of VSMs for semantics, to provide a new perspective on VSMs for those who are already familiar with the area, and to provide pointers into the literature for those who are less familiar with the field. 1.
How Opinions are Received by Online Communities: A Case Study on Amazon.com Helpfulness Votes
"... There are many on-line settings in which users publicly express opinions. A number of these offer mechanisms for other users to evaluate these opinions; a canonical example is Amazon.com, where reviews come with annotations like “26 of 32 people found the following review helpful. ” Opinion evaluati ..."
Abstract
-
Cited by 17 (2 self)
- Add to MetaCart
There are many on-line settings in which users publicly express opinions. A number of these offer mechanisms for other users to evaluate these opinions; a canonical example is Amazon.com, where reviews come with annotations like “26 of 32 people found the following review helpful. ” Opinion evaluation appears in many off-line settings as well, including market research and political campaigns. Reasoning about the evaluation of an opinion is fundamentally different from reasoning about the opinion itself: rather than asking, “What did Y think of X?”, we are asking, “What did Z think of Y’s opinion of X? ” Here we develop a framework for analyzing and modeling opinion evaluation, using a large-scale collection of Amazon book reviews as a dataset. We find that the perceived helpfulness of a review depends not just on its content but also but also in subtle ways on how the expressed evaluation relates to other evaluations of the same product. As part of our approach, we develop novel methods that take advantage of the phenomenon of review “plagiarism ” to control for the effects of text in opinion evaluation, and we provide a simple and natural mathematical model consistent with our findings. Our analysis also allows us to distinguish among the predictions of competing theories from sociology and social psychology, and to discover unexpected differences in the collective opinion-evaluation behavior of user populations from different countries.
Sentiment analysis and subjectivity
- Handbook of Natural Language Processing, Second Edition. Taylor and Francis Group, Boca
, 2010
"... Textual information in the world can be broadly categorized into two main types: facts and opinions. Facts are objective expressions about entities, events and their properties. Opinions are usually subjective expressions that describe people’s sentiments, appraisals or feelings toward entities, eve ..."
Abstract
-
Cited by 17 (6 self)
- Add to MetaCart
Textual information in the world can be broadly categorized into two main types: facts and opinions. Facts are objective expressions about entities, events and their properties. Opinions are usually subjective expressions that describe people’s sentiments, appraisals or feelings toward entities, events and their properties. The concept of opinion is very broad. In this chapter, we only focus on opinion expressions that convey people’s positive or negative sentiments. Much of the existing research on textual information processing has been focused on mining and retrieval of factual information, e.g., information retrieval, Web search, text classification, text clustering and many other text mining and natural language processing tasks. Little work had been done on the processing of opinions until only recently. Yet, opinions are so important that whenever we need to make a decision we want to hear others ’ opinions. This is not only true for individuals but also true for organizations. One of the main reasons for the lack of study on opinions is the fact that there was little opinionated text available before the World Wide Web. Before the Web, when an individual needed to make a decision, he/she typically asked for opinions from friends and families. When an organization wanted to find the opinions or sentiments of the general public about its products and services, it conducted opinion polls, surveys, and focus groups. However, with the Web, especially with the explosive growth of the usergenerated
Finding Deceptive Opinion Spam by Any Stretch of the Imagination
"... Consumers increasingly rate, review and research products online (Jansen, 2010; Litvin et al., 2008). Consequently, websites containing consumer reviews are becoming targets of opinion spam. While recent work has focused primarily on manually identifiable instances of opinion spam, in this work we s ..."
Abstract
-
Cited by 9 (0 self)
- Add to MetaCart
Consumers increasingly rate, review and research products online (Jansen, 2010; Litvin et al., 2008). Consequently, websites containing consumer reviews are becoming targets of opinion spam. While recent work has focused primarily on manually identifiable instances of opinion spam, in this work we study deceptive opinion spam—fictitious opinions that have been deliberately written to sound authentic. Integrating work from psychology and computational linguistics, we develop and compare three approaches to detecting deceptive opinion spam, and ultimately develop a classifier that is nearly 90 % accurate on our gold-standard opinion spam dataset. Based on feature analysis of our learned models, we additionally make several theoretical contributions, including revealing a relationship between deceptive opinions and imaginative writing. 1
REVRANK: a Fully Unsupervised Algorithm for Selecting the Most Helpful Book Reviews
"... www.cs.huji.ac.il/∼arir We present an algorithm for automatically ranking usergenerated book reviews according to review helpfulness. Given a collection of reviews, our REVRANK algorithm identifies a lexicon of dominant terms that constitutes the core of a virtual optimal review. This lexicon define ..."
Abstract
-
Cited by 8 (1 self)
- Add to MetaCart
www.cs.huji.ac.il/∼arir We present an algorithm for automatically ranking usergenerated book reviews according to review helpfulness. Given a collection of reviews, our REVRANK algorithm identifies a lexicon of dominant terms that constitutes the core of a virtual optimal review. This lexicon defines a feature vector representation. Reviews are then converted to this representation and ranked according to their distance from a ‘virtual core ’ review vector. The algorithm is fully unsupervised and thus avoids costly and error-prone manual training annotations. Our experiments show that REVRANK clearly outperforms a baseline imitating the Amazon user vote review ranking system.
Exploiting Social Context for Review Quality Prediction
"... Online reviews in which users publish detailed commentary about their experiences and opinions with products, services, or events are extremely valuable to users who rely on them to make informed decisions. However, reviews vary greatly in quality and are constantly increasing in number, therefore, ..."
Abstract
-
Cited by 7 (0 self)
- Add to MetaCart
Online reviews in which users publish detailed commentary about their experiences and opinions with products, services, or events are extremely valuable to users who rely on them to make informed decisions. However, reviews vary greatly in quality and are constantly increasing in number, therefore, automatic assessment of review helpfulness is of growing importance. Previous work has addressed the problem by treating a review as a stand-alone document, extracting features from the review text, and learning a function based on these features for predicting the review quality. In this work, we exploit contextual information about authors ’ identities and social networks for improving review quality prediction. We propose a generic framework for incorporating social context information by adding regularization constraints to the text-based predictor. Our approach can effectively use the social context information available for large amount of unlabeled reviews. It also has the advantage that the resulting predictor is usable even when social context is unavailable. We validate our framework within a real commerce portal and experimentally demonstrate that using social context information can help improve the accuracy of review quality prediction especially when the available training data is sparse.
Efficient Confident Search in Large Review Corpora
"... Abstract. Given an extensive corpus of reviews on an item, a potential customer goes through the expressed opinions and collects information, in order to form an educated opinion and, ultimately, make a purchase decision. This task is often hindered by false reviews, that fail to capture the true qu ..."
Abstract
-
Cited by 4 (0 self)
- Add to MetaCart
Abstract. Given an extensive corpus of reviews on an item, a potential customer goes through the expressed opinions and collects information, in order to form an educated opinion and, ultimately, make a purchase decision. This task is often hindered by false reviews, that fail to capture the true quality of the item’s attributes. These reviews may be based on insufficient information or may even be fraudulent, submitted to manipulate the item’s reputation. In this paper, we formalize the Confident Search paradigm for review corpora. We then present a complete search framework which, given a set of item attributes, is able to efficiently search through a large corpus and select a compact set of high-quality reviews that accurately captures the overall consensus of the reviewers on the specified attributes. We also introduce CREST (Confident REview Search Tool), a user-friendly implementation of our framework and a valuable tool for any person dealing with large review corpora. The efficacy of our framework is demonstrated through a rigorous experimental evaluation. 1
If We Want Your Opinion
, 2007
"... Sentiment has traditionally been considered a “deep” attribute of writing, often requiring the interpretation of figurative language to uncover the writer’s intention. The natural language processing community has become increasingly interested in detecting, through automatic means, the expression o ..."
Abstract
-
Cited by 3 (0 self)
- Add to MetaCart
Sentiment has traditionally been considered a “deep” attribute of writing, often requiring the interpretation of figurative language to uncover the writer’s intention. The natural language processing community has become increasingly interested in detecting, through automatic means, the expression of opinions and measuring the intensity of emotions held by the writer. Despite the depth and abstraction often associated with expressions of sentiment, we apply strictly lexical analysis to the opinions expressed about books and find that machine learning techniques are capable of resolving even fine-grained distinctions between opinions. Using an averaged perceptron classifier trained using a word subsequence kernel, we achieve an accuracy of 89 % when distinguishing between 1- and 5-star reviews. Further, this same model yields significant separation when scoring intermediate reviews—making distinctions even human annotators find difficult. We detail the collection of data for supervised training and present the results of our sentiment classifier along with some discussion about why we believe this approach to be effective.
Exploiting Surface Features for the Prediction of Podcast Preference
"... Abstract. Podcasts display an unevenness characteristic of domains dominated by user generated content, resulting in potentially radical variation of the user preference they enjoy. We report on work that uses easily extractable surface features of podcasts in order to achieve solid performance on t ..."
Abstract
-
Cited by 3 (2 self)
- Add to MetaCart
Abstract. Podcasts display an unevenness characteristic of domains dominated by user generated content, resulting in potentially radical variation of the user preference they enjoy. We report on work that uses easily extractable surface features of podcasts in order to achieve solid performance on two podcast preference prediction tasks: classification of preferred vs. non-preferred podcasts and ranking podcasts by level of preference. We identify features with good discriminative potential by carrying out manual data analysis, resulting in a refinement of the indicators of an existent podcast preference framework. Our preference prediction is useful for topic-independent ranking of podcasts, and can be used to support download suggestion or collection browsing. 1

