Results 1 - 10
of
69
Topic sentiment mixture: modeling facets and opinions in weblogs
- In Proc. of the 16th Int. Conference on World Wide Web
, 2007
"... In this paper, we define the problem of topic-sentiment analysis on Weblogs and propose a novel probabilistic model to capture the mixture of topics and sentiments simultaneously. The proposed Topic-Sentiment Mixture (TSM) model can reveal the latent topical facets in a Weblog collection, the subtop ..."
Abstract
-
Cited by 48 (7 self)
- Add to MetaCart
In this paper, we define the problem of topic-sentiment analysis on Weblogs and propose a novel probabilistic model to capture the mixture of topics and sentiments simultaneously. The proposed Topic-Sentiment Mixture (TSM) model can reveal the latent topical facets in a Weblog collection, the subtopics in the results of an ad hoc query, and their associated sentiments. It could also provide general sentiment models that are applicable to any ad hoc topics. With a specifically designed HMM structure, the sentiment models and topic models estimated with TSM can be utilized to extract topic life cycles and sentiment dynamics. Empirical experiments on different Weblog datasets show that this approach is effective for modeling the topic facets and sentiments and extracting their dynamics from Weblog collections. The TSM model is quite general; it can be applied to any text collections with a mixture of topics and sentiments, thus has many potential applications, such as search result summarization, opinion tracking, and user behavior prediction.
A Holistic Lexicon-Based Approach to Opinion Mining
, 2008
"... One of the important types of information on the Web is the opinions expressed in the user generated content, e.g., customer reviews of products, forum posts, and blogs. In this paper, we focus on customer reviews of products. In particular, we study the problem of determining the semantic orientati ..."
Abstract
-
Cited by 37 (7 self)
- Add to MetaCart
One of the important types of information on the Web is the opinions expressed in the user generated content, e.g., customer reviews of products, forum posts, and blogs. In this paper, we focus on customer reviews of products. In particular, we study the problem of determining the semantic orientations (positive, negative or neutral) of opinions expressed on product features in reviews. This problem has many applications, e.g., opinion mining, summarization and search. Most existing techniques utilize a list of opinion (bearing) words (also called opinion lexicon) for the purpose. Opinion words are words that express desirable (e.g., great, amazing, etc.) or undesirable (e.g., bad, poor, etc) states. These approaches, however, all have some major shortcomings. In this paper, we propose a holistic lexicon-based approach to solving the problem by exploiting external evidences and linguistic conventions of natural language expressions. This approach allows the system to handle opinion words that are context dependent, which cause major difficulties for existing algorithms. It also deals with many special words, phrases and language constructs which have impacts on opinions based on their linguistic patterns. It also has an effective function for aggregating multiple conflicting opinion words in a sentence. A system, called Opinion Observer, based on the proposed technique has been implemented. Experimental results using a benchmark product review data set and some additional reviews show that the proposed technique is highly effective. It outperforms existing methods significantly.
Twitter Power: Tweets as Electronic Word of Mouth
"... In this paper we report research results investigating microblogging as a form of electronic word-of-mouth for sharing consumer opinions concerning brands. We analyzed more than 150,000 microblog postings containing branding comments, sentiments, and opinions.We investigated the overall structure of ..."
Abstract
-
Cited by 29 (1 self)
- Add to MetaCart
In this paper we report research results investigating microblogging as a form of electronic word-of-mouth for sharing consumer opinions concerning brands. We analyzed more than 150,000 microblog postings containing branding comments, sentiments, and opinions.We investigated the overall structure of these microblog postings, the types of expressions, and the movement in positive or negative sentiment.We compared automated methods of classifying sentiment in these microblogs with manual coding. Using a case study approach, we analyzed the range, frequency, timing, and content of tweets in a corporate account. Our research findings show that 19% of microblogs contain mention of a brand. Of the branding microblogs, nearly 20 % contained some expression of brand sentiments. Of these, more than 50 % were positive and 33 % were critical of the company or product. Our comparison of automated and manual coding showed no significant differences between the two approaches. In analyzing microblogs for structure and composition, the linguistic structure of tweets approximate the linguistic patterns of natural language expressions. We find that microblogging is an online tool for customer word of mouth communications and discuss the implications for corporations using microblogging as part of their overall marketing strategy.
Predicting movie sales from blogger sentiment
- In AAAI 2006 Spring Symposium on Computational Approaches to Analysing Weblogs (AAAI-CAAW
, 2006
"... The volume of discussion about a product in weblogs has recently been shown to correlate with the product’s financial performance. In this paper, we study whether applying sentiment analysis methods to weblog data results in better correlation than volume only, in the domain of movies. Our main find ..."
Abstract
-
Cited by 27 (1 self)
- Add to MetaCart
The volume of discussion about a product in weblogs has recently been shown to correlate with the product’s financial performance. In this paper, we study whether applying sentiment analysis methods to weblog data results in better correlation than volume only, in the domain of movies. Our main finding is that positive sentiment is indeed a better predictor for movie success when applied to a limited context around references to the movie in weblogs, posted prior to its release. If my film makes one more person miserable, I’ve done my job. – Woody Allen
Identifying comparative sentences in text documents
- In Proc. of the 29th SIGIR
, 2006
"... This paper studies the problem of identifying comparative sentences in text documents. The problem is related to but quite different from sentiment/opinion sentence identification or classification. Sentiment classification studies the problem of classifying a document or a sentence based on the sub ..."
Abstract
-
Cited by 25 (2 self)
- Add to MetaCart
This paper studies the problem of identifying comparative sentences in text documents. The problem is related to but quite different from sentiment/opinion sentence identification or classification. Sentiment classification studies the problem of classifying a document or a sentence based on the subjective opinion of the author. An important application area of sentiment/opinion identification is business intelligence as a product manufacturer always wants to know consumers ’ opinions on its products. Comparisons on the other hand can be subjective or objective. Furthermore, a comparison is not concerned with an object in isolation. Instead, it compares the object with others. An example opinion sentence is “the sound quality of CD player X is poor”. An example comparative sentence is “the sound quality of CD player X is not as good as that of CD player Y”. Clearly, these two sentences give different information. Their language constructs are quite different too. Identifying comparative sentences is also useful in practice because direct comparisons are perhaps one of the most convincing ways of evaluation, which may even be more important than opinions on each individual object. This paper proposes to study the comparative sentence identification problem. It first categorizes comparative sentences into different types, and then presents a novel integrated pattern discovery and supervised learning approach to identifying comparative sentences from text documents. Experiment results using three types of documents, news articles, consumer reviews of products, and Internet forum postings, show a precision of 79% and recall of 81%. More detailed results are given in the paper.
Opinion Integration Through Semi-supervised Topic Modeling
- WWW 2008
, 2008
"... Web 2.0 technology has enabled more and more people to freely express their opinions on the Web, making the Web an extremely valuable source for mining user opinions about all kinds of topics. In this paper we study how to automatically integrate opinions expressed in a well-written expert review wi ..."
Abstract
-
Cited by 24 (4 self)
- Add to MetaCart
Web 2.0 technology has enabled more and more people to freely express their opinions on the Web, making the Web an extremely valuable source for mining user opinions about all kinds of topics. In this paper we study how to automatically integrate opinions expressed in a well-written expert review with lots of opinions scattering in various sources such as blogspaces and forums. We formally define this new integration problem and propose to use semi-supervised topic models to solve the problem in a principled way. Experiments on integrating opinions about two quite different topics (a product and a political figure) show that the proposed method is effective for both topics and can generate useful aligned integrated opinion summaries. The proposed method is quite general. It can be used to integrate a well written review with opinions in an arbitrary text collection about any topic to potentially support many interesting applications in multiple domains.
2006. Joint extraction of entities and relations for opinion recognition
- In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP
, 2006
"... We present an approach for the joint extraction of entities and relations in the context of opinion recognition and analysis. We identify two types of opinion-related entities — expressions of opinions and sources of opinions — along with the linking relation that exists between them. Inspired by Ro ..."
Abstract
-
Cited by 23 (3 self)
- Add to MetaCart
We present an approach for the joint extraction of entities and relations in the context of opinion recognition and analysis. We identify two types of opinion-related entities — expressions of opinions and sources of opinions — along with the linking relation that exists between them. Inspired by Roth and Yih (2004), we employ an integer linear programming approach to solve the joint opinion recognition task, and show that global, constraint-based inference can significantly boost the performance of both relation extraction and the extraction of opinion-related entities. Performance further improves when a semantic role labeling system is incorporated. The resulting system achieves F-measures of 79 and 69 for entity and relation extraction, respectively, improving substantially over prior results in the area. 1
Learning Document-Level Semantic Properties from Free-text Annotations
"... This paper demonstrates a new method for leveraging unstructured annotations to infer semantic document properties. We consider the domain of product reviews, which are often annotated by their authors with free-text keyphrases, such as “a real bargain ” or “good value. ” We leverage these unstructu ..."
Abstract
-
Cited by 18 (2 self)
- Add to MetaCart
This paper demonstrates a new method for leveraging unstructured annotations to infer semantic document properties. We consider the domain of product reviews, which are often annotated by their authors with free-text keyphrases, such as “a real bargain ” or “good value. ” We leverage these unstructured annotations by clustering them into semantic properties, and then tying the induced clusters to hidden topics in the document text. This allows us to predict relevant properties of unannotated documents. Our approach is implemented in a hierarchical Bayesian model with joint inference, which increases the robustness of the keyphrase clustering and encourages document topics to correlate with semantically meaningful properties. We perform several evaluations of our model, and find that it substantially outperforms alternative approaches. 1
me the money! Deriving the pricing power of product features by mining consumer reviews
- In Proceedings of the Thirteenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-2007
, 2007
"... The increasing pervasiveness of the Internet has dramatically changed the way that consumers shop for goods. Consumergenerated product reviews have become a valuable source of information for customers, who read the reviews and decide whether to buy the product based on the information provided. In ..."
Abstract
-
Cited by 18 (3 self)
- Add to MetaCart
The increasing pervasiveness of the Internet has dramatically changed the way that consumers shop for goods. Consumergenerated product reviews have become a valuable source of information for customers, who read the reviews and decide whether to buy the product based on the information provided. In this paper, we use techniques that decompose the reviews into segments that evaluate the individual characteristics of a product (e.g., image quality and battery life for a digital camera). Then, as a major contribution of this paper, we adapt methods from the econometrics literature, specifically the hedonic regression concept, to estimate: (a) the weight that customers place on each individual product feature, (b) the implicit evaluation score that customers assign to each feature, and (c) how these evaluations affect the revenue for a given product. Towards this goal, we develop a novel hybrid technique combining text mining and econometrics that models consumer product reviews as elements in a tensor product of feature and evaluation spaces. We then impute the quantitative impact of consumer reviews on product demand as a linear functional from this tensor product space. We demonstrate how to use a lowdimension approximation of this functional to significantly reduce the number of model parameters, while still providing good experimental results. We evaluate our technique using a data set from Amazon.com consisting of sales data and the related consumer reviews posted over a 15-month period for 242 products. Our experimental evaluation shows that we can extract actionable business intelligence from the data and better understand the customer preferences and actions. We also show that the textual portion of the reviews can improve product sales prediction compared to a baseline technique that simply relies on numeric data.
Modeling trust and influence in the blogosphere using link polarity
- In Proceedings of the International Conference on Weblogs and Social Media (ICWSM
, 2007
"... There is a growing interest in social network analysis to explore how communities and individuals spread influence. We describe techniques to find "like minded " blogs based on blog-to-blog link sentiment for a particular domain. Using simple sentiment detection techniques, we identify the ..."
Abstract
-
Cited by 17 (4 self)
- Add to MetaCart
There is a growing interest in social network analysis to explore how communities and individuals spread influence. We describe techniques to find "like minded " blogs based on blog-to-blog link sentiment for a particular domain. Using simple sentiment detection techniques, we identify the polarity (positive, negative or neutral) of the text surrounding links that point from one blog post to another. We use trust propagation models to spread this sentiment from a subset of connected blogs to other blogs and deduce like-minded blogs in the blog graph. Our techniques demonstrate the potential of using polar links for more generic problems such as detecting trustworthy nodes in web graphs.

