Lexicon-Based Methods for Sentiment Analysis
Cited by 182 (13 self)
We present a lexicon-based approach to extracting sentiment from text. The Semantic Orientation CALculator (SO-CAL) uses dictionaries of words annotated with their semantic orientation (polarity and strength), and incorporates intensification and negation. SO-CAL is applied to the polarity classification task, the process of assigning a positive or negative label to a text that captures the text’s opinion towards its main subject matter. We show that SO-CAL’s performance is consistent across domains and in completely unseen data. Additionally, we describe the process of dictionary creation, and our use of Mechanical Turk to check dictionaries for consistency and reliability.
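The abstract above names three ingredients: a dictionary of words with polarity and strength, intensification, and negation. A minimal sketch of how they might combine is given below. The dictionary entries, the intensifier percentages, the `NEGATION_SHIFT` constant, and the function names are all illustrative assumptions, not SO-CAL's actual dictionaries or rules.

```python
# Toy dictionaries. All values are illustrative assumptions.
LEXICON = {"good": 3, "excellent": 5, "bad": -3, "terrible": -5}
INTENSIFIERS = {"very": 0.25, "really": 0.15, "slightly": -0.5}  # fractional change
NEGATORS = {"not", "never", "no"}
NEGATION_SHIFT = 4  # assumed: negation shifts a score toward the opposite pole


def score_text(text):
    """Sum the adjusted semantic orientation of every lexicon word in text."""
    tokens = text.lower().split()  # naive whitespace tokenization
    total = 0.0
    for i, tok in enumerate(tokens):
        if tok not in LEXICON:
            continue
        value = float(LEXICON[tok])
        j = i - 1
        # An intensifier directly before the word scales its strength.
        if j >= 0 and tokens[j] in INTENSIFIERS:
            value *= 1.0 + INTENSIFIERS[tokens[j]]
            j -= 1
        # A negator before that shifts the value instead of flipping its sign.
        if j >= 0 and tokens[j] in NEGATORS:
            value += -NEGATION_SHIFT if value > 0 else NEGATION_SHIFT
        total += value
    return total


def classify(text):
    """Label a text by the sign of its total score (ties count as negative)."""
    return "positive" if score_text(text) > 0 else "negative"
```

With these toy values, `score_text("not very good")` is 3 × 1.25 − 4 = −0.25, so the phrase is labeled negative, while `"not bad"` scores −3 + 4 = 1 and is labeled positive. Punctuation handling and longer-range negation are left out of the sketch.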
Aspect and sentiment unification model for online review analysis
- In Proceedings of the 4th International Conference of WSDM, 2011
Cited by 77 (3 self)
User-generated reviews on the Web contain sentiments about detailed aspects of products and services. However, most of the reviews are plain text and thus require much effort to obtain information about relevant details. In this paper, we tackle the problem of automatically discovering what aspects are evaluated in reviews and how sentiments for different aspects are expressed. We first propose Sentence-LDA (SLDA), a probabilistic generative model that assumes all words in a single sentence are generated from one aspect. We then extend SLDA to the Aspect and Sentiment Unification Model (ASUM), which incorporates aspect and sentiment together to model sentiments toward different aspects. ASUM discovers pairs of {aspect, sentiment} which we call senti-aspects. We applied SLDA and ASUM to reviews of electronic devices and restaurants. The results show that the aspects discovered by SLDA match evaluative details of the reviews, and the senti-aspects found by ASUM capture important aspects that are closely coupled with a sentiment. The results of sentiment classification show that ASUM outperforms other generative models and comes close to supervised classification methods. One important advantage of ASUM is that it does not require any sentiment labels of the reviews, which are often expensive to obtain.
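The key modeling assumption stated above is that every word in a sentence is generated from a single aspect. The sketch below illustrates only that generative story, not inference. The function names, the symmetric Dirichlet prior `alpha`, and the fixed sentence length are assumptions made for the example.

```python
import random


def sample_dirichlet(rng, alpha, k):
    """Draw a k-dimensional symmetric Dirichlet sample via normalized gammas."""
    draws = [rng.gammavariate(alpha, 1.0) for _ in range(k)]
    total = sum(draws)
    return [d / total for d in draws]


def generate_review(rng, aspect_word_dists, vocab, n_sentences,
                    words_per_sentence, alpha=0.5):
    """Generate a review in which each sentence has exactly one aspect.

    aspect_word_dists[a][w] is the probability of vocab[w] under aspect a.
    """
    # Per-review distribution over aspects.
    theta = sample_dirichlet(rng, alpha, len(aspect_word_dists))
    review = []
    for _ in range(n_sentences):
        # One aspect is drawn per sentence, not per word.
        aspect = rng.choices(range(len(theta)), weights=theta)[0]
        # Every word in the sentence is drawn from that aspect's distribution.
        words = rng.choices(vocab, weights=aspect_word_dists[aspect],
                            k=words_per_sentence)
        review.append((aspect, words))
    return review
```

Standard LDA would instead draw a fresh aspect for every word; moving that draw up to the sentence level is the only structural change shown here. Per the abstract, ASUM additionally pairs each aspect with a sentiment, which this sketch does not model.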
Twitter Polarity Classification with Label Propagation over Lexical Links and the Follower Graph
Cited by 47 (0 self)
There is high demand for automated tools that assign polarity to microblog content such as tweets (Twitter posts), but this is challenging due to the terseness and informality of tweets in addition to the wide variety and rapid evolution of language in Twitter. It is thus impractical to use standard supervised machine learning techniques dependent on annotated training examples. We do without such annotations by using label propagation to incorporate labels from a maximum entropy classifier trained on noisy labels and knowledge about word types encoded in a lexicon, in combination with the Twitter follower graph. Results on polarity classification for several datasets show that our label propagation approach rivals a model supervised with in-domain annotated tweets, and it outperforms the noisily supervised classifier it exploits as well as a lexicon-based polarity ratio classifier.
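Label propagation over a graph that mixes lexical links and follower links can be illustrated with a small sketch. The version below is a generic clamped-averaging scheme, not necessarily the algorithm the authors used. The node naming (`word:`, `user:`, `tweet`), the seed scores, and the function name are hypothetical.

```python
def propagate_labels(edges, seeds, n_iters=20):
    """Spread polarity scores from seed nodes across an undirected graph.

    edges: iterable of (node_a, node_b) pairs, e.g. tweet-word or user-tweet.
    seeds: dict mapping a node to a fixed score in [-1, 1], for example a
           lexicon word or a noisily labeled tweet.
    Returns a dict with a score for every node that appears in edges.
    """
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)

    scores = {node: seeds.get(node, 0.0) for node in neighbors}
    for _ in range(n_iters):
        updated = {}
        for node, nbrs in neighbors.items():
            if node in seeds:
                updated[node] = seeds[node]  # seed labels stay clamped
            else:
                # Unlabeled nodes take the mean score of their neighbors.
                updated[node] = sum(scores[n] for n in nbrs) / len(nbrs)
        scores = updated
    return scores
```

In this toy setup an unlabeled tweet with no lexicon words can still receive a polarity through its author, if that author also wrote a tweet connected to a seed. The sign of the final score gives the predicted label.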
The viability of web-derived polarity lexicons
- Proceedings of The 11th Annual Conference of the North American Chapter of the Association for Computational Linguistics. ACL, 2010
Cited by 46 (3 self)
We examine the viability of building large polarity lexicons semi-automatically from the web. We begin by describing a graph propagation framework inspired by previous work on constructing polarity lexicons from lexical
Extracting social power relationships from natural language
- In Proceedings of ACL HLT, 2011
Cited by 25 (0 self)
Sociolinguists have long argued that social context influences language use in all manner of ways, resulting in lects. This paper explores a text classification problem we will call lect modeling, an example of what has been termed computational sociolinguistics. In particular, we use machine learning techniques to identify social power relationships between members of a social network, based purely on the content of their interpersonal communication. We rely on statistical methods, as opposed to language-specific engineering, to extract features which represent vocabulary and grammar usage indicative of social power lect. We then apply support vector machines to model the social power lects representing superior-subordinate communication in the Enron email corpus. Our results validate the treatment of lect modeling as a text classification problem, albeit a hard one, and constitute a case for future research in computational sociolinguistics.
New avenues in opinion mining and sentiment analysis
- Intelligent Systems, IEEE, 2013
Cited by 22 (1 self)
valuable, vast, and unstructured information about public opinion. Here, the history, current use, and future of opinion mining and sentiment analysis are discussed, along with relevant techniques and tools. of information were friends and specialized magazine or websites. Now, the “social web” provides new tools to efficiently create and share ideas with everyone connected to
Automatic acquisition of lexical formality
- In: Proceedings of the 23rd International Conference on Computational Linguistics (COLING), 2010
Cited by 15 (11 self)
There has been relatively little work focused on determining the formality level of individual lexical items. This study applies information from large mixed-genre corpora, demonstrating that significant improvement is possible over simple word-length metrics, particularly when multiple sources of information, i.e. word length, word counts, and word association, are integrated. Our best hybrid system reaches 86% accuracy on an English near-synonym formality identification task, and near perfect accuracy when comparing words with extreme formality differences. We also test our word association method in Chinese, a language where word length is not an appropriate metric for formality.
Learning Attitudes and Attributes from Multi-Aspect Reviews
Cited by 13 (3 self)
The majority of online reviews consist of plain-text feedback together with a single numeric score. However, there are multiple dimensions to products and opinions, and understanding the ‘aspects’ that contribute to users’ ratings may help us to better understand their individual preferences. For example, a user’s impression of an audiobook presumably depends on aspects such as the story and the narrator, and knowing their opinions on these aspects may help us to recommend better products. In this paper, we build models for rating systems in which such dimensions are explicit, in the sense that users leave separate ratings for each aspect of a product. By introducing new corpora consisting of five million reviews, rated with between three and six aspects, we evaluate our models on three prediction tasks: First, we use our model to uncover which parts of a review discuss which of the rated aspects. Second, we use our model to summarize reviews, which for us means finding the sentences that best explain a user’s rating. Finally, since aspect ratings are optional in many of the datasets we consider, we use our model to recover those ratings that are missing from a user’s evaluation. Our model matches state-of-the-art approaches on existing small-scale datasets, while scaling to the real-world datasets we introduce. Moreover, our model is able to ‘disentangle’ content and sentiment words: we automatically learn content words that are indicative of a particular aspect as well as the aspect-specific sentiment words that are indicative of a particular rating.
Keywords: machine learning; segmentation; summarization; sentiment analysis
Avaya: Sentiment analysis on twitter with self-training and polarity lexicon expansion.
- In Proceedings of the 7th International Workshop on Semantic Evaluation (SemEval), 2013
Cited by 7 (0 self)
This paper describes the systems submitted by Avaya Labs (AVAYA) to SemEval-2013 Task 2: Sentiment Analysis in Twitter. For the constrained conditions of both the message polarity classification and contextual polarity disambiguation subtasks, our approach centers on training high-dimensional, linear classifiers with a combination of lexical and syntactic features. The constrained message polarity model is then used to tag nearly half a million unlabeled tweets. These automatically labeled data are used for two purposes: 1) to discover prior polarities of words and 2) to provide additional training examples for self-training. Our systems performed competitively, placing in the top five for all subtasks and data conditions. More importantly, these results show that expanding the polarity lexicon and augmenting the training data with unlabeled tweets can yield improvements in precision and recall in classifying the polarity of non-neutral messages and contexts.
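The first use of the automatically labeled tweets, discovering prior polarities of words, could be approximated as follows. The abstract does not say how the polarities are computed, so the smoothed log-ratio below, the `min_count` threshold, and the function name are assumptions chosen only to make the idea concrete.

```python
import math
from collections import Counter


def expand_lexicon(auto_labeled, min_count=2):
    """Estimate word polarities from automatically labeled texts.

    auto_labeled: iterable of (text, label) pairs, where label is
                  "positive" or "negative"; other labels are ignored.
    Returns a dict mapping word -> score, positive meaning the word
    appears more often in positive texts.
    """
    pos, neg = Counter(), Counter()
    for text, label in auto_labeled:
        words = set(text.lower().split())  # count each word once per text
        if label == "positive":
            pos.update(words)
        elif label == "negative":
            neg.update(words)

    lexicon = {}
    for word in set(pos) | set(neg):
        if pos[word] + neg[word] < min_count:
            continue  # too rare for a reliable estimate
        # Add-one smoothed log-ratio of positive to negative document counts.
        lexicon[word] = math.log((pos[word] + 1) / (neg[word] + 1))
    return lexicon
```

Words that occur equally often under both labels score 0.0 and could be dropped before the lexicon is used. The second use named in the abstract, self-training, would feed the same automatically labeled pairs back in as extra training examples for the classifier.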