A Machine Learning Approach to Sentiment Analysis in Multilingual Web Texts
- Information Retrieval, 2009
- Cited by 13 (1 self)
Sentiment analysis, also called opinion mining, is a form of information extraction from text of growing research and commercial interest. In this paper we present our machine learning experiments on sentiment analysis in blog, review and forum texts found on the World Wide Web and written in English, Dutch and French. We train on a set of example sentences or statements that are manually annotated as positive, negative or neutral with regard to a certain entity. We are interested in the feelings that people express with regard to certain consumption products. We learn and evaluate several classification models that can be configured in a cascaded pipeline. We have to deal with several problems: the noisy character of the input texts, the attribution of the sentiment to a particular entity, and the small size of the training set. We succeed in identifying positive, negative and neutral feelings toward the entity under consideration with ca. 83% accuracy for English texts, based on unigram features augmented with linguistic features. The accuracy for the Dutch and French texts is ca. 70% and 68% respectively, due to the larger variety of linguistic expressions that more often diverge from standard language and thus demand more training patterns. In addition, our experiments give us insights into the portability of the learned models across domains and languages. A substantial part of the article investigates the role of active learning techniques for reducing the number of examples to be manually annotated.
Keywords: Opinion mining, information tracking, cross-language learning, active learning
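The abstract above does not say which active learning strategy the authors use. As an illustration only, the following minimal sketch shows margin-based uncertainty sampling, one common strategy: the sentences whose two most probable labels are closest together are sent to the annotator first. The function names, the `predict_proba` callback and the toy probabilities are hypothetical and not taken from the paper.

```python
from typing import Callable, Dict, List, Sequence

Probabilities = Dict[str, float]  # label -> predicted probability


def margin(probs: Probabilities) -> float:
    """Gap between the two most probable labels; a small gap means high uncertainty."""
    ranked = sorted(probs.values(), reverse=True)
    return ranked[0] - ranked[1] if len(ranked) > 1 else ranked[0]


def select_for_annotation(
    pool: Sequence[str],
    predict_proba: Callable[[str], Probabilities],
    batch_size: int,
) -> List[str]:
    """Return the batch_size unlabeled sentences the current model is least sure about."""
    return sorted(pool, key=lambda s: margin(predict_proba(s)))[:batch_size]


# Hypothetical classifier output for three unlabeled sentences (assumed values).
toy_scores = {
    "great phone": {"positive": 0.90, "negative": 0.05, "neutral": 0.05},
    "it is a phone": {"positive": 0.40, "negative": 0.25, "neutral": 0.35},
    "battery died fast": {"positive": 0.10, "negative": 0.70, "neutral": 0.20},
}
picked = select_for_annotation(list(toy_scores), toy_scores.__getitem__, batch_size=1)
print(picked)
```

In a full loop, the picked sentences would be labeled by hand, added to the training set, and the classifier retrained before the next selection round.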
Weakly supervised techniques for domain-independent sentiment classification
- In Proceedings of the 1st International CIKM Workshop on Topic-Sentiment Analysis for Mass Opinion Measurement, Hong Kong, 2009
- Cited by 8 (2 self)
An important sub-task of sentiment analysis is polarity classification, in which text is classified as being positive or negative. Supervised machine learning techniques can perform this task very effectively. However, they require a large corpus of training data, and a number of studies have demonstrated that the good performance of supervised models is dependent on a good match between the training and testing data with respect to the domain, topic and time-period. Weakly-supervised techniques use a large collection of unlabelled text to determine sentiment, and so their performance may be less dependent on the domain, topic and time-period represented by the testing data. This paper presents experiments that investigate the effectiveness of word similarity techniques when performing weakly-supervised sentiment classification. It also considers the extent to which the performance of each method is independent from the domain, topic and time-period of the testing data. The results indicate that the word similarity techniques are suitable for applications that require sentiment classification across several domains.
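The abstract does not name the word similarity techniques it evaluates. As a hedged illustration of the general idea, the sketch below scores a word by how much more strongly it co-occurs with a positive seed word than with a negative one in unlabelled text, using a difference of pointwise mutual information (PMI) values. The seed words, the smoothing constant, the document-level co-occurrence window and the toy corpus are all assumptions made for this example.

```python
import math
from typing import List, Sequence, Set


def semantic_orientation(
    word: str,
    docs: Sequence[Set[str]],
    pos_seed: str = "excellent",
    neg_seed: str = "poor",
    smoothing: float = 0.01,
) -> float:
    """PMI(word, pos_seed) - PMI(word, neg_seed) from document co-occurrence counts.

    The word's own frequency and the corpus size cancel in the difference,
    so only four counts are needed. Smoothing avoids division by zero.
    """
    with_pos = sum(1 for d in docs if word in d and pos_seed in d) + smoothing
    with_neg = sum(1 for d in docs if word in d and neg_seed in d) + smoothing
    n_pos = sum(1 for d in docs if pos_seed in d) + smoothing
    n_neg = sum(1 for d in docs if neg_seed in d) + smoothing
    return math.log2((with_pos * n_neg) / (with_neg * n_pos))


def classify(tokens: List[str], docs: Sequence[Set[str]]) -> str:
    """Label a text by the sign of the summed orientation of its words."""
    total = sum(semantic_orientation(t, docs) for t in tokens)
    return "positive" if total > 0 else "negative"


# Tiny unlabelled corpus (assumed data); each document is a set of tokens.
unlabelled = [
    {"excellent", "reliable", "camera"},
    {"excellent", "reliable", "service"},
    {"poor", "flimsy", "camera"},
    {"poor", "flimsy", "battery"},
]
print(classify(["reliable", "camera"], unlabelled))
print(classify(["flimsy", "battery"], unlabelled))
```

No sentiment labels are used anywhere: the only supervision is the choice of the two seed words, which is why such methods may depend less on the domain of the test data.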
Extracting Opinions and Facts for Business Intelligence
- Cited by 6 (1 self)
Finding information about companies from multiple sources on the Web has become increasingly important for business analysts. In particular, since the emergence of Web 2.0, opinions about companies and their services or products need to be found and distilled in order to create an accurate picture of a business entity. Without appropriate text mining tools, company analysts would have to read hundreds of textual reports, newspaper articles and forum postings and manually dig out factual as well as subjective information. This paper describes a series of experiments to assess the value of a number of lexical, morpho-syntactic, and sentiment-based features derived from linguistic processing and from an existing lexical database for the classification of evaluative texts. The experiments were carried out with two different web sources: one contains positive and negative opinions, while the other contains fine-grained classifications on a 5-point qualitative scale. The results obtained are positive and in line with current research in the area. Our aim is to use the result of classification in a practical application that will combine factual and opinionated information in order to create the reputation of a business entity.
Exploring English Lexicon Knowledge for Chinese Sentiment Analysis
- oro.open.ac.uk
Augmenting Chinese online video recommendations by using virtual ratings predicted by review sentiment classification
- In Proceedings of the 2010 IEEE International Conference on Data Mining Workshops, ICDMW ’10, 2010
- Cited by 2 (0 self)
In this paper we aim to resolve the recommendation problem by using virtual ratings in online environments where user rating information is not available. In most current websites, especially Chinese video-sharing ones, traditional collaborative filtering recommender methods based purely on ratings are not fully adequate because the rating data are sparse. Motivated by our prior work on the user reviews that broadly appear on such sites, we propose a new recommender algorithm that fuses a self-supervised, emoticon-integrated sentiment classification approach, by which the missing User-Item Rating Matrix can be substituted by virtual ratings predicted by decomposing the reviews that users gave to the items. To test the algorithm’s practical value, we first established the higher performance of the self-supervised sentiment classification by comparing it with a supervised approach. Moreover, we conducted a statistical evaluation to show the effectiveness of our recommender system in improving the accuracy of Chinese online video recommendations.
Keywords: Information retrieval; sentiment analysis; opinion mining; online video recommendation.
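To make the idea of virtual ratings concrete, here is a minimal sketch, not the authors' algorithm: predicted review sentiment is mapped to a numeric rating, and the resulting user-item matrix feeds a plain user-based collaborative filter. The mapping (negative to 1, positive to 5), the cosine similarity measure, and all user and video identifiers are assumptions chosen for illustration.

```python
import math
from typing import Dict, Optional

Ratings = Dict[str, Dict[str, float]]  # user -> item -> rating

# Assumed mapping from a predicted review sentiment to a virtual rating.
VIRTUAL_RATING = {"negative": 1.0, "positive": 5.0}


def build_virtual_ratings(review_labels: Dict[str, Dict[str, str]]) -> Ratings:
    """Turn per-review sentiment labels into a sparse user-item rating matrix."""
    return {
        user: {item: VIRTUAL_RATING[label] for item, label in items.items()}
        for user, items in review_labels.items()
    }


def cosine(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Cosine similarity of two sparse rating vectors."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[i] * b[i] for i in shared)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b)


def predict(ratings: Ratings, user: str, item: str) -> Optional[float]:
    """Similarity-weighted average of other users' virtual ratings for item."""
    num = den = 0.0
    for other, items in ratings.items():
        if other == user or item not in items:
            continue
        sim = cosine(ratings[user], items)
        if sim > 0:
            num += sim * items[item]
            den += sim
    return num / den if den else None


# Hypothetical sentiment labels produced by a review classifier.
labels = {
    "u1": {"v1": "positive", "v2": "negative"},
    "u2": {"v1": "positive", "v2": "negative", "v3": "positive"},
    "u3": {"v1": "negative", "v2": "positive", "v3": "negative"},
}
ratings = build_virtual_ratings(labels)
print(predict(ratings, "u1", "v3"))
```

Because u1's reviews agree with u2's and disagree with u3's, the prediction for v3 is pulled toward u2's positive virtual rating.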
Sentence-Level Subjectivity Detection Using Neuro-Fuzzy Models
In this work, we attempt to detect sentence-level subjectivity by means of two supervised machine learning approaches: a Fuzzy Control System and an Adaptive Neuro-Fuzzy Inference System. Even though these methods are popular in pattern recognition, they have not been thoroughly investigated for subjectivity analysis. We present a novel “Pruned ICF Weighting Coefficient,” which improves the accuracy of subjectivity detection. Our feature extraction algorithm calculates a feature vector based on the statistical occurrences of words in a corpus without any lexical knowledge. For this reason, these machine learning models can be applied to any language; i.e., no lexical, grammatical or syntactic analysis is used in the classification process.
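The abstract does not define the pruning step of the "Pruned ICF Weighting Coefficient". The sketch below therefore shows only plain inverse class frequency (ICF), assuming ICF stands for that here: a word that appears in every class gets weight 0, and a word confined to fewer classes gets a higher weight. It uses word counts alone, in line with the abstract's claim that no lexical knowledge is required. The corpus and the function name are hypothetical.

```python
import math
from collections import defaultdict
from typing import Dict, List, Sequence, Set, Tuple


def inverse_class_frequency(
    labeled_sentences: Sequence[Tuple[List[str], str]],
) -> Dict[str, float]:
    """Compute log(number of classes / number of classes containing the word)."""
    classes_with_term: Dict[str, Set[str]] = defaultdict(set)
    all_classes: Set[str] = set()
    for tokens, label in labeled_sentences:
        all_classes.add(label)
        for token in tokens:
            classes_with_term[token].add(label)
    n_classes = len(all_classes)
    return {
        term: math.log(n_classes / len(classes))
        for term, classes in classes_with_term.items()
    }


# Toy labeled corpus (assumed data).
corpus = [
    (["i", "love", "this"], "subjective"),
    (["this", "is", "awful"], "subjective"),
    (["this", "is", "a", "table"], "objective"),
]
icf = inverse_class_frequency(corpus)
print(icf["this"], icf["love"])
```

Words such as "this" occur in both classes and carry no weight, while "love" occurs only in subjective sentences and is weighted log(2).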
Latent Sentiment Model for Weakly-Supervised Cross-Lingual Sentiment Classification
- oro.open.ac.uk
Conference item. How to cite: He, Yulan (2011). Latent sentiment model for weakly-supervised cross-lingual sentiment classification.
Enhancement Bag-of-Words Model for Solving the Challenges of Sentiment Analysis
Sentiment analysis is a branch of natural language processing and machine learning. It has become one of the most important sources for decision making. It can extract, identify, evaluate or otherwise characterize sentiment in online reviews. Although the bag-of-words model is the most widely used technique for sentiment analysis, it has two major weaknesses: it relies on manual evaluation of a lexicon to determine the evaluation of words, and it analyzes sentiment with low accuracy because it neglects the grammatical effects and the semantics of the words. In this paper, we propose a new technique that evaluates online sentiment within one topic domain and addresses some significant sentiment analysis challenges, improving the accuracy of the analysis. The proposed technique relies on an enhanced bag-of-words model that evaluates sentiment polarity and score automatically by using word weights instead of term frequency. The technique can also classify reviews based on features and keywords of the scientific topic domain. This paper introduces solutions for essential sentiment analysis challenges that are suitable for the review structure. It also examines the effect of the proposed enhanced model on reaching higher accuracy.
Keywords: Sentiment analysis; Bag-Of-Words; sentiment analysis challenges; text analysis; Reviews
Recommender systems t...
Collaborative filtering (CF) recommenders based on a User-Item rating matrix explicitly obtained from end users have recently appeared promising in recommender systems. However, the User-Item rating matrix is not always available, or is very sparse, in some web applications, which has a critical impact on the application of CF recommenders. In this article we aim to enhance the online recommender system by fusing virtual ratings derived from user reviews. Specifically, taking into account the characteristics of Chinese reviews, we propose to fuse the self-supervised emotion-integrated sentiment classification results into CF recommenders, by which the User-Item Rating Matrix can be inferred by decomposing the reviews that users gave to the items. The main advantage of this approach is that it can extend CF recommenders to web applications without user rating information. In the experiments, we first established the higher precision and recall of the self-supervised sentiment classification by comparing it with traditional classification methods. Furthermore, the classification results, acting as virtual ratings, were incorporated into both user-based and item-based CF algorithms. We also conducted an experiment to evaluate the proximity between the virtual and real ratings and clarified the effectiveness of the virtual ratings. The experimental results demonstrated the significant impact of virtual ratings on increasing the system’s recommendation accuracy in different data