• Documents
  • Authors
  • Tables
  • Other Seers ▼
    RefSeer AckSeer CollabSeer SeerSeer
  • Log in
  • Sign up
  • MetaCart

CiteSeerX logo

Advanced Search Include Citations
Advanced Search Include Citations | Disambiguate

Holistic sentiment analysis across languages: Multilingual supervised latent Dirichlet allocation (2010)

by J Boyd-Graber, P Resnik
Venue:In EMNLP
Add To MetaCart

Tools

Sorted by:
Results 1 - 4 of 4

Joint Bilingual Sentiment Classification with Unlabeled Parallel Corpora

by Bin Lu, Chenhao Tan, Claire Cardie, Benjamin K. Tsou
"... Most previous work on multilingual sentiment analysis has focused on methods to adapt sentiment resources from resource-rich languages to resource-poor languages. We present a novel approach for joint bilingual sentiment classification at the sentence level that augments available labeled data in ea ..."
Abstract - Cited by 3 (0 self) - Add to MetaCart
Most previous work on multilingual sentiment analysis has focused on methods to adapt sentiment resources from resource-rich languages to resource-poor languages. We present a novel approach for joint bilingual sentiment classification at the sentence level that augments available labeled data in each language with unlabeled parallel data. We rely on the intuition that the sentiment labels for parallel sentences should be similar and present a model that jointly learns improved monolingual sentiment classifiers for each language. Experiments on multiple data sets show that the proposed approach (1) outperforms the monolingual baselines, significantly improving the accuracy for both languages by 3.44%-8.12%; (2) outperforms two standard approaches for leveraging unlabeled data; and (3) produces (albeit smaller) performance gains when employing pseudo-parallel data from machine translation engines. 1

Mr. LDA: A Flexible Large Scale Topic Modeling Package using Variational Inference in MapReduce

by Ke Zhai, Jordan Boyd-graber, Mohamad Alkhouja, Nima Asadi
"... Latent Dirichlet Allocation (LDA) is a popular topic modeling technique for exploring document collections. Because of the increasing prevalence of large datasets, there is a need to improve the scalability of inference for LDA. In this paper, we introduce a novel and flexible large scale topic mode ..."
Abstract - Add to MetaCart
Latent Dirichlet Allocation (LDA) is a popular topic modeling technique for exploring document collections. Because of the increasing prevalence of large datasets, there is a need to improve the scalability of inference for LDA. In this paper, we introduce a novel and flexible large scale topic modeling package in MapReduce (Mr. LDA). As opposed to other techniques which use Gibbs sampling, our proposed framework uses variational inference, which easily fits into a distributed environment. More importantly, this variational implementation, unlike highly tuned and specialized implementations based on Gibbs sampling, is easily extensible. We demonstrate two extensions of the models possible with this scalable framework: informed priors to guide topic discovery and extracting topics from a multilingual corpus. We compare the scalability of Mr. LDA against Mahout, an existing large scale topic modeling package. Mr. LDA out-performs Mahout both in execution speed and held-out likelihood. Categories and Subject Descriptors H.3.3 [Information Search and Retrieval]: Clustering—topic models,

Table of Contents Preface..............................................................................................5 Sentiment Analysis: A Fascinating Problem...................................7

by Bing Liu , 2012
"... ..."
Abstract - Add to MetaCart
Abstract not found

Topic Models for Taxonomies

by Anton Bakalov, Andrew Mccallum, Hanna Wallach, David Mimno
"... Concept taxonomies such as MeSH, the ACM Computing Classification System, and the NY Times Subject Headings are frequently used to help organize data. They typically consist of a set of concept names organized in a hierarchy. However, these names and structure are often not sufficient to fully captu ..."
Abstract - Add to MetaCart
Concept taxonomies such as MeSH, the ACM Computing Classification System, and the NY Times Subject Headings are frequently used to help organize data. They typically consist of a set of concept names organized in a hierarchy. However, these names and structure are often not sufficient to fully capture the intended meaning of a taxonomy node, and particularly non-experts may have difficulty navigating and placing data into the taxonomy. This paper introduces two semi-supervised topic models that automatically augment a given taxonomy with many additional keywords by leveraging a corpus of multi-labeled documents. Our experiments show that users find the topics beneficial for taxonomy interpretation, substantially increasing their cataloging accuracy. Furthermore, the models provide a better information rate compared to Labeled LDA [7].
The National Science Foundation
  • About CiteSeerX
  • Submit Documents
  • Privacy Policy
  • Help
  • Data
  • Source
  • Contact Us

Developed at and hosted by The College of Information Sciences and Technology

© 2007-2010 The Pennsylvania State University