Results 1 - 10 of 11,617
Selection of relevant features and examples in machine learning
- ARTIFICIAL INTELLIGENCE, 1997
"... In this survey, we review work in machine learning on methods for handling data sets containing large amounts of irrelevant information. We focus on two key issues: the problem of selecting relevant features, and the problem of selecting relevant examples. We describe the advances that have been made on these topics in both empirical and theoretical work in machine learning, and we present a general framework that we use to compare different methods. We close with some challenges for future work in this area."
Cited by 606 (2 self)
Thumbs up? Sentiment Classification using Machine Learning Techniques
- IN PROCEEDINGS OF EMNLP, 2002
"... We consider the problem of classifying documents not by topic, but by overall sentiment, e.g., determining whether a review is positive or negative. Using movie reviews as data, we find that standard machine learning techniques definitively outperform human-produced baselines. However, the three mac ..."
Cited by 1101 (7 self)
Learning Topic Hierarchies for
"... Existing studies have utilized Wikipedia for various knowledge acquisition tasks. However, no attempts have been made to explore multi-level topic knowledge contained in Wikipedia articles’ Contents tables. The articles with similar subjects are grouped together into Wikipedia categories. In thi ..."
Survey of clustering algorithms
- IEEE TRANSACTIONS ON NEURAL NETWORKS, 2005
"... Data analysis plays an indispensable role for understanding various phenomena. Cluster analysis, primitive exploration with little or no prior knowledge, consists of research developed across a wide variety of communities. The diversity, on one hand, equips us with many tools. On the other hand, the profusion of options causes confusion. We survey clustering algorithms for data sets appearing in statistics, computer science, and machine learning, and illustrate their applications in some benchmark data sets, the traveling salesman problem, and bioinformatics, a new field attracting intensive efforts ..."
Cited by 499 (4 self)
Combining Background Knowledge and Learned Topics
"... Statistical topic models provide a general data-driven framework for automated discovery of high-level knowledge from large collections of text documents. While topic models can potentially discover a broad range of themes in a data set, the interpretability of the learned topics is not always ideal ..."
Cited by 3 (0 self)
Data Mining: An Overview from Database Perspective
- IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 1996
"... Mining information and knowledge from large databases has been recognized by many researchers as a key research topic in database systems and machine learning, and by many industrial companies as an important area with an opportunity of major revenues. Researchers in many different fields have sh ..."
Cited by 532 (26 self)
Statistical pattern recognition: A review
- IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2000
"... The primary goal of pattern recognition is supervised or unsupervised classification. Among the various frameworks in which pattern recognition has been traditionally formulated, the statistical approach has been most intensively studied and used in practice. More recently, neural network techniques and methods imported from statistical learning theory have been receiving increasing attention. The design of a recognition system requires careful attention to the following issues: definition of pattern classes, sensing environment, pattern representation, feature extraction and selection ..."
Cited by 1035 (30 self)
Learning Topics and Positions from Debatepedia
"... We explore Debatepedia, a community-authored encyclopedia of sociopolitical debates, as evidence for inferring a low-dimensional, human-interpretable representation in the domain of issues and positions. We introduce a generative model positing latent topics and cross-cutting positions that gives ..."
Cited by 2 (0 self)
Learning Topic Representation for SMT with Neural Networks
"... Statistical Machine Translation (SMT) usually utilizes contextual information to disambiguate translation candidates. However, it is often limited to contexts within sentence boundaries, hence broader topical information cannot be leveraged. In this paper, we propose a novel approach to learning top ..."
Cited by 3 (0 self)
Learning topic models – going beyond SVD
- In 2012 IEEE 53rd Annual Symposium on Foundations of Computer Science, 2012