Results 1 - 10 of 1,183
Text Classification using String Kernels
"... We propose a novel approach for categorizing text documents based on the use of a special kernel. The kernel is an inner product in the feature space generated by all subsequences of length k. A subsequence is any ordered sequence of k characters occurring in the text though not necessarily contiguo ..."
Cited by 495 (7 self)
Support Vector Machine Active Learning with Applications to Text Classification
- Journal of Machine Learning Research
, 2001
"... Support vector machines have met with significant success in numerous real-world learning tasks. However, like most machine learning algorithms, they are generally applied using a randomly selected training set classified in advance. In many settings, we also have the option of using pool-based acti ..."
Cited by 735 (5 self)
instances to request next. We provide a theoretical motivation for the algorithm using the notion of a version space. We present experimental results showing that employing our active learning method can significantly reduce the need for labeled training instances in both the standard inductive
Thumbs up? Sentiment Classification using Machine Learning Techniques
- In Proceedings of EMNLP
, 2002
"... We consider the problem of classifying documents not by topic, but by overall sentiment, e.g., determining whether a review is positive or negative. Using movie reviews as data, we find that standard machine learning techniques definitively outperform human-produced baselines. However, the three mac ..."
Cited by 1101 (7 self)
Learning realistic human actions from movies
- In: CVPR
, 2008
"... The aim of this paper is to address recognition of natural human actions in diverse and realistic video settings. This challenging but important subject has mostly been ignored in the past due to several problems one of which is the lack of realistic and annotated video datasets. Our first contribut ..."
Cited by 738 (48 self)
contribution is to address this limitation and to investigate the use of movie scripts for automatic annotation of human actions in videos. We evaluate alternative methods for action retrieval from scripts and show benefits of a text-based classifier. Using the retrieved action samples for visual learning, we
A Weighted Nearest Neighbor Algorithm for Learning with Symbolic Features
- Machine Learning
, 1993
"... In the past, nearest neighbor algorithms for learning from examples have worked best in domains in which all features had numeric values. In such domains, the examples can be treated as points and distance metrics can use standard definitions. In symbolic domains, a more sophisticated treatment of t ..."
Cited by 309 (3 self)
space. We show that this technique produces excellent classification accuracy on three problems that have been studied by machine learning researchers: predicting protein secondary structure, identifying DNA promoter sequences, and pronouncing English text. Direct experimental comparisons with the other
Summarizing Text Documents: Sentence Selection and Evaluation Metrics
- In Research and Development in Information Retrieval
, 1999
"... Human-quality text summarization systems are difficult to design, and even more difficult to evaluate, in part because documents can differ along several dimensions, such as length, writing style and lexical usage. Nevertheless, certain cues can often help suggest the selection of sentences for incl ..."
Cited by 236 (7 self)
results showing the importance of corpus-dependent baseline summarization standards, compression ratios and carefully crafted long queries.
Hierarchical Text Classification and Evaluation
, 2001
"... Hierarchical Classification refers to assigning one or more suitable categories from a hierarchical category space to a document. While previous work in hierarchical classification focused on virtual category trees where documents are assigned only to the leaf categories, we propose a top-down lev ..."
Cited by 134 (2 self)
level-based classification method that can classify documents to both leaf and internal categories. As the standard performance measures assume independence between categories, they have not considered the documents incorrectly classified into categories that are similar or not far from the correct ones
Real Wage Rigidities and the New Keynesian model
- Journal of Money, Credit, and Banking, supplement to
, 2007
"... Most central banks perceive a trade-off between stabilizing inflation and stabilizing the gap between output and desired output. However, the standard new Keynesian framework implies no such trade-off. In that framework, stabilizing inflation is equivalent to stabilizing the welfare-relevant output ..."
Cited by 237 (7 self)
Tackling the Poor Assumptions of Naive Bayes Text Classifiers
- In Proceedings of the Twentieth International Conference on Machine Learning
, 2003
"... Naive Bayes is often used as a baseline in text classification because it is fast and easy to implement. Its severe assumptions make such efficiency possible but also adversely affect the quality of its results. In this paper we propose simple, heuristic solutions to some of the problems with Naive ..."
Cited by 157 (5 self)
Text Representations for Patent Classification
, 2013
"... With the increasing rate of patent application filings, automated patent classification is of rising economic importance. This article investigates how patent classification can be improved by using different representations of the patent documents. Using the Linguistic Classification System (LCS), ..."
Cited by 4 (0 self)
), we compare the impact of adding statistical phrases (in the form of bigrams) and linguistic phrases (in two different dependency formats) to the standard bag-of-words text representation on a subset of 532,264 English abstracts from the CLEF-IP 2010 corpus. In contrast to previous findings