Sequential Learning of Classifiers for Structured Prediction Problems
Cited by 4 (2 self)
Many classification problems with structured outputs can be regarded as a set of interrelated sub-problems where constraints dictate valid variable assignments. The standard approaches to these problems include either independent learning of individual classifiers for each of the sub-problems or joint learning of the entire set of classifiers with the constraints enforced during learning. We propose an intermediate approach where we learn these classifiers in a sequence using previously learned classifiers to guide learning of the next classifier by enforcing constraints between their outputs. We provide a theoretical motivation to explain why this learning protocol is expected to outperform both alternatives when individual problems have different ‘complexity’. This analysis motivates an algorithm for choosing a preferred order of classifier learning. We evaluate our technique on artificial experiments and on the entity and relation identification problem where the proposed method outperforms both joint and independent learning.
Joint Training and Decoding Using Virtual Nodes for Cascaded Segmentation and Tagging Tasks
Cited by 1 (1 self)
Many sequence labeling tasks in NLP require solving a cascade of segmentation and tagging subtasks, such as Chinese POS tagging and named entity recognition. Traditional pipeline approaches usually suffer from error propagation. Joint training/decoding in the cross-product state space can lead to too many parameters and high inference complexity. In this paper, we present a novel method which integrates the graph structures of two subtasks into one using virtual nodes, and performs joint training and decoding in the factorized state space. Experimental evaluations on the CoNLL 2000 shallow parsing data set and the Fourth SIGHAN Bakeoff CTB POS tagging data set demonstrate the superiority of our method over cross-product, pipeline and candidate reranking approaches.
Center for Language and Speech Processing, Applied Physics Lab
Domain adaptation, the problem of adapting a natural language processing system trained in one domain to perform well in a different domain, has received significant attention. This paper addresses an important problem for deployed systems that has received little attention – detecting when such adaptation is needed by a system operating in the wild, i.e., performing classification over a stream of unlabeled examples. Our method uses A-distance, a metric for detecting shifts in data streams, combined with classification margins to detect domain shifts. We empirically show effective domain shift detection on a variety of data sets and shift conditions.
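To make the A-distance idea in the abstract above concrete, here is a minimal sketch, not the paper's actual procedure. It uses the general definition d_A(P, P') = 2 · sup over A in the family of |P(A) − P'(A)| and applies it to two windows of one-dimensional classifier margins. The choice of equal-width bins as the family of sets, the function name `a_distance`, and the parameters `lo`, `hi`, and `n_bins` are all illustrative assumptions.

```python
from typing import Sequence


def a_distance(window_a: Sequence[float], window_b: Sequence[float],
               lo: float = -1.0, hi: float = 1.0, n_bins: int = 10) -> float:
    """Empirical A-distance between two samples of 1-D values.

    Assumption: the family A is a set of equal-width bins over [lo, hi];
    values outside that range are clamped into the first or last bin.
    """
    if len(window_a) == 0 or len(window_b) == 0:
        raise ValueError("both windows must be non-empty")
    width = (hi - lo) / n_bins

    def histogram(values: Sequence[float]) -> list:
        # Fraction of values falling into each bin.
        counts = [0] * n_bins
        for v in values:
            idx = int((v - lo) / width)
            idx = min(max(idx, 0), n_bins - 1)
            counts[idx] += 1
        return [c / len(values) for c in counts]

    p = histogram(window_a)
    q = histogram(window_b)
    # Twice the largest gap in empirical probability over any single bin.
    return 2.0 * max(abs(pa - qa) for pa, qa in zip(p, q))
```

In a streaming setting one might compare a reference window of margins against the most recent window and flag a possible domain shift when the value exceeds a threshold chosen on held-out data; that threshold is likewise an assumption, not something the abstract specifies.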
INTERACTIVE LEARNING PROTOCOLS FOR NATURAL LANGUAGE APPLICATIONS, 2009
Statistical machine learning has become an integral technology for solving many informatics applications. In particular, corpus-based statistical techniques have emerged as the dominant paradigm for core natural language processing (NLP) tasks such as parsing, machine translation, and information extraction, amongst others. However, while supervised machine learning is well understood, its successful application to practical scenarios is predicated on obtaining large annotated corpora and performing significant feature engineering, both notably expensive undertakings. Interactive learning protocols offer one promising solution for reducing these costs by allowing the learner and domain expert to interact during learning in an effort to both reduce sample complexity and improve system performance. By specifying a method where the learner may request targeted information, the domain expert is focused on providing the most useful information. This work formalizes a general framework for interactive learning and examines two interactive learning protocols with particular attention to natural language scenarios. We first examine active learning for structured output spaces, the scenario where there are multiple predictions which must be composed into a structurally coherent global prediction. Secondly, we examine active learning for pipeline models, where a complex prediction is decomposed into a sequence of predictions
Part-of-Speech Tagging for Chinese-English Mixed Texts with Dynamic Features
In modern Chinese articles and conversations, it is common to include a few English words, especially in emails and Internet literature. Analyzing Chinese-English mixed texts has therefore become an important and challenging topic. The underlying problem is how to tag the part of speech (POS) of the English words involved. Due to the lack of a specially annotated corpus, most of the English words are tagged with the oversimplified type “foreign words”. In this paper, we present a method using dynamic features to tag the POS of mixed texts. Experiments show that our method achieves higher performance than traditional sequence labeling methods. Our method also boosts the performance of POS tagging for pure Chinese texts.

