Results 1 - 10
of
14
Enterprise modeling
, 1998
"... ... This article motivates the need for enterprise models and introduces the concepts of generic and deductive enterprise models. It reviews research to date on enterprise modeling and considers in detail the Toronto virtual enterprise effort at the University of Toronto. ..."
Abstract
-
Cited by 109 (5 self)
- Add to MetaCart
... This article motivates the need for enterprise models and introduces the concepts of generic and deductive enterprise models. It reviews research to date on enterprise modeling and considers in detail the Toronto virtual enterprise effort at the University of Toronto.
Active learning literature survey
, 2010
"... The key idea behind active learning is that a machine learning algorithm can achieve greater accuracy with fewer labeled training instances if it is allowed to choose the data from which is learns. An active learner may ask queries in the form of unlabeled instances to be labeled by an oracle (e.g., ..."
Abstract
-
Cited by 49 (1 self)
- Add to MetaCart
The key idea behind active learning is that a machine learning algorithm can achieve greater accuracy with fewer labeled training instances if it is allowed to choose the data from which is learns. An active learner may ask queries in the form of unlabeled instances to be labeled by an oracle (e.g., a human annotator). Active learning is well-motivated in many modern machine learning problems, where unlabeled data may be abundant but labels are difficult, time-consuming, or expensive to obtain. This report provides a general introduction to active learning and a survey of the literature. This includes a discussion of the scenarios in which queries can be formulated, and an overview of the query strategy frameworks proposed in the literature to date. An analysis of the empirical and theoretical evidence for active learning, a summary of several problem setting variants, and a discussion
Stacking Dependency Parsers
"... We explore a stacked framework for learning to predict dependency structures for natural language sentences. A typical approach in graph-based dependency parsing has been to assume a factorized model, where local features are used but a global function is optimized (McDonald et al., 2005b). Recently ..."
Abstract
-
Cited by 27 (1 self)
- Add to MetaCart
We explore a stacked framework for learning to predict dependency structures for natural language sentences. A typical approach in graph-based dependency parsing has been to assume a factorized model, where local features are used but a global function is optimized (McDonald et al., 2005b). Recently Nivre and McDonald (2008) used the output of one dependency parser to provide features for another. We show that this is an example of stacked learning, in which a second predictor is trained to improve the performance of the first. Further, we argue that this technique is a novel way of approximating rich non-local features in the second parser, without sacrificing efficient, model-optimal prediction. Experiments on twelve languages show that stacking transition-based and graphbased parsers improves performance over existing state-of-the-art dependency parsers. 1
Learning with Compositional Semantics as Structural Inference for Subsentential Sentiment Analysis
"... Determining the polarity of a sentimentbearing expression requires more than a simple bag-of-words approach. In particular, words or constituents within the expression can interact with each other to yield a particular overall polarity. In this paper, we view such subsentential interactions in light ..."
Abstract
-
Cited by 19 (4 self)
- Add to MetaCart
Determining the polarity of a sentimentbearing expression requires more than a simple bag-of-words approach. In particular, words or constituents within the expression can interact with each other to yield a particular overall polarity. In this paper, we view such subsentential interactions in light of compositional semantics, and present a novel learningbased approach that incorporates structural inference motivated by compositional semantics into the learning procedure. Our experiments show that (1) simple heuristics based on compositional semantics can perform better than learning-based methods that do not incorporate compositional semantics (accuracy of 89.7 % vs. 89.1%), but (2) a method that integrates compositional semantics into learning performs better than all other alternatives (90.7%). We also find that “contentword negators”, not widely employed in previous work, play an important role in determining expression-level polarity. Finally, in contrast to conventional wisdom, we find that expression-level classification accuracy uniformly decreases as additional, potentially disambiguating, context is considered. 1
Approximate Inference by Compilation to Arithmetic Circuits
"... Arithmetic circuits (ACs) exploit context-specific independence and determinism to allow exact inference even in networks with high treewidth. In this paper, we introduce the first ever approximate inference methods using ACs, for domains where exact inference remains intractable. We propose and eva ..."
Abstract
-
Cited by 2 (1 self)
- Add to MetaCart
Arithmetic circuits (ACs) exploit context-specific independence and determinism to allow exact inference even in networks with high treewidth. In this paper, we introduce the first ever approximate inference methods using ACs, for domains where exact inference remains intractable. We propose and evaluate a variety of techniques based on exact compilation, forward sampling, AC structure learning, Markov network parameter learning, variational inference, and Gibbs sampling. In experiments on eight challenging real-world domains, we find that the methods based on sampling and learning work best: one such method (AC 2-F) is faster and usually more accurate than loopy belief propagation, mean field, and Gibbs sampling; another (AC 2-G) has a running time similar to Gibbs sampling but is consistently more accurate than all baselines. 1
Natural language processing (almost) from scratch. arXiv:1103.0398v1
, 2011
"... We propose a unified neural network architecture and learning algorithm that can be applied to various natural language processing tasks including part-of-speech tagging, chunking, named entity recognition, and semantic role labeling. This versatility is achieved by trying to avoid task-specific eng ..."
Abstract
-
Cited by 2 (1 self)
- Add to MetaCart
We propose a unified neural network architecture and learning algorithm that can be applied to various natural language processing tasks including part-of-speech tagging, chunking, named entity recognition, and semantic role labeling. This versatility is achieved by trying to avoid task-specific engineering and therefore disregarding a lot of prior knowledge. Instead of exploiting man-made input features carefully optimized for each task, our system learns internal representations on the basis of vast amounts of mostly unlabeled training data. This work is then used as a basis for building a freely available tagging system with good performance and minimal computational requirements.
Pointwise Prediction for Robust, Adaptable Japanese Morphological Analysis
"... We present a pointwise approach to Japanese morphological analysis (MA) that ignores structure information during learning and tagging. Despite the lack of structure, it is able to outperform the current state-of-the-art structured approach for Japanese MA, and achieves accuracy similar to that of s ..."
Abstract
- Add to MetaCart
We present a pointwise approach to Japanese morphological analysis (MA) that ignores structure information during learning and tagging. Despite the lack of structure, it is able to outperform the current state-of-the-art structured approach for Japanese MA, and achieves accuracy similar to that of structured predictors using the same feature set. We also find that the method is both robust to outof-domain data, and can be easily adapted through the use of a combination of partial annotation and active learning. 1
www.lti.cs.cmu.edu Recall-Oriented Learning for Named Entity Recognition in Wikipedia
"... We consider the problem of NER in Arabic Wikipedia, a semi-supervised domain adaptation setting for which we have no labeled training data in the target domain. To facilitate evaluation, we obtain annotations for articles in four topical groups, allowing annotators to identify domain-specific entity ..."
Abstract
- Add to MetaCart
We consider the problem of NER in Arabic Wikipedia, a semi-supervised domain adaptation setting for which we have no labeled training data in the target domain. To facilitate evaluation, we obtain annotations for articles in four topical groups, allowing annotators to identify domain-specific entity types in addition to standard categories. Standard supervised learning on newswire text leads to poor target-domain recall. We train a sequence model and show that a simple modification to the online learner—a loss function encouraging it to “arrogantly ” favor recall over precision—substantially improves recall and F1. We then employ self-training on unlabeled target-domain data in order to adapt our model; enforcing the same recall-oriented bias in the self-training stage yields additional gains. 1
Recall-Oriented Learning of Named Entities in Arabic Wikipedia
"... We consider the problem of NER in Arabic Wikipedia, a semisupervised domain adaptation setting for which we have no labeled training data in the target domain. To facilitate evaluation, we obtain annotations for articles in four topical groups, allowing annotators to identify domain-specific entity ..."
Abstract
- Add to MetaCart
We consider the problem of NER in Arabic Wikipedia, a semisupervised domain adaptation setting for which we have no labeled training data in the target domain. To facilitate evaluation, we obtain annotations for articles in four topical groups, allowing annotators to identify domain-specific entity types in addition to standard categories. Standard supervised learning on newswire text leads to poor target-domain recall. We train a sequence model and show that a simple modification to the online learner—a loss function encouraging it to “arrogantly ” favor recall over precision— substantially improves recall and F1. We then adapt our model with self-training on unlabeled target-domain data; enforcing the same recall-oriented bias in the selftraining stage yields marginal gains. 1 1
Kernel Slicing: Scalable Online Training with Conjunctive Features
"... This paper proposes an efficient online method that trains a classifier with many conjunctive features. We employ kernel computation called kernel slicing, which explicitly considers conjunctions among frequent features in computing the polynomial kernel, to combine the merits of linear and kernel-b ..."
Abstract
- Add to MetaCart
This paper proposes an efficient online method that trains a classifier with many conjunctive features. We employ kernel computation called kernel slicing, which explicitly considers conjunctions among frequent features in computing the polynomial kernel, to combine the merits of linear and kernel-based training. To improve the scalability of this training, we reuse the temporal margins of partial feature vectors and terminate unnecessary margin computations. Experiments on dependency parsing and hyponymy-relation extraction demonstrated that our method could train a classifier orders of magnitude faster than kernel-based online learning, while retaining its space efficiency. 1

