The importance of syntactic parsing and inference in semantic role labeling
Computational Linguistics, 2008
Cited by 28 (13 self)
We present a general framework for semantic role labeling. The framework combines a machine learning technique with an integer linear programming based inference procedure, which incorporates linguistic and structural constraints into a global decision process. Within this framework, we study the role of syntactic parsing information in semantic role labeling. We show that full syntactic parsing information is, by far, most relevant in identifying the argument, especially, in the very first stage—the pruning stage. Surprisingly, the quality of the pruning stage cannot be solely determined based on its recall and precision. Instead, it depends on the characteristics of the output candidates that determine the difficulty of the downstream problems. Motivated by this observation, we propose an effective and simple approach of combining different semantic role labeling systems through joint inference, which significantly improves its performance. Our system has been evaluated in the CoNLL-2005 shared task on semantic role labeling, and achieves the highest F1 score among 19 participants.
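The global decision step described above can be illustrated with a small sketch. It is not the authors' system: the integer linear programming solver is replaced by exhaustive enumeration (practical only for a handful of candidates), and the two constraints, the label names, and the scores are assumptions chosen for illustration, since the abstract does not list the actual constraints.

```python
from itertools import product

NULL = "NULL"  # hypothetical label meaning "this candidate is not an argument"


def constrained_labeling(candidates, scores, core_labels=("A0", "A1")):
    """Pick one label per candidate span, maximizing the summed classifier scores.

    candidates -- list of (start, end) token spans, end inclusive
    scores     -- one dict per candidate mapping label -> score
    Illustrative constraints (assumed, not taken from the abstract):
      1. candidates that receive a non-NULL label must not overlap;
      2. each core label is used at most once.
    """
    def overlaps(a, b):
        return a[0] <= b[1] and b[0] <= a[1]

    best_labels, best_score = None, float("-inf")
    # Exhaustive search stands in for the ILP solver.
    for labels in product(*(list(s) for s in scores)):
        kept = [i for i, lab in enumerate(labels) if lab != NULL]
        if any(overlaps(candidates[i], candidates[j])
               for k, i in enumerate(kept) for j in kept[k + 1:]):
            continue
        core = [labels[i] for i in kept if labels[i] in core_labels]
        if len(core) != len(set(core)):
            continue
        total = sum(scores[i][lab] for i, lab in enumerate(labels))
        if total > best_score:
            best_labels, best_score = list(labels), total
    return best_labels, best_score


if __name__ == "__main__":
    spans = [(0, 1), (1, 3), (4, 5)]
    local_scores = [
        {"A0": 0.8, "A1": 0.1, NULL: 0.1},
        {"A0": 0.6, "A1": 0.3, NULL: 0.1},
        {"A0": 0.5, "A1": 0.4, NULL: 0.1},
    ]
    print(constrained_labeling(spans, local_scores))
```

In this toy input, choosing the best label for each candidate independently would assign A0 three times, including to two overlapping spans; the constrained search instead returns a consistent assignment.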
Learning to Recognize 3D Objects with SNoW
2000
Cited by 23 (3 self)
This paper describes a novel view-based learning algorithm for 3D object recognition from 2D images using a network of linear units. The SNoW learning architecture is a sparse network of linear functions over a pre-defined or incrementally learned feature space and is specifically tailored for learning in the presence of a very large number of features. We use pixel-based and edge-based representations in large scale object recognition experiments in which the performance of SNoW is compared with that of Support Vector Machines (SVMs) and nearest neighbor using the 100 objects in the Columbia Image Object Database (COIL-100). Experimental results show that the SNoW-based method outperforms the SVM-based system in terms of recognition rate and the computational cost involved in learning. Most importantly, SNoW's performance degrades more gracefully when the training data contains fewer views. The empirical results also provide insight into practical and theoretical consideratio...
Linear Concepts and Hidden Variables
2000
Cited by 21 (15 self)
We study a learning problem which allows for a "fair" comparison between unsupervised learning methods (probabilistic model construction) and more traditional algorithms that directly learn a classification. The merits of each approach are intuitively clear: inducing a model is more expensive computationally, but may support a wider range of predictions. Its performance, however, will depend on how well the postulated probabilistic model fits the data. To compare the paradigms we consider a model which postulates a single binary-valued hidden variable on which all other attributes depend. In this model, finding the most likely value of any one variable (given known values for the others) reduces to testing a linear function of the observed values. We learn the model with two techniques: the standard EM algorithm, and a new algorithm we develop based on covariances. We compare these, in a controlled fashion, against an algorithm (a version of Winnow) that attempts to find a good l...
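The claim that prediction in the single-hidden-variable model reduces to a linear test can be checked numerically. The sketch below is an illustration under stated assumptions, not code from the paper: the attributes are taken to be binary and conditionally independent given the hidden variable z, the parameter values are made up, and the target attribute is assumed to satisfy P(x_i = 1 | z = 0) < 0.5 < P(x_i = 1 | z = 1). Under those assumptions the posterior log-odds of z are linear in the observed values, so comparing them with a fixed threshold gives the same decision as exact marginalization.

```python
import math
from itertools import product


def exact_posterior(prior, cond, i, x):
    """P(x_i = 1 | all other x_j), computed by summing out the hidden variable z.

    prior   -- P(z = 1)
    cond[j] -- pair (P(x_j = 1 | z = 0), P(x_j = 1 | z = 1))
    x       -- tuple of 0/1 values; x[i] itself is ignored
    """
    joint = [1.0 - prior, prior]  # P(z, other attributes), up to normalization
    for j, xj in enumerate(x):
        if j == i:
            continue
        for z in (0, 1):
            joint[z] *= cond[j][z] if xj else 1.0 - cond[j][z]
    q = joint[1] / (joint[0] + joint[1])  # P(z = 1 | other attributes)
    return q * cond[i][1] + (1.0 - q) * cond[i][0]


def linear_rule(prior, cond, i):
    """Weights, bias and threshold of the equivalent linear test for attribute i."""
    b, a = cond[i]
    if not b < 0.5 < a:
        raise ValueError("sketch assumes P(x_i=1|z=0) < 0.5 < P(x_i=1|z=1)")
    bias = math.log(prior / (1.0 - prior))
    weights = {}
    for j, (p0, p1) in enumerate(cond):
        if j == i:
            continue
        bias += math.log((1.0 - p1) / (1.0 - p0))
        weights[j] = math.log(p1 / p0) - math.log((1.0 - p1) / (1.0 - p0))
    threshold = math.log((0.5 - b) / (a - 0.5))
    return weights, bias, threshold


def predict_linear(rule, x):
    """Predict x_i = 1 exactly when the linear activation exceeds the threshold."""
    weights, bias, threshold = rule
    return bias + sum(w * x[j] for j, w in weights.items()) > threshold


if __name__ == "__main__":
    prior = 0.3
    cond = [(0.2, 0.9), (0.1, 0.7), (0.4, 0.8), (0.3, 0.6)]  # made-up parameters
    rule = linear_rule(prior, cond, 0)
    for x in product((0, 1), repeat=len(cond)):
        assert (exact_posterior(prior, cond, 0, x) > 0.5) == predict_linear(rule, x)
    print("linear test matches exact inference on all inputs")
```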
Relational Representations that Facilitate Learning
2000
Cited by 21 (9 self)
Given a collection of objects in the world, along with some relations that hold among them, a fundamental problem is how to learn definitions of some relations and concepts of interest in terms of the given relations. These definitions might be quite complex and, inevitably, might require the use of quantified expressions. Attempts to use first order languages for these purposes are hampered by the fact that relational inference is intractable and, consequently, so is the problem of learning relational definitions. This work develops an expressive relational representation language that allows the use of propositional learning algorithms when learning relational definitions. The representation serves as an intermediate level between a raw description of observations in the world and a propositional learning system that attempts to learn definitions for concepts and relations. It allows for hierarchical composition of relational expressions that can be evaluated efficientl...
Scaling Up Context-Sensitive Text Correction
2001
Cited by 21 (8 self)
The main challenge in an effort to build a realistic system with context-sensitive inference capabilities, beyond accuracy, is scalability. This paper studies this problem in the context of a learning-based approach to context sensitive text correction -- the task of fixing spelling errors that result in valid words, such as substituting to for too, casual for causal, and so on. Research papers on this problem have developed algorithms that can achieve fairly high accuracy, in many cases over 90%. However, this level of performance is not sufficient for a large coverage practical system since it implies a low sentence level performance. We examine and offer solutions to several issues relating to scaling up a context sensitive text correction system. In particular, we suggest methods to reduce the memory requirements while maintaining a high level of performance and show that this can still allow the system to adapt to new domains. Most important, we show how to significantly increase the coverage of the system to realistic levels, while providing a very high level of performance, at the 99% level.
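The remark that 90% accuracy implies low sentence-level performance can be made concrete with a short calculation. This sketch assumes, purely for illustration, that each correction decision in a sentence is made independently with the same accuracy; the function name and the chosen sentence sizes are not from the paper.

```python
def sentence_accuracy(decision_accuracy: float, decisions_per_sentence: int) -> float:
    """Probability that every decision in a sentence is correct.

    Assumes independent decisions with identical accuracy (a simplification).
    """
    if not 0.0 <= decision_accuracy <= 1.0:
        raise ValueError("decision_accuracy must lie in [0, 1]")
    if decisions_per_sentence < 0:
        raise ValueError("decisions_per_sentence must be non-negative")
    return decision_accuracy ** decisions_per_sentence


if __name__ == "__main__":
    for accuracy in (0.90, 0.99):
        for k in (1, 5, 10, 20):
            print(f"accuracy={accuracy:.2f} decisions={k:2d} "
                  f"sentence-level={sentence_accuracy(accuracy, k):.3f}")
```

With ten decisions per sentence, 90% per-decision accuracy leaves roughly a 35% chance that the whole sentence is handled correctly, while 99% keeps that chance above 90%.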
SNoW User Guide
1999
Cited by 18 (1 self)
... this document, but the best starting place for learning to use the system is the tutorial. The tutorial gives a good sense of the required steps for using the system. Once a user is comfortable with the default method of using the system, the more detailed description of the command line options given in Chapter 5 may be more useful.
Learning with feature description logics
Proceedings of the 12th International Conference on Inductive Logic Programming, 2002
Cited by 18 (4 self)
We present a paradigm for efficient learning and inference with relational data using propositional means. The paradigm utilizes description logics and concept graphs in the service of learning relational models using efficient propositional learning algorithms. We introduce a Feature Description Logic (FDL), a relational (frame based) language that supports efficient inference, along with a generation function that uses inference with descriptions in the FDL to produce features suitable for use by learning algorithms. These are used within a learning framework that is shown to learn efficiently and accurately relational representations in terms of the FDL descriptions. The paradigm was designed to support learning in domains that are relational but where the amount of data and size of representation learned are very large; we exemplify it here, for clarity, on the classical ILP task of learning family relations. This paradigm provides a natural solution to the problem of learning and representing relational data; it extends and unifies several lines of work in KRR and Machine Learning in ways that provide hope for a coherent usage of learning and reasoning methods in large scale intelligent inference.
Transliteration as constrained optimization
In Proc. EMNLP, 2008
Cited by 14 (3 self)
This paper introduces a new method for identifying named-entity (NE) transliterations in bilingual corpora. Recent works have shown the advantage of discriminative approaches to transliteration: given two strings (w_s, w_t) in the source and target language, a classifier is trained to determine if w_t is the transliteration of w_s. This paper shows that the transliteration problem can be formulated as a constrained optimization problem and thus take into account contextual dependencies and constraints among character bi-grams in the two strings. We further explore several methods for learning the objective function of the optimization problem and show the advantage of learning it discriminatively. Our experiments show that the new framework results in over 50% improvement in translating English NEs to Hebrew.
Gene recognition based on DAG shortest paths
2001
Cited by 9 (3 self)
We describe DAGGER, an ab initio gene recognition program which combines the output of high dimensional signal sensors in an intuitive gene model based on directed acyclic graphs. In the first stage, candidate start, donor, acceptor, and stop sites are scored using the SNoW learning architecture. These sites are then used to generate a directed acyclic graph in which each source-sink path represents a possible gene structure. Training sequences are used to optimize an edge weighting function so that the shortest source-sink path maximizes exon-level prediction accuracy. Experimental evaluation of prediction accuracy on two benchmark data sets demonstrates that DAGGER is competitive with ab initio gene finding programs based on Hidden Markov Models. Contact: jsc@ocf.berkeley.edu
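The shortest source-sink path computation mentioned above can be sketched with the standard topological-order relaxation for directed acyclic graphs. This is a generic illustration, not DAGGER's code: the node names, edge weights, and function name are hypothetical, and the learned edge weighting function is replaced by hand-picked numbers.

```python
from collections import defaultdict, deque


def dag_shortest_path(edges, source, sink):
    """Return (cost, path) of the cheapest source-to-sink path in a DAG.

    edges -- iterable of (u, v, weight) triples; weights may be negative.
    Returns (inf, []) when the sink cannot be reached.
    """
    graph = defaultdict(list)
    indegree = defaultdict(int)
    nodes = {source, sink}
    for u, v, w in edges:
        graph[u].append((v, w))
        indegree[v] += 1
        nodes.update((u, v))

    # Kahn's algorithm yields a topological order and detects cycles.
    queue = deque(n for n in nodes if indegree[n] == 0)
    order = []
    while queue:
        u = queue.popleft()
        order.append(u)
        for v, _ in graph[u]:
            indegree[v] -= 1
            if indegree[v] == 0:
                queue.append(v)
    if len(order) != len(nodes):
        raise ValueError("graph contains a cycle")

    # Relax outgoing edges once per node, in topological order.
    dist = {n: float("inf") for n in nodes}
    dist[source] = 0.0
    previous = {}
    for u in order:
        if dist[u] == float("inf"):
            continue
        for v, w in graph[u]:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
                previous[v] = u

    if dist[sink] == float("inf"):
        return float("inf"), []
    path = [sink]
    while path[-1] != source:
        path.append(previous[path[-1]])
    path.reverse()
    return dist[sink], path


if __name__ == "__main__":
    # Hypothetical candidate sites and weights, for illustration only.
    toy_edges = [
        ("source", "start", 1.0),
        ("start", "donor", 2.0),
        ("start", "stop", 6.0),
        ("donor", "acceptor", 1.5),
        ("acceptor", "stop", 1.0),
        ("stop", "sink", 0.5),
    ]
    print(dag_shortest_path(toy_edges, "source", "sink"))
```

Because every node is processed after all of its predecessors, each edge is relaxed exactly once, so the running time is linear in the number of nodes and edges.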
Generating Confusion Sets for Context-Sensitive Error Correction
Cited by 8 (2 self)
In this paper, we consider the problem of generating candidate corrections for the task of correcting errors in text. We focus on the task of correcting errors in preposition usage made by non-native English speakers, using discriminative classifiers. The standard approach to the problem assumes that the set of candidate corrections for a preposition consists of all preposition choices participating in the task. We determine likely preposition confusions using an annotated corpus of non-native text and use this knowledge to produce smaller sets of candidates. We propose several methods of restricting candidate sets. These methods exclude candidate prepositions that are not observed as valid corrections in the annotated corpus and take into account the likelihood of each preposition confusion in the non-native text. We find that restricting candidates to those that are observed in the non-native data improves both the precision and the recall compared to the approach that views all prepositions as possible candidates. Furthermore, the approach that takes into account the likelihood of each preposition confusion is shown to be the most effective.
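One simple way to picture the restricted candidate sets is to count, for each preposition a writer used, which prepositions annotators marked as correct. The sketch below is only an assumption about how such sets could be built: the function name, the relative-frequency cutoff `min_prob`, and the toy annotations are hypothetical, and the paper's actual methods may differ.

```python
from collections import Counter, defaultdict


def build_confusion_sets(annotated_pairs, min_prob=0.0):
    """Map each written preposition to its observed corrections.

    annotated_pairs -- iterable of (written, correct) prepositions taken from
                       an annotated learner corpus (correct usages included)
    min_prob        -- hypothetical cutoff: drop corrections whose relative
                       frequency for that written preposition is below it
    Returns {written: {candidate: relative frequency}}.
    """
    counts = defaultdict(Counter)
    for written, correct in annotated_pairs:
        counts[written][correct] += 1

    confusion_sets = {}
    for written, counter in counts.items():
        total = sum(counter.values())
        confusion_sets[written] = {
            candidate: n / total
            for candidate, n in counter.items()
            if n / total >= min_prob
        }
    return confusion_sets


if __name__ == "__main__":
    # Made-up annotations: (what the writer used, what the annotator marked).
    pairs = [("in", "in"), ("in", "in"), ("in", "on"), ("in", "at"),
             ("on", "on"), ("on", "in")]
    print(build_confusion_sets(pairs))
    print(build_confusion_sets(pairs, min_prob=0.3))
```

With `min_prob=0.0` the sets contain every correction observed in the corpus; raising the cutoff keeps only the more likely confusions, which is one way the likelihood information could be used.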

