Semi-supervised semantic role labeling
- In Proceedings of EACL, 2009
Cited by 4 (0 self)
Large-scale annotated corpora are a prerequisite for developing high-performance semantic role labeling systems. Unfortunately, such corpora are expensive to produce, limited in size, and may not be representative. Our work aims to reduce the annotation effort involved in creating resources for semantic role labeling via semi-supervised learning. Our algorithm augments a small number of manually labeled instances with unlabeled examples whose roles are inferred automatically via annotation projection. We formulate the projection task as a generalization of the linear assignment problem. We seek to find a role assignment in the unlabeled data such that the argument similarity between the labeled and unlabeled instances is maximized. Experimental results on semantic role labeling show that the automatic annotations produced by our method improve performance over using hand-labeled instances alone.
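As a rough illustration of the assignment formulation mentioned in this abstract, the sketch below solves a plain (not generalized) linear assignment problem by brute force. The function name, the similarity matrix, and its values are hypothetical and are not taken from the paper; the authors' actual formulation and similarity measure may differ.

```python
from itertools import permutations

def best_role_assignment(similarity):
    """Return (score, assignment) maximizing total similarity.

    similarity[i][j] is a hypothetical similarity between labeled
    argument i and unlabeled candidate j. assignment[i] is the index of
    the candidate chosen for labeled argument i. Each candidate is used
    at most once. Brute force, so only suitable for tiny inputs, and it
    assumes there are at least as many candidates as labeled arguments.
    """
    n_labeled = len(similarity)
    n_candidates = len(similarity[0])
    best_score, best = float("-inf"), None
    for perm in permutations(range(n_candidates), n_labeled):
        score = sum(similarity[i][j] for i, j in enumerate(perm))
        if score > best_score:
            best_score, best = score, perm
    return best_score, best

# Made-up similarities: 2 labeled arguments, 3 unlabeled candidates.
sim = [
    [0.9, 0.2, 0.1],
    [0.3, 0.8, 0.4],
]
score, assignment = best_role_assignment(sim)
print(assignment, round(score, 2))
```

For realistic sizes, a polynomial-time solver such as the Hungarian algorithm would replace the permutation loop.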
A Study on Dependency Tree Kernels for Automatic Extraction of Protein-Protein Interaction
Cited by 4 (3 self)
Kernel methods are considered the most effective techniques for various relation extraction (RE) tasks as they provide higher accuracy than other approaches. In this paper, we introduce new dependency tree (DT) kernels for RE by improving on previously proposed dependency tree structures. These are further enhanced to design more effective approaches that we call mildly extended dependency tree (MEDT) kernels. The empirical results on the protein-protein interaction (PPI) extraction task on the AIMed corpus show that tree kernels based on our proposed DT structures achieve higher accuracy than previously proposed DT and phrase structure tree (PST) kernels.
Improved morpho-phonological sequence processing with constraint satisfaction inference
Cited by 3 (0 self)
In performing morpho-phonological sequence processing tasks, such as letter-phoneme conversion or morphological analysis, it is typically not enough to base the output sequence on local decisions that map local-context input windows to single output tokens. We present a global sequence-processing method that repairs inconsistent local decisions. The approach is based on local predictions of overlapping trigrams of output tokens, which open up a space of possible sequences; a data-driven constraint satisfaction inference step then searches for the optimal output sequence. We demonstrate significant improvements in terms of word accuracy on English and Dutch letter-phoneme conversion and morphological segmentation, and we provide qualitative analyses of error types prevented by the constraint satisfaction inference method.
Randomization Tests for Relational Learning
, 2003
Cited by 3 (3 self)
Algorithms for relational learning and propositional learning face different statistical challenges.
Tuning Syntactically Enhanced Word Alignment for Statistical Machine Translation
Cited by 3 (1 self)
We introduce a syntactically enhanced word alignment model that is more flexible than state-of-the-art generative word alignment models and can be tuned according to different end tasks. First, this model takes advantage of both unsupervised and supervised word alignment approaches by obtaining anchor alignments from unsupervised generative models and seeding the anchor alignments into a supervised discriminative model. Second, this model offers the flexibility of tuning the alignment according to different optimisation criteria. Our experiments show that using our word alignment in a Phrase-Based Statistical Machine Translation system yields a 5.38% relative increase in BLEU score on the IWSLT 2007 task.
Weighted Kernel Functions for SVM Learning in String Domains: A Distance Function Viewpoint
- In Proceedings of ICMLC (International Conference on Machine Learning and Cybernetics), 2005
Cited by 2 (2 self)
This paper extends the idea of weighted distance functions to kernels and support vector machines. Here, we focus on applications that rely on sliding a window over a sequence of string data. For this type of problem it is argued that a symbolic, context-based representation of the data should be preferred over a continuous, real format as this is a much more intuitive setting for working with (weighted) distance functions. It is shown how a weighted string distance can be decomposed and subsequently used in different kernel functions and how these kernel functions correspond to inner products between real vectors. As a case study, named entity recognition is used with information gain ratio as a weighting scheme.
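To make the link between a weighted string distance and an inner product more concrete, here is a minimal sketch over fixed-size windows of symbols. It assumes a simple per-position weighted mismatch distance; the function names, the example window, and the weights are hypothetical (the abstract mentions information gain ratio as the weighting scheme, but the numbers below are invented), and the paper's own kernels may be defined differently.

```python
import math

def weighted_distance(x, y, weights):
    """Sum of the weights at positions where the two windows differ."""
    return sum(w for a, b, w in zip(x, y, weights) if a != b)

def overlap_kernel(x, y, weights):
    """Sum of the weights at positions where the two windows agree."""
    return sum(w for a, b, w in zip(x, y, weights) if a == b)

def rbf_style_kernel(x, y, weights, gamma=1.0):
    """One possible kernel built on the distance: exp(-gamma * d)."""
    return math.exp(-gamma * weighted_distance(x, y, weights))

def one_hot(x, weights, alphabet):
    """Real-valued encoding: one block per position, scaled by sqrt(weight)."""
    vec = []
    for symbol, w in zip(x, weights):
        vec.extend(math.sqrt(w) if symbol == s else 0.0 for s in alphabet)
    return vec

# Hypothetical 3-token windows and per-position weights.
x = ("the", "president", "said")
y = ("the", "minister", "said")
weights = (0.2, 0.5, 0.3)
alphabet = sorted(set(x) | set(y))

# The overlap kernel equals the inner product of the scaled one-hot vectors.
dot = sum(a * b for a, b in zip(one_hot(x, weights, alphabet),
                                one_hot(y, weights, alphabet)))
print(weighted_distance(x, y, weights), overlap_kernel(x, y, weights), dot)
```

In this sketch the distance and the overlap kernel are complementary: they always add up to the total weight of the window.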
Consistent Translation using Discriminative Learning: A Translation Memory-inspired Approach
Cited by 2 (2 self)
We present a discriminative learning method to improve the consistency of translations in phrase-based Statistical Machine Translation (SMT) systems. Our method is inspired by Translation Memory (TM) systems, which are widely used by human translators in industrial settings. We constrain the translation of an input sentence using the most similar ‘translation example’ retrieved from the TM. Unlike previous research, which used simple fuzzy match thresholds, these constraints are imposed using discriminative learning to optimise the translation performance. We observe that this method benefits the SMT system by producing not only consistent translations but also improved translation outputs. We report a 0.9 point improvement in terms of BLEU score on English–Chinese technical documents.
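The retrieval step described in this abstract, finding the most similar translation example in a TM, can be sketched with a word-level edit distance. This is only an illustration of a common fuzzy match score, not the paper's method: the function names, the tiny memory, and the placeholder target strings are all hypothetical, and the discriminative learning component is not shown.

```python
def edit_distance(a, b):
    """Levenshtein distance between two token lists."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        curr = [i]
        for j, y in enumerate(b, 1):
            curr.append(min(prev[j] + 1,              # deletion
                            curr[j - 1] + 1,          # insertion
                            prev[j - 1] + (x != y)))  # substitution
        prev = curr
    return prev[-1]

def fuzzy_match_score(source, candidate):
    """1 minus the word-level edit distance normalized by the longer length."""
    s, c = source.split(), candidate.split()
    longest = max(len(s), len(c))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(s, c) / longest

def most_similar_example(source, memory):
    """Return the (source, target) pair whose source side best matches."""
    return max(memory, key=lambda pair: fuzzy_match_score(source, pair[0]))

# Hypothetical translation memory with placeholder target sides.
memory = [
    ("click the save button", "<target A>"),
    ("open the file menu", "<target B>"),
]
query = "click the cancel button"
best = most_similar_example(query, memory)
best_score = fuzzy_match_score(query, best[0])
print(best, best_score)
```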
BioMed Central
, 2006
Cited by 2 (2 self)
A novel approach to phylogenetic tree construction using stochastic optimization and clustering
Bilingually Motivated Domain-Adapted Word Segmentation for Statistical Machine Translation
Cited by 1 (1 self)
We introduce a word segmentation approach to languages where word boundaries are not orthographically marked, with application to Phrase-Based Statistical Machine Translation (PB-SMT). Instead of using manually segmented monolingual domain-specific corpora to train segmenters, we make use of bilingual corpora and statistical word alignment techniques. First of all, our approach is adapted for the specific translation task at hand by taking the corresponding source (target) language into account. Secondly, this approach does not rely on manually segmented training data so that it can be automatically adapted for different domains. We evaluate the performance of our segmentation approach on PB-SMT tasks from two domains and demonstrate that our approach scores consistently among the best results across different data conditions.

