Extracting relations with integrated information using kernel methods
- In Proceedings of the annual meeting of ACL, 2005
Cited by 39 (4 self)
Entity relation detection is a form of information extraction that finds predefined relations between pairs of entities in text. This paper describes a relation detection approach that combines clues from different levels of syntactic processing using kernel methods. Information from three different levels of processing is considered: tokenization, sentence parsing and deep dependency analysis. Each source of information is represented by kernel functions. Then composite kernels are developed to integrate and extend individual kernels so that processing errors occurring at one level can be overcome by information from other levels. We present an evaluation of these methods on the 2004 ACE relation detection task, using Support Vector Machines, and show that each level of syntactic processing contributes useful information for this task. When evaluated on the official test data, our approach produced very competitive ACE value scores. We also compare the SVM with KNN on different kernels.
The NomBank Project: An Interim Report
- In Proceedings of the NAACL/HLT Workshop on Frontiers in Corpus Annotation, 2004
Cited by 36 (1 self)
This paper describes NomBank, a project that will provide argument structure for instances of common nouns in the Penn Treebank II corpus. NomBank is part of a larger effort to add additional layers of annotation to the Penn Treebank II corpus. The University of Pennsylvania’s PropBank, NomBank and other annotation projects taken together should lead to the creation of better tools for the automatic analysis of text. This paper describes the NomBank project in detail including its specifications and the process involved in creating the resource.
The CoNLL-2008 Shared Task on Joint Parsing of Syntactic and Semantic Dependencies
Cited by 29 (0 self)
The Conference on Computational Natural Language Learning is accompanied every year by a shared task whose purpose is to promote natural language processing applications and evaluate them in a standard setting. In 2008 the shared task was dedicated to the joint parsing of syntactic and semantic dependencies. This shared task not only unifies the shared tasks of the previous four years under a unique dependency-based formalism, but also extends them significantly: this year’s syntactic dependencies include more information such as named-entity boundaries; the semantic dependencies model roles of both verbal and nominal predicates. In this paper, we define the shared task and describe how the data sets were created. Furthermore, we report and analyze the results and describe the approaches of the participating systems.
The Cross-Breeding of Dictionaries
- In Proceedings of LREC-2004, 2004
Cited by 12 (4 self)
Especially for English, the number of hand-coded electronic resources available to the Natural Language Processing Community keeps growing: annotated corpora, treebanks, lexicons, wordnets, etc. Unfortunately, initial funding for such projects is much easier to obtain than the additional funding needed to enlarge or improve upon such resources. Thus once one proves the usefulness of a resource, it is difficult to make that resource reach its full potential. We discuss techniques for combining dictionary resources and producing others by semi-automatic means. The resources we created using these techniques have become an integral part of our work on NomBank, a project with the goal of annotating noun arguments in the Penn Treebank II corpus (PTB).
OntoNotes: A Unified Relational Semantic Representation
Cited by 11 (2 self)
The OntoNotes project is creating a corpus of large-scale, accurate, and integrated annotation of multiple levels of the shallow semantic structure in text. Such rich, integrated annotation covering many levels will allow for richer, cross-level models enabling significantly better automatic semantic analysis. At the same time, it demands a robust, efficient, scalable mechanism for storing and accessing these complex inter-dependent annotations. We describe a relational database representation that captures both the inter- and intra-layer dependencies and provide details of an object-oriented API for efficient, multi-tiered access to this data.
Parsing and GLARFing
Cited by 9 (7 self)
according to most parsing precision/recall measures. However, their level of detail is limited by the hand-annotated treebanks from which they derive their grammars. In contrast, some parsers based on hand-coded grammars rank lower on precision/recall measures while providing more detailed syntactic analyses, which greatly enhance NLP applications such as Information Extraction and Machine Translation. In this paper, we propose a strategy for adding the missing detail to treebank-based parser output without lowering the quality of the output.
Theory-supporting treebanks
- In Handbook of Corpus Linguistics. Walter de Gruyter, 2005
Cited by 5 (3 self)
The question of how treebank annotation schemes should be related to linguistic theories has been debated as long as treebanks have existed. Historically speaking, it is probably true to say that there has been a development from mostly theory-neutral annotation schemes to more theoretically oriented frameworks or even annotation
Discriminative Slot Detection Using Kernel Methods
- In Proceedings of the 20th International Conference on Computational Linguistics, 2004
Cited by 2 (2 self)
Most traditional information extraction approaches are generative models that assume events exist in text in certain patterns and these patterns can be regenerated in various ways. These assumptions limited the syntactic clues being considered for finding an event and confined these approaches to a particular syntactic level. This paper presents a discriminative framework based on kernel SVMs that takes into account different levels of syntactic information and automatically identifies the appropriate clues. Kernels are used to represent certain levels of syntactic structure and can be combined in principled ways as input for an SVM. We will show that by combining a low level sequence kernel with a high level kernel on a GLARF dependency graph, the new approach outperformed a good rule-based system on slot filler detection for MUC-6.
Information Extraction from Multiple Syntactic Sources
- 2004
Cited by 1 (0 self)
Dedicated to my mother. Acknowledgements: I would like to thank my advisor Ralph Grishman for his guidance in academics. He is a great researcher with keen interests in science, constant efforts in doing things by hand and great personality. He is the example I followed and will continue to follow. Without him this thesis would not be possible.
Unsupervised Relation Learning for Event-Focused Question-Answering and Domain Modelling
"It is a very sad thing that nowadays there is so little useless information." (Oscar Wilde) In this thesis, we investigate the problem of identifying, within a text, relations that capture information important for event-focused document collections. The presented solutions work with events of various granularity and we show how to use these relations to improve the performance of a number of natural language processing applications. For a set of related event-focused documents, we introduce a notion of a shallow semantic network based on the relations between the important elements discovered in these documents. This shallow semantic network captures the most important relations among the objects, people, and other elements that are involved in the events described in the input document collection. We present experimental evidence that such a relation-based representation of event-focused documents is superior to techniques that rely on term frequencies for the task of information selection.

