A Formal Framework for Linguistic Annotation
- Speech Communication, 2000
Cited by 97 (18 self)
'Linguistic annotation' covers any descriptive or analytic notations applied to raw language data. The basic data may be in the form of time functions -- audio, video and/or physiological recordings -- or it may be textual. The added notations may include transcriptions of all sorts (from phonetic features to discourse structures), part-of-speech and sense tagging, syntactic analysis, 'named entity' identification, co-reference annotation, and so on. While there are several ongoing efforts to provide formats and tools for such annotations and to publish annotated linguistic databases, the lack of widely accepted standards is becoming a critical problem. Proposed standards, to the extent they exist, have focused on file formats. This paper focuses instead on the logical structure of linguistic annotations. We survey a wide variety of existing annotation formats and demonstrate a common conceptual core, the annotation graph. This provides a formal framework for constructing, mai...
Information Extraction: Beyond Document Retrieval
- Computational Linguistics and Chinese Language Processing, 1998
Cited by 48 (10 self)
In this paper we give a synoptic view of the growth of the text processing technology of information extraction (IE), whose function is to extract information about a pre-specified set of entities, relations or events from natural language texts and to record this information in structured representations called templates. Here we describe the nature of the IE task, review the history of the area from its origins in AI work in the 1960s and 70s until the present, discuss the techniques being used to carry out the task, describe application areas where IE systems are or are about to be at work, and conclude with a discussion of the challenges facing the area. What emerges is a picture of an exciting new text processing technology with a host of new applications, both on its own and in conjunction with other technologies, such as information retrieval, machine translation and data mining.
GATE: An Environment to Support Research and Development in Natural Language Engineering
- In Proceedings of the 8th IEEE International Conference on Tools with Artificial Intelligence, 1996
Cited by 40 (10 self)
We describe a software environment to support research and development in natural language (NL) engineering. This environment -- GATE (General Architecture for Text Engineering) -- aims to advance research in the area of machine processing of natural languages by providing a software infrastructure on top of which heterogeneous NL component modules may be evaluated and refined individually or may be combined into larger application systems. Thus, GATE aims to support both researchers and developers working on component technologies (e.g. parsing, tagging, morphological analysis) and those working on developing end-user applications (e.g. information extraction, text summarisation, document generation, machine translation, and second language learning). GATE will promote reuse of component technology, permit specialisation and collaboration in large-scale projects, and allow for the comparison and evaluation of alternative technologies. The first release of GATE is now available.
GATE: an Architecture for Development of Robust HLT Applications
- In Recent Advances in Language Processing, 2002
Cited by 27 (0 self)
In this paper we present GATE, a framework and graphical development environment which enables users to develop and deploy language engineering components and resources in a robust fashion. The GATE architecture has enabled us not only to develop a number of successful applications for various language processing tasks (such as Information Extraction), but also to build and annotate corpora and carry out evaluations on the applications generated. The framework can be used to develop applications and resources in multiple languages, based on its thorough Unicode support.
A Question Answering System Supported by Information Extraction
- In Proceedings of the 1st Meeting of the North American Chapter of the Association for Computational Linguistics (ANLP-NAACL-00), 2000
Cited by 26 (5 self)
This paper discusses an information extraction (IE) system, Textract, in natural language (NL) question answering (QA) and examines the role of IE in QA applications. It shows: (i) Named Entity tagging is an important component for QA, (ii) an NL shallow parser provides a structural basis for questions, and (iii) high-level domain-independent IE can result in a QA breakthrough.
Developing Language Processing Components with GATE
- 2002
Cited by 26 (5 self)
Work on GATE has been partly supported by EPSRC grants GR/K25267 (Large-Scale
Multidocument Summarization via Information Extraction
- In Proceedings of the HLT Conference, 2001
Cited by 24 (4 self)
We present and evaluate the initial version of RIPTIDES, a system that combines information extraction, extraction-based summarization, and natural language generation to support user-directed multidocument summarization.
Software Infrastructure for Natural Language Processing
- 1997
Cited by 22 (10 self)
We classify and review current approaches to software infrastructure for research, development and delivery of NLP systems. The task
Software Architecture for Language Engineering
- 2000
Cited by 21 (7 self)
This thesis defines the boundaries of Software Architecture for Language Engineering (SALE), an area formed by the intersection of human language computation and software engineering. SALE covers all areas of the provision of infrastructural systems to support research and development of language processing software. In order to demonstrate the theory developed in relation to SALE, we present the design, implementation and evaluation of GATE, a General Architecture for Text Engineering, which illustrates in practice many of the theoretical points made.
Coupling Information Retrieval and Information Extraction: A New Text Technology for Gathering Information from the Web
- In Proceedings of the 5th Computer-Assisted Information Searching on Internet Conference (RIAO'97), 1997
Cited by 15 (3 self)
The techniques of information retrieval and information extraction are complementary, but to date there has been little concrete work aimed at integrating the two. We describe how each of these techniques contributes to the process of transferring information from generator to user, summarise the issues which must be addressed if they are to work together, and report the results of some preliminary experiments on coupling them which indicate that these technologies can be jointly used to construct a structured data resource from free text on the WWW.

