Results 1 - 10 of 29
An Empirically-Based System for Processing Definite Descriptions, 2000
Cited by 49 (11 self)
In this paper, we present an implemented system for processing definite descriptions ...
Affiliation: Universidade do Vale do Rio dos Sinos - UNISINOS, Av. Unisinos 950 - Cx. Postal 275, 93022-000
Applying Machine Learning for High Performance Named-Entity Extraction, 1999
Cited by 25 (0 self)
This paper describes a machine learning approach to build an efficient, accurate and fast name spotting system. Finding names in free text is an important task in addressing real-world text-based applications. Most previous approaches have been based on carefully hand-crafted modules encoding linguistic knowledge specific to the language and document genre. Such approaches have two drawbacks: they require large amounts of time and linguistic expertise to develop, and they are not easily portable to new languages and genres. This paper describes an extensible system which automatically combines weak evidence for name extraction. This evidence is gathered from easily available sources: part-of-speech tagging, dictionary lookups, and textual information such as capitalization and punctuation. Individually, each piece of evidence is insufficient for robust name detection. However, the combination of evidence, through standard machine learning techniques, yields a system that achieves performance equivalent to the best existing hand-crafted approaches.
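
The idea of combining individually weak evidence can be illustrated with a minimal sketch. The abstract does not say which learning technique the authors used, so the logistic-regression classifier below is a stand-in, and the gazetteer `KNOWN_NAMES`, the feature set, and the toy training sentences are all hypothetical. Part-of-speech evidence is omitted to keep the sketch self-contained.

```python
import math

# Hypothetical gazetteer standing in for the "dictionary lookups" evidence.
KNOWN_NAMES = {"john", "mary", "london"}


def features(tokens, i):
    """Weak evidence for token i: capitalization, dictionary, punctuation context."""
    tok = tokens[i]
    prev = tokens[i - 1] if i > 0 else "."
    return [
        1.0,                                         # bias term
        1.0 if tok[:1].isupper() else 0.0,           # token is capitalized
        1.0 if tok.lower() in KNOWN_NAMES else 0.0,  # token found in the gazetteer
        1.0 if prev in {".", "!", "?"} else 0.0,     # sentence-initial position
    ]


def predict(weights, x):
    """Logistic model: probability that the token is (part of) a name."""
    z = sum(w * v for w, v in zip(weights, x))
    return 1.0 / (1.0 + math.exp(-z))


def train(examples, epochs=200, lr=0.5):
    """Plain stochastic gradient ascent on the log-likelihood."""
    weights = [0.0] * 4
    for _ in range(epochs):
        for x, y in examples:
            p = predict(weights, x)
            weights = [w + lr * (y - p) * v for w, v in zip(weights, x)]
    return weights


def is_name(tokens, i, weights):
    return predict(weights, features(tokens, i)) > 0.5


# Toy labelled data (1 = name token); illustration only.
TOKENS = ["The", "bank", "hired", "John", ".", "Mary", "visited",
          "London", ".", "Profits", "rose", "."]
LABELS = [0, 0, 0, 1, 0, 1, 0, 1, 0, 0, 0, 0]

EXAMPLES = [(features(TOKENS, i), y) for i, y in enumerate(LABELS)]
WEIGHTS = train(EXAMPLES)
```

In this toy setting capitalization alone cannot separate "The" from "John"; the classifier only succeeds by weighing it together with the dictionary and sentence-position evidence, which is the point the abstract makes.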
Evaluation of an algorithm for the recognition and classification of proper names
In Proceedings of the 16th International Conference on Computational Linguistics (COLING ’96), 1996
Cited by 23 (5 self)
We describe an information extraction system in which four classes of naming expressions (organisation, person, location and time names) are recognised and classified with nearly 92% combined precision and recall. The system applies a mixture of techniques to perform this task and these are described in detail. We have quantitatively evaluated the system against a blind test set of Wall Street Journal business articles and report results not only for the system as a whole, but for each component technique and for each class of name. These results show that in order to have high recall, the system needs to make use not only of information internal to the naming expression but also information from outside the name. They also show that the contribution of each system component varies from one class of name expression to another.
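
As a hedged illustration of how a single "combined precision and recall" figure can be computed, the sketch below assumes the abstract means the balanced F-measure commonly used in MUC-style evaluations; the abstract itself does not give the formula. The function names and the counts in the usage note are hypothetical.

```python
def precision_recall(correct, proposed, gold):
    """Precision = correct / proposed; recall = correct / gold."""
    precision = correct / proposed if proposed else 0.0
    recall = correct / gold if gold else 0.0
    return precision, recall


def f_measure(precision, recall, beta=1.0):
    """Weighted harmonic mean of precision and recall (beta=1 weights them equally)."""
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1.0 + b2) * precision * recall / (b2 * precision + recall)
```

For example, a system that proposes 10 names, 8 of them correct, against 16 names in the answer key has precision 0.8 and recall 0.5; `f_measure(0.8, 0.5)` then combines the two into one score that is lower than their arithmetic mean, because the harmonic mean penalizes imbalance.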
Automatic template creation for information extraction, 1998
Cited by 21 (0 self)
Information Extraction (IE) approaches currently assume that a template exists which sufficiently defines the requirements of the task. Substantial human effort is required to generate these basic templates and to provide a development corpus. In the two principal IE competitions, the Message Understanding Conference (MUC) and Tipster, the templates were constructed directly from the experience of analysts. This manual approach cannot always be assumed. This proposal concerns the automatic construction of MUC-style templates, substantially reducing the human effort required. The approach will carry out a corpus-based analysis of task-relevant documents, identifying and analysing the interaction between the fundamental elements. A resource which defines semantic relationships will be necessary to identify and categorise these fundamental elements. This application is of particular interest to researchers in the field of IE and automatic abstracting.
An ascription-based approach to Speech Acts, 1996
Cited by 16 (5 self)
The two principal areas of natural language processing research in pragmatics are belief modelling and speech act processing. Belief modelling is the development of techniques to represent the mental attitudes of a dialogue participant. The latter approach, speech act processing, based on speech act theory, involves viewing dialogue in planning terms. Utterances in a dialogue are modelled as steps in a plan where understanding an utterance involves deriving the complete plan a speaker is attempting to achieve. However, previous speech act based approaches have been limited by a reliance upon relatively simplistic belief modelling techniques and their relationship to planning and plan recognition. In particular, such techniques assume precomputed nested belief structures. In this paper, we will present an approach to speech act processing based on novel belief modelling techniques where nested beliefs are propagated on demand.
Coupling Information Retrieval and Information Extraction: A New Text Technology for Gathering Information from the Web
In Proceedings of the 5th Computer-Assisted Information Searching on Internet Conference (RIAO'97), 1997
Cited by 15 (3 self)
The techniques of information retrieval and information extraction are complementary, but to date there has been little concrete work aimed at integrating the two. We describe how each of these techniques contributes to the process of transferring information from generator to user, summarise the issues which must be addressed if they are to work together, and report the results of some preliminary experiments on coupling them which indicate that these technologies can be jointly used to construct a structured data resource from free text on the WWW.
Improving Machine Translation Quality with Automatic Named Entity Recognition
In Proceedings of the 7th International EAMT workshop on MT and ..., 2003
Cited by 10 (2 self)
Named entities create serious problems for state-of-the-art commercial machine translation (MT) systems and often cause translation failures beyond the local context, affecting both the overall morphosyntactic well-formedness of sentences and word sense disambiguation in the source text. We report on the results of an experiment in which MT input was processed using output from the named entity recognition module of Sheffield's GATE information extraction (IE) system. The gain in MT quality indicates that specific components of IE technology could boost the performance of current MT systems.
Rule-Based Named Entity Recognition For Greek Financial Texts
In Proceedings of the Workshop on Computational Lexicography and Multimedia Dictionaries (COMLEX 2000), 2000
Cited by 7 (2 self)
The identification and classification of proper names (named entity recognition) is considered an important task in the area of Information Retrieval and Extraction. A typical named entity recognition (NER) system mainly consists of a lexicon and a grammar. When moving to a new domain, these lexical resources should be customised, either manually or by exploiting machine learning techniques. In this paper, we present a NER system based on hand-crafted lexical resources. The system is part of a Greek information extraction system and was tested on a Greek corpus of financial news with satisfactory results.
Keywords: information extraction, named entity recognition, pattern matching
1. INTRODUCTION
Information Extraction (IE) is the task of automatically extracting information of interest from unconstrained text, creating a structured representation of this information. An IE task involves two main sub-tasks: the recognition of the named entities involved in an event and the recognition o...
Definite Description Processing in Unrestricted Text, 1998
Cited by 5 (1 self)
Noun phrases with the definite article the, that we call DEFINITE DESCRIPTIONS, following (Russell, 1905), are one of the most common constructs in English, and have been extensively studied by linguists, philosophers, psychologists, and computational linguists. In this dissertation we present an implemented model of definite description processing that is based on extensive empirical studies of definite description use and whose performance can be quantitatively measured. In almost all approaches to discourse processing and discourse representation, definite descriptions have been regarded as anaphoric; and the models of definite description processing proposed in the literature tend to emphasise the role of common-sense inference mechanisms. Recent work on discourse interpretation (Carletta, 1996; Carletta et al., 1997; Walker and Moore, 1997) has claimed that the judgements on which a theory is based should be shared by more than one subject. On the basis of previous linguistic...

