Results 1 - 6 of 6
Towards Understanding Spontaneous Speech: Word Accuracy Vs. Concept Accuracy
- In Proceedings of the Fourth International Conference on Spoken Language Processing (ICSLP 96), 1996
Cited by 52 (6 self)
In this paper we describe an approach to automatic evaluation of both the speech recognition and understanding capabilities of a spoken dialogue system for train time table information. We use word accuracy for recognition and concept accuracy for understanding performance judgement. Both measures are calculated by comparing these modules' output with a correct reference answer. We report evaluation results for a spontaneous speech corpus with about 10000 utterances. We observed a nearly linear relationship between word accuracy and concept accuracy.
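The abstract above says that both measures are computed by comparing module output with a correct reference. As a minimal sketch of the recognition side only, assuming the common definition of word accuracy as (N - S - D - I) / N over a minimum edit-distance alignment (the paper's exact scoring procedure is not given here), the calculation could look like this. The function name `word_accuracy` and the example utterances are illustrative, not taken from the paper.

```python
def word_accuracy(reference: str, hypothesis: str) -> float:
    """Word accuracy = (N - errors) / N, where N is the number of reference
    words and errors is the word-level edit distance (substitutions,
    deletions, insertions). Assumed standard definition, not the paper's code."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # Dynamic-programming edit distance, keeping only the previous row.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr

    errors = prev[-1]
    return (len(ref) - errors) / len(ref)


if __name__ == "__main__":
    # Hypothetical utterance pair: one deleted word, one substituted word.
    print(word_accuracy("i want a train to hamburg", "i want train to homburg"))
```

Because insertions count as errors, this value can drop below zero for very noisy hypotheses. The same alignment could in principle be applied to sequences of concept labels instead of words, but that is an assumption about how concept accuracy might be scored, not a description of the paper's method.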
New Methods, Current Trends and Software Infrastructure for NLP
- Bilkent University, Turkey, 1996
Cited by 13 (5 self)
The increasing use of 'new methods' in NLP, which this conference series exemplifies, occurs in the context of a wider shift in the nature and concerns of the discipline. This paper begins with a short review of this context and significant trends in the field. The review motivates and leads to a set of requirements for support software of general utility for NLP research and development workers. A freely-available system designed to meet these requirements is described (called GATE - a General Architecture for Text Engineering). Information Extraction (IE), in the sense defined by the Message Understanding Conferences (ARPA [2]), is an NLP application in which many of the new methods have found a home (Hobbs [18]; Jacobs ed. [19]). An IE system based on GATE is also available for research purposes, and this is described. Lastly we review related work.
1 Introduction
The central theme of this paper is support software (or software infrastructure) for NLP research and development (R&...
The GRACE French Part-of-Speech Tagging Evaluation Task
- Proceedings of the First International Conference on Language Resources and Evaluation (LREC), 1998
Cited by 9 (3 self)
The GRACE evaluation program aims at applying the Evaluation Paradigm to the evaluation of Part-of-Speech taggers for French. An interesting by-product of GRACE is the production of validated language resources necessary for the evaluation. After briefly recalling the origins and the nature of the Evaluation Paradigm, we show how it relates to other national and international initiatives. We then present the now-ending GRACE evaluation campaign and describe its four main components (corpus building, tagging procedure, lexicon building, evaluation procedure), as well as its internal organization.
1. The Evaluation Paradigm
The Evaluation Paradigm has been proposed as a means to foster development in research and technology in the field of language engineering. Up to now, it has been mostly used in the United States in the framework of the ARPA and NIST projects on automatic processing of spoken and written language. The paradigm is based on a two-step process: first, create textual...
An Evaluation of LOLITA and Related Natural Language Processing Systems
- 1998
Cited by 7 (3 self)
Paul Callaghan. Submitted to the University of Durham for the degree of Ph.D., August 1997.
This research addresses the question, "how do we evaluate systems like LOLITA?" LOLITA is the Natural Language Processing (NLP) system under development at the University of Durham. It is intended as a platform for building NL applications. We are therefore interested in questions of evaluation for such general NLP systems. The thesis has two parts.
Evaluation of Document Retrieval Systems and Query Difficulty
Cited by 1 (0 self)
There exist several document retrieval (DR) evaluation frameworks. Two of them (TREC and Amaryllis) are presented in this paper. But evaluation of DR systems is a very difficult task, because it has to deal with relevance, which is not a clear notion, and because we can question whether the results obtained during the campaigns generalize to bigger or different collections. Moreover, systems are mostly evaluated according to their recall/precision results, whereas many other features can be evaluated (speed, presentation of the results, user satisfaction, etc.). Finally, we think it is important to evaluate query difficulty (both for user/system interaction and for system evaluation). In the last part of this paper, we report some preliminary results on query evaluation.
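The recall/precision results mentioned in the abstract above are usually computed per query from the set of retrieved documents and the set of documents judged relevant. The following is a minimal sketch under that standard set-based definition; the function name `precision_recall` and the document identifiers are hypothetical, and the scoring used in TREC or Amaryllis campaigns may differ in detail (for example, ranked measures).

```python
def precision_recall(retrieved, relevant):
    """Set-based precision and recall for a single query.

    precision = |retrieved AND relevant| / |retrieved|
    recall    = |retrieved AND relevant| / |relevant|
    Returns 0.0 for a measure whose denominator is empty (a convention
    assumed here, not one stated in the paper).
    """
    retrieved_set = set(retrieved)
    relevant_set = set(relevant)
    hits = len(retrieved_set & relevant_set)
    precision = hits / len(retrieved_set) if retrieved_set else 0.0
    recall = hits / len(relevant_set) if relevant_set else 0.0
    return precision, recall


if __name__ == "__main__":
    # Hypothetical query: 4 documents returned, 3 judged relevant, 2 in common.
    p, r = precision_recall(["d1", "d2", "d3", "d4"], ["d2", "d4", "d7"])
    print(f"precision={p:.2f} recall={r:.2f}")
```

Both numbers depend entirely on the relevance judgments, which is why the abstract's point about relevance being an unclear notion matters for interpreting them.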
Evaluation of Document Retrieval Systems
One of the fundamental characteristics of scientific work is measurement. Moreover, it is economically fundamental because design, implementation and maintenance are very expensive, particularly for Document Retrieval

