Analyses for Elucidating Current Question Answering Technology
, 2001
Cited by 46 (1 self)
In this paper, we take a detailed look at the performance of components of an idealized question answering system on two different tasks: the TREC Question Answering task and a set of reading comprehension exams. We carry out three types of analysis: inherent properties of the data, feature analysis, and performance bounds. Based on these analyses we explain some of the performance results of the current generation of Q/A systems and make predictions on future work. In particular, we present four findings: (1) Q/A system performance is correlated with answer repetition, (2) relative overlap scores are more effective than absolute overlap scores, (3) equivalence classes on scoring functions can be used to quantify performance bounds, and (4) perfect answer typing still leaves a great deal of ambiguity for a Q/A system because sentences often contain several items of the same type.
Formal Method Integration via Heterogeneous Notations
, 1997
Cited by 14 (9 self)
Method integration is the procedure of combining multiple methods to form a new technique. In the context of software engineering, this can involve combining specification techniques, rules and guidelines for design and implementation, and sequences of steps for managing an entire development. In current practice, method integration is often an ad-hoc process, where links between methods are defined on a case-by-case basis. In this dissertation, we examine an approach to formal method integration based on so-called heterogeneous notations: compositions of compatible notations. We set up a basis that can be used to formally define the meaning of compositions of formal and semiformal notations. Then, we examine how this basis can be used in combining methods used for system specification, design, and implementation. We demonst...
An Investigation into the Validity of Some Metrics for Automatically Evaluating Natural Language Generation Systems
Cited by 9 (2 self)
There is growing interest in using automatically computed corpus-based evaluation metrics to evaluate Natural Language Generation (NLG) systems, because these are often considerably cheaper than the human-based evaluations which have traditionally been used in NLG. We review previous work on NLG evaluation and on validation of automatic metrics in NLP, and then present the results of two studies of how well some metrics which are popular in other areas of NLP (notably BLEU and ROUGE) correlate with human judgments in the domain of computer-generated weather forecasts. Our results suggest that, at least in this domain, metrics may provide a useful measure of language quality, although the evidence for this is not as strong as we would ideally like to see; however, they do not provide a useful measure of content quality. We also discuss a number of caveats which must be kept in mind when interpreting this and other validation studies.
A corpus analysis of discourse relations for Natural Language Generation
- PROCEEDINGS OF CORPUS LINGUISTICS
, 2003
Cited by 8 (3 self)
We are developing a Natural Language Generation (NLG) system that generates texts tailored for the reading ability of individual readers. As part of building the system, GIRL (Generator for Individual Reading Levels), we carried out an analysis of the RST Discourse Treebank Corpus to find out how human writers linguistically realise discourse relations. The goal of the analysis was (a) to create a model of the choices that need to be made when realising discourse relations, and (b) to understand how these choices were typically made for "normal" readers, for a variety of discourse relations. We present our results for discourse relations: concession, condition, elaboration-additional, evaluation, example, reason and restatement. We discuss the results and how they were used in GIRL.
The Principles of Readability
- Costa Mesa, CA: Impact Information
, 2004
Cited by 7 (0 self)
The principles of readability are in every style manual. Readability formulas are in every word processor. What is missing is the research and theory on which they stand.
An evaluation of procedural instructional text
- In Proceedings of the International Natural Language Generation Conference
, 2002
Cited by 6 (2 self)
This paper presents an evaluation of the instructional text generated by Isolde, an authoring tool for technical writers that automates the production of procedural on-line help. The evaluation compares the effectiveness of the instructional text produced by Isolde with that of professionally authored instructions, such as MS Word Help. The results suggest that the documentation produced by Isolde is of comparable quality to similar texts found in commercial manuals.
Experiments With Discourse-Level Choices and Readability
- In Proceedings of the European Natural Language Generation Workshop (ENLG), 11th Conference of the European Chapter of the Association for Computational Linguistics (EACL’03)
, 2003
Cited by 3 (1 self)
This paper reports on pilot experiments that are being used, together with corpus analysis, in the development of a Natural Language Generation (NLG) system, GIRL (Generator for Individual Reading Levels). GIRL generates reports for individuals after a literacy assessment.
Consumers in the Financial Services Sector
, 1998
"... Research Papers Prepared for the Task Force on the Future ..."
WPI-CS-TR-03-26 July 2003
this document is organized as follows: Section 2 provides background on search engines and collaborative filtering; Section 3 describes details on the design and implementation of the Foible system; Section 4 describes the user study and performance measures we use to evaluate the benefits of a search engine with personalization; Section 5 analyzes the results from the user study; Section 6 summarizes our conclusions; and Section 7 presents possible future work.

