An Investigation into the Validity of Some Metrics for Automatically Evaluating Natural Language Generation Systems
Cited by 9 (2 self)
There is growing interest in using automatically computed corpus-based evaluation metrics to evaluate Natural Language Generation (NLG) systems, because these are often considerably cheaper than the human-based evaluations which have traditionally been used in NLG. We review previous work on NLG evaluation and on validation of automatic metrics in NLP, and then present the results of two studies of how well some metrics which are popular in other areas of NLP (notably BLEU and ROUGE) correlate with human judgments in the domain of computer-generated weather forecasts. Our results suggest that, at least in this domain, metrics may provide a useful measure of language quality, although the evidence for this is not as strong as we would ideally like to see; however, they do not provide a useful measure of content quality. We also discuss a number of caveats which must be kept in mind when interpreting this and other validation studies.
Selecting the Content of Textual Descriptions of Geographically Located Events in Spatio-Temporal Weather Data
Cited by 6 (4 self)
In several domains, spatio-temporal data consisting of references to both space and time are collected in large volumes. Textual summaries of spatio-temporal data will complement the map displays used in Geographical Information Systems (GIS) to present data to decision makers. In the RoadSafe project we are working on developing Natural Language Generation (NLG) techniques to generate textual summaries of spatio-temporal numerical weather prediction data. Our approach exploits existing video processing techniques to analyse spatio-temporal weather prediction data and uses Qualitative Spatial Reasoning (QSR) techniques to reason with geographical data in order to compute the required content (information) for generating descriptions of geographically located events. Our evaluation shows that our approach extracts information similar to that extracted by human experts.
From Data to Text in the Neonatal Intensive Care Unit: Using NLG Technology for Decision Support and Information Management
- AI Communications, 2009
Cited by 6 (5 self)
amounts of patient data in various formats, making efficient processing of information by medical professionals difficult. Moreover, different stakeholders in the neonatal scenario, which include parents as well as staff occupying different roles, have different information requirements. This paper describes recent and ongoing work on building systems that automatically generate textual summaries of neonatal data. Our evaluation results show that the technology is viable and comparable in its effectiveness for decision support to existing presentation modalities. We discuss the lessons learned so far, as well as the major challenges involved in extending current technology to deal with a broader range of data types, and to improve the textual output in the form of more coherent summaries.
Using Spatial Reference Frames to Generate Grounded Textual Summaries of Georeferenced Data
Cited by 5 (3 self)
Summarising georeferenced data (data that can be identified according to its location) in natural language is challenging because it requires linking events describing its non-geographic attributes to their underlying geography. This mapping is not straightforward, as often the only explicit geographic information such data contains is latitude and longitude. In this paper we present an approach to generating textual summaries of georeferenced data based on spatial reference frames. This approach has been implemented in a data-to-text system we have deployed in the weather forecasting domain.
Introducing shared task evaluation to NLG: The TUNA shared task evaluation challenges
- In Emiel Krahmer and Mariët Theune, editors, Empirical Methods in Natural Language Generation, volume 5790 of Lecture Notes in Artificial Intelligence (LNAI), 2010
Cited by 5 (1 self)
Shared Task Evaluation Challenges (STECs) have only recently begun in the field of NLG. The TUNA STECs, which focused on Referring Expression Generation (REG), have been part of this development since its inception. This chapter looks back on the experience of organising the three TUNA Challenges, which came to an end in 2009. While we discuss the role of the STECs in yielding a substantial body of research on the REG problem, which has opened new avenues for future research, our main focus is on the role of different evaluation methods in assessing the output quality of REG algorithms, and on the relationship between such methods.
What do you want to know? Investigating the information requirements of patient supporters
- In Proceedings of the Workshop on Personalisation for E-Health, held in conjunction with the 21st IEEE International Symposium on Computer-Based Medical Systems (CBMS-08), 2008
Cited by 5 (3 self)
There is a vast amount of data associated with any one patient. It is challenging for medical staff to understand all this data. It is even harder for a lay person, who may not even know what medical terms mean. The research project BabyTalk-Clan aims to create personalized summaries of data for a lay audience. It uses sensitive, highly-detailed clinical data relating to a patient. This includes medication given, test results, notes made by medical staff, and continuous physiological signals such as heart rate. We took a qualitative approach to knowledge acquisition for user requirements. Using interviews and a focus group within a Grounded Theory methodology, we established what information lay users want in these medical summaries, and the degree of summarization they require. Findings were cross-validated through a questionnaire.
How Much to Tell? Disseminating Affective Information across a Social Network.
Cited by 3 (2 self)
We are developing a computer system which provides information about babies in neonatal intensive care to family members and friends. A key question is how to personalize the content and complexity of this sensitive affective information appropriately for varied recipients. We describe a novel approach to modeling user requirements for this personalization that employs a simplified social network technique. Further refinements of the model to incorporate people’s information requirements and ability to cope with affective material are then discussed.
Towards a possibility-theoretic approach to uncertainty in medical data interpretation for text generation
- In Proceedings of the Workshop on Knowledge Representation for Healthcare (KR4HC-2009), 2009
"... medical data interpretation for text generation ..."
Generating and Validating Abstracts of Meeting Conversations: a User Study
Cited by 3 (1 self)
In this paper we present a complete system for automatically generating natural language abstracts of meeting conversations. This system comprises components relating to interpretation of the meeting documents according to a meeting ontology, transformation or content selection from that source representation to a summary representation, and generation of new summary text. In a formative user study, we compare this approach to gold-standard human abstracts and extracts to gauge the usefulness of the different summary types for browsing meeting conversations. We find that our automatically generated summaries are ranked significantly higher than human-selected extracts on coherence and usability criteria. More generally, users demonstrate a strong preference for abstract-style summaries over extracts.
A Comparison of Hedged and Non-hedged NLG Texts
Cited by 2 (2 self)
We assess the use of hedge phrases in “affective” NLG texts. A simple experiment suggests non-native speakers prefer texts that contain hedge phrases, but native speakers prefer texts that do not contain hedge phrases.

