A Semantics of Contrast and Information Structure for Specifying Intonation in Spoken Language Generation
, 1996
Practical Issues in Automatic Documentation Generation
, 1994
Cited by 58 (10 self)
PLANDoc, a system under joint development by Columbia and Bellcore, documents the activity of planning engineers as they study telephone routes. It takes as input a trace of the engineer's interaction with a network planning tool and produces a 1-2 page summary. In this paper, we describe the user needs analysis we performed and how it influenced the development of PLANDoc. In particular, we show how it pinpointed the need for a sub-language specification, allowing us to identify input messages and to characterize the different sentence paraphrases for realizing them. We focus on the systematic use of conjunction in combination with paraphrase that we developed for PLANDoc, which allows for the generation of summaries that are both concise, avoiding repetition of similar information, and fluent, avoiding repetition of similar phrasing.
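The idea of using conjunction to avoid repeating similar information can be illustrated with a minimal sketch. This is not PLANDoc's actual algorithm or message format: the `(action, object, location)` tuples, the `conjoin` and `aggregate` helpers, and the sentence template are all hypothetical, chosen only to show how messages that share content might be merged into one conjoined sentence. Paraphrase selection is not modeled here.

```python
from collections import defaultdict


def conjoin(items):
    """Join phrases with commas and a final 'and' (hypothetical helper)."""
    if len(items) == 1:
        return items[0]
    if len(items) == 2:
        return f"{items[0]} and {items[1]}"
    return ", ".join(items[:-1]) + f", and {items[-1]}"


def aggregate(messages):
    """Merge messages sharing an action and object, conjoining their locations."""
    groups = defaultdict(list)
    for action, obj, location in messages:
        # Insertion order is preserved, so sentences follow first-mention order.
        groups[(action, obj)].append(location)
    return [
        f"The plan {action} {obj} in {conjoin(locations)}."
        for (action, obj), locations in groups.items()
    ]


# Invented input messages, for illustration only.
messages = [
    ("adds", "fiber cable", "section 1"),
    ("adds", "fiber cable", "section 2"),
    ("removes", "copper cable", "section 3"),
]
summary = aggregate(messages)
for sentence in summary:
    print(sentence)
```

Under these assumptions, the two "adds" messages collapse into a single sentence with a conjoined location phrase, while the unrelated message keeps its own sentence.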
Empirically Designing and Evaluating a New Revision-Based Model for Summary Generation
- Artificial Intelligence
, 1996
Cited by 47 (5 self)
In this paper, we present a system for summarizing quantitative data in natural language, focusing on how we used a corpus of basketball game summaries, drawn from online news services, to empirically shape the system design and to evaluate our approach. Our initial study of this newswire summary corpus revealed characteristics of summary texts that are not handled by existing generation systems:
-- Sentence complexity: Sentences are quite complex, ranging from 23 to 46 words in length and typically conveying from 4 to 12 simple facts in a single sentence.
-- Floating concepts: While some concepts consistently appear in fixed locations across reports, others float, appearing potentially anywhere in the report structure. Floating concepts appear to be opportunistically realized where the form of the surrounding text allows.
-- Paraphrasing power: Since concisely conveying floating facts requires opportunistically adding them where the surrounding text allows, a single fact type is typically expressed by a wide variety of linguistic forms, each suitable in different textual contexts.
-- Historical background: Summaries contain background facts to illustrate how newly reported facts relate to previous events, thus highlighting their significance.
-- Conciseness: Summaries must convey as much information as possible in limited space. To do so, facts are concisely expressed by small phrases, sometimes even a single word, woven into the remainder of the report.
An example news summary illustrating each of these issues is given in Fig. 1 p. 6. We present a system streak, which embodies a new generation model in order to produce summaries with these specific characteristics. streak
An Information Structural Approach to Spoken Language Generation
, 1996
Cited by 23 (2 self)
This paper presents an architecture for the generation of spoken monologues with contextually appropriate intonation. A two-tiered information structure representation is used in the high-level content planning and sentence planning stages of generation to produce efficient, coherent speech that makes certain discourse relationships, such as explicit contrasts, appropriately salient. The system is able to produce appropriate intonational patterns that cannot be generated by other systems which rely solely on word class and given/new distinctions.
Architectures for natural language generation: Problems and perspectives
- In Trends in Natural Language Generation: An Artificial Intelligence Perspective
, 1996
Cited by 22 (0 self)
Current research in natural language generation is situated in a computational linguistics tradition that was founded several decades ago. We critically analyse some of the architectural assumptions underlying existing systems and point out some problems in the domains of text planning and lexicalization. Guided by the identification of major generation challenges viewed from the angles of knowledge-based systems and cognitive psychology, we sketch some new directions for future research.
Generating Summaries of Work Flow Diagrams
, 1996
Cited by 14 (6 self)
FLOWDOC is a prototype text generator that summarizes information from work flow graphs in a business re-engineering context. A richer ontology than is typically used allows generalization of input data during content selection, and combination of data during sentence planning.
Keywords: Summarization; Data combining; Content planning; Text generation; Industrial applications.
1 Introduction
FLOWDOC is a prototype application of natural language generation technology to business re-engineering (Rummler & Brache 90). It provides automatic documentation within a software-aided business re-engineering environment under construction at Bellcore. Its input comes from SHOWBIZ (Wittenburg 96), a GUI used in this environment to represent as work flow diagrams both the Present Mode of Operation (PMO) of a working group and the Future Modes of Operation (FMOs) suggested by re-engineering consultants. FLOWDOC summarizes the key properties of a SHOWBIZ work flow in a few natural language sent...
Corpus Analysis for Revision-Based Generation of Complex Sentences
- In Proceedings of the 11th National Conference on Artificial Intelligence
, 1993
Cited by 13 (4 self)
The complex sentences of newswire reports contain floating content units that appear to be opportunistically placed where the form of the surrounding text allows. We present a corpus analysis that identified precise semantic and syntactic constraints on where and how such information is realized. The result is a set of revision tools that form the rule base for a report generation system, allowing incremental generation of complex sentences.
Introduction
Generating reports that summarize quantitative data raises several challenges for language generation systems. First, sentences in such reports are very complex (e.g., in newswire basketball game summaries the lead sentence ranges from 21 to 46 words in length). Second, while some content units consistently appear in fixed locations across reports (e.g., game results are always conveyed in the lead sentence), others float, appearing anywhere in a report and at different linguistic ranks within a given sentence. Floating content uni...
Evaluating the Portability of Revision Rules for Incremental Summary Generation
- In Proceedings of the 34th Annual Meeting of the Association for Computational Linguistics
, 1996
Cited by 4 (1 self)
This paper presents a quantitative evaluation of the portability to the stock market domain of the revision rule hierarchy used by the system STREAK to incrementally generate newswire sports summaries.
Automatic Generation and Revision of Natural Language Report Summaries Providing Historical Background
- In Proceedings of the 11th Brazilian Symposium on Artificial Intelligence
, 1994
Cited by 3 (0 self)
Summarization applications raise several challenging issues for language generation systems. To address them, I propose a new generation model where an initial draft conveying only the essential information is incrementally revised to include additional background information. The generator streak, implementing this model, relies on revision operations specifying the various ways a draft can be transformed in order to concisely accommodate a new piece of information. These operations are, for the most part, domain-independent.
1 Introduction
In recent years, the volume of information available on-line has grown exponentially, and several large-scale efforts, such as the information superhighway and digital library initiatives in the US, are currently under way to further accelerate this growth. In order to put this abundance of electronic data to good use, and avoid information inundation, the development of automatic summarization facilities is critical. On-line information comes i...
Toward a Morphosyntactic User Model for Language Analysis and Generation: A PhD Proposal
, 1999
Cited by 3 (0 self)
This proposal paper is being presented in partial fulfillment of the Ph.D. requirements of the Department of Computer and Information Sciences at the University of Delaware. In this paper, I discuss a user modeling architecture for ICICLE, a natural language system intended for use as a writing tutor for deaf learners of written English. This proposed design, intended to model dynamic aspects of a learner over the passage of time, the acquisition of new knowledge, and multiple sessions with the system, includes components to track the history of interaction with a given user as well as a very complex, dynamic model of user interlanguage grammar and domain knowledge. It has been based on research in language acquisition and in the acquisition of cognitive skills. The focus of the work described in this proposal is the development of the model of interlanguage status, which will be used in the analysis of user language production and in the generation of user-tailored explanations. Conte...

