Automated Discourse Generation Using Discourse Structure Relations
- Artificial Intelligence, 1993
Cited by 162 (1 self)
This paper summarizes work over the past five years on the automated planning and generation of multisentence texts using discourse structure relations, placing it in context of ongoing efforts by Computational Linguists and Linguists to understand the structure of discourse. Based on a series of studies by the author and others, the paper describes how the orientation of generation toward communicative intentions illuminates the central structural role played by intersegment discourse relations. It outlines several facets of discourse structure relations as they are required by and used in text planners: their nature, number, and extension to associated tasks such as sentence planning and text formatting. In Artificial Intelligence 63, Special Issue on Natural Language Processing, 1993. This work was partially supported by the Rome Air Development Center under RADC contract FQ7619-8903326 -0001. 1 Introduction. Every day, people produce thousands of words of connected...
Extracting paraphrases from a parallel corpus
- In Proc. of the ACL/EACL, 2001
Cited by 152 (4 self)
While paraphrasing is critical both for interpretation and generation of natural language, current systems use manual or semi-automatic methods to collect paraphrases. We present an unsupervised learning algorithm for identification of paraphrases from a corpus of multiple English translations of the same source text. Our approach yields phrasal and single word lexical paraphrases as well as syntactic paraphrases.
Revision-Based Generation of Natural Language Summaries Providing Historical Background -- Corpus-Based Analysis, Design, Implementation and Evaluation
- 1994
Cited by 100 (6 self)
Automatically summarizing vast amounts of on-line quantitative data with a short natural language paragraph has a wide range of real-world applications. However, this specific task raises a number of difficult issues that are quite distinct from the generic task of language generation: conciseness, complex sentences, floating concepts, historical background, paraphrasing power and implicit content. In this thesis, I address these specific issues by proposing a new generation model in which a first pass builds a draft containing only the essential new facts to report and a second pass incrementally revises this draft to opportunistically add as many background facts as can fit within the space limit. This model requires a new type of linguistic knowledge: revision operations, which specify the various ways a draft can...
Has a Consensus NL Generation Architecture Appeared, and is it Psycholinguistically Plausible?
- 1994
Cited by 93 (1 self)
I survey some recent applications-oriented NL generation systems, and claim that despite very different theoretical backgrounds, these systems have a remarkably similar architecture in terms of the modules they divide the generation process into, the computations these modules perform, and the way the modules interact with each other. I also compare this 'consensus architecture' among applied NLG systems with psycholinguistic knowledge about how humans speak, and argue that at least some aspects of the consensus architecture seem to be in agreement with what is known about human language production, despite the fact that psycholinguistic plausibility was not in general a goal of the developers of the surveyed systems.
Annotating expressions of opinions and emotions in language
- Language Resources and Evaluation (formerly Computers and the Humanities), 2005
Cited by 90 (13 self)
This paper describes a corpus annotation project to study issues in the manual annotation of opinions, emotions, sentiments, speculations, evaluations and other private states in language. The resulting corpus annotation scheme is described, as well as examples of its use. In addition, the manual annotation process and the results of an inter-annotator agreement study on a 10,000-sentence corpus of articles drawn from the world press are presented.
A Generative Constituent-Context Model for Improved Grammar Induction
- 2002
Cited by 72 (3 self)
We present a generative distributional model for the unsupervised induction of natural language syntax which explicitly models constituent yields and contexts.
An Overview of SURGE: a Reusable Comprehensive Syntactic Realization Component
- 1996
Cited by 71 (8 self)
This paper describes SURGE, a syntactic realization front-end for natural language generation systems. By gradually integrating complementary aspects of various linguistic theories within the computational framework of functional unification, SURGE has evolved to be one of the most comprehensive grammars of English for language generation available today. It has been successfully re-used in a variety of generators, with very different architectures and application domains. 1 Introduction. This paper is an overview of SURGE (Systemic Unification Realization Grammar of English), a syntactic realization front-end for natural language generation systems. Developed over the last seven years, it embeds one of the most comprehensive computational grammars of English for generation available to date. It has been successfully re-used in eight generators that have little in common in terms of architecture. It has also been used for teaching natural language generation at several academic inst...
Designing Statistical Language Learners: Experiments on Noun Compounds
- 1995
Cited by 65 (0 self)
Statistical language learning research takes the view that many traditional natural language processing tasks can be solved by training probabilistic models of language on a sufficient volume of training data. The design of statistical language learners therefore involves answering two questions: (i) Which of the multitude of possible language models will most accurately reflect the properties necessary to a given task? (ii) What will constitute a sufficient volume of training data? Regarding the first question, though a variety of successful models have been discovered, the space of possible designs remains largely unexplored. Regarding the second, exploration of the design space has so far proceeded without an adequate answer. The goal of this thesis is to advance the exploration of the statistical language learning design space. In pursuit of that goal, the thesis makes two main theoretical contributions: it identifies a new class of designs by providing a novel theory of statistical natural language processing, and it presents the foundations for a predictive theory of data requirements to assist in future design explorations. The first of these contributions is called the meaning distributions theory. This theory

