Learning Document-Level Semantic Properties from Free-text Annotations
Cited by 18 (2 self)
This paper demonstrates a new method for leveraging unstructured annotations to infer semantic document properties. We consider the domain of product reviews, which are often annotated by their authors with free-text keyphrases, such as “a real bargain” or “good value.” We leverage these unstructured annotations by clustering them into semantic properties, and then tying the induced clusters to hidden topics in the document text. This allows us to predict relevant properties of unannotated documents. Our approach is implemented in a hierarchical Bayesian model with joint inference, which increases the robustness of the keyphrase clustering and encourages document topics to correlate with semantically meaningful properties. We perform several evaluations of our model, and find that it substantially outperforms alternative approaches.
New Functions for Unsupervised Asymmetrical Paraphrase Detection
2007
Cited by 3 (2 self)
Monolingual text-to-text generation is an emerging research area
Biology Based Alignments of Paraphrases for Sentence Compression
Cited by 3 (3 self)
In this paper, we present a study for extracting and aligning paraphrases in the context of Sentence Compression. First, we justify the application of a new measure for the automatic extraction of paraphrase corpora. Second, we discuss the work of Barzilay & Lee (2003), who use clustering of paraphrases to induce rewriting rules. We will see, through classical visualization methodologies (Kruskal & Wish, 1977) and exhaustive experiments, that clustering may not be the best approach for automatic pattern identification. Finally, we will provide some results of different biology-based methodologies for pairwise paraphrase alignment.
Dynamic Pooling and Unfolding Recursive Autoencoders for Paraphrase Detection
Cited by 3 (0 self)
Paraphrase detection is the task of examining two sentences and determining whether they have the same meaning. In order to obtain high accuracy on this task, thorough syntactic and semantic analysis of the two statements is needed. We introduce a method for paraphrase detection based on recursive autoencoders (RAE). Our unsupervised RAEs are based on a novel unfolding objective and learn feature vectors for phrases in syntactic trees. These features are used to measure the word- and phrase-wise similarity between two sentences. Since sentences may be of arbitrary length, the resulting matrix of similarity measures is of variable size. We introduce a novel dynamic pooling layer which computes a fixed-sized representation from the variable-sized matrices. The pooled representation is then used as input to a classifier. Our method outperforms other state-of-the-art approaches on the challenging MSRP paraphrase corpus.
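The pooling step mentioned in this abstract can be illustrated with a minimal sketch. It assumes the variable-sized similarity matrix is divided into a fixed grid of roughly equal regions, and that each region is reduced to one value. The names `dynamic_pool` and `pool_fn`, the default of `min`, and the repetition of rows and columns for small matrices are illustrative assumptions, not details given in the abstract.

```python
from typing import Callable, List, Sequence

Matrix = List[List[float]]


def _bounds(length: int, parts: int) -> List[int]:
    # Integer boundaries that split range(length) into `parts` non-empty
    # chunks; this requires length >= parts.
    return [i * length // parts for i in range(parts + 1)]


def _upsample(matrix: Matrix, size: int) -> Matrix:
    # Assumption: when a dimension is smaller than the grid, repeat its
    # rows/columns until it has at least `size` entries.
    row_rep = -(-size // len(matrix))      # ceiling division
    col_rep = -(-size // len(matrix[0]))
    return [
        [value for value in row for _ in range(col_rep)]
        for row in matrix
        for _ in range(row_rep)
    ]


def dynamic_pool(
    matrix: Matrix,
    size: int,
    pool_fn: Callable[[Sequence[float]], float] = min,
) -> Matrix:
    """Reduce a rows x cols matrix to a fixed size x size matrix."""
    matrix = _upsample(matrix, size)
    row_b = _bounds(len(matrix), size)
    col_b = _bounds(len(matrix[0]), size)
    return [
        [
            pool_fn([
                matrix[i][j]
                for i in range(row_b[a], row_b[a + 1])
                for j in range(col_b[b], col_b[b + 1])
            ])
            for b in range(size)
        ]
        for a in range(size)
    ]


if __name__ == "__main__":
    scores = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]
    print(dynamic_pool(scores, 2))
```

Whatever the lengths of the two sentences, the output always has `size` rows and `size` columns, which is what lets a fixed-input classifier consume it. Whether `min` or `max` is the right reduction depends on whether the matrix holds distances or similarities.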
Unsupervised Learning of Paraphrases
In Research in Computer Science. National Polytechnic Institute, Mexico. ISSN, 2007
Cited by 1 (1 self)
Paraphrasing constitutes a cornerstone in many Natural Language Processing fields like monolingual text-to-text generation and automatic text summarization. Indeed, aligned monolingual corpora are likely to boost the learning process of text-to-text generation models. A paraphrase learning strategy can be defined as a two-step process: (1) identifying and extracting related sentence pairs from on-line comparable corpora (for example, sentences that convey the same information yet are written in different forms) and (2) applying learning methodologies over the extracted material to induce text-to-text rewriting rules. In this paper, we compare different lexical distance metrics for the identification of related sentences, i.e. paraphrase candidates. In particular, we discuss how different metrics lead to the identification of different types of paraphrases. Finally, the comparisons and discussions give relevant insights towards automatic generation of paraphrase corpora.
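Step (1) of the process described in this abstract, identifying related sentence pairs with a lexical metric, can be sketched as follows. The abstract does not name the metrics it compares, so the Jaccard word overlap used here is only a stand-in. The function names and the `low`/`high` thresholds are hypothetical.

```python
import re
from typing import List, Set, Tuple


def tokens(sentence: str) -> Set[str]:
    # Lowercased set of alphanumeric word tokens.
    return set(re.findall(r"[a-z0-9]+", sentence.lower()))


def word_overlap(a: str, b: str) -> float:
    # Jaccard overlap: shared words divided by all distinct words.
    ta, tb = tokens(a), tokens(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def candidate_pairs(
    sentences: List[str], low: float = 0.3, high: float = 0.9
) -> List[Tuple[str, str, float]]:
    # Keep pairs that share enough words to be related (>= low) but are
    # not near-duplicates (<= high). Both thresholds are assumptions.
    pairs = []
    for i, a in enumerate(sentences):
        for b in sentences[i + 1:]:
            score = word_overlap(a, b)
            if low <= score <= high:
                pairs.append((a, b, score))
    return pairs


if __name__ == "__main__":
    corpus = ["The cat sat", "the cat slept", "Stocks fell sharply"]
    for a, b, score in candidate_pairs(corpus):
        print(f"{score:.2f}  {a!r} <-> {b!r}")
```

Changing the metric or the thresholds changes which pairs survive, which is the kind of effect the abstract says it compares across metrics.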
Evaluating automatic extraction of rules for sentence plan construction
Cited by 1 (0 self)
The freely available SPaRKy sentence planner uses hand-written weighted rules for sentence plan construction, and a user- or domain-specific second-stage ranker for sentence plan selection. However, coming up with sentence plan construction rules for a new domain can be difficult. In this paper, we automatically extract sentence plan construction rules from the RST-DT corpus. In our rules, we use only domain-independent features that are available to a sentence planner at runtime. We evaluate these rules, and outline ways in which they can be used for sentence planning. We have integrated them into a revised version of SPaRKy.
A sentence generator for Dutch
The paper presents an efficient, wide-coverage sentence generator for Dutch, which employs the Alpino grammar and lexicon. This generator consists of a chart-based sentence realizer that builds grammatical sentences for a given abstract dependency structure, and a maximum-entropy fluency ranker which selects the most fluent sentence from a set of candidate sentences for a given dependency structure. The coverage, speed and accuracy of the generator are evaluated on several corpora.
Learning to Fuse Disparate Sentences
We present a system for fusing sentences which are drawn from the same source document but have different content. Unlike previous work, our approach is supervised, training on real-world examples of sentences fused by professional journalists in the process of editing news articles. Like Filippova and Strube
Using dependency-based . . .
2006
As research in text-to-text paraphrase generation progresses, it has the potential to improve the quality of generated text. However, the use of paraphrase generation methods creates a secondary problem. We must ensure that generated novel sentences are not inconsistent with the text from which they were generated. We propose that a machine learning approach be used to filter out inconsistent novel sentences, or False Paraphrases. To train such a filter, we use the Microsoft Research Paraphrase corpus and investigate whether features based on syntactic dependencies can aid us in this task. Like Finch et al. (2005), we obtain a classification accuracy of 75.6%, the best known performance for this corpus. We also examine the strengths and weaknesses of dependency-based features and conclude that they may be useful in more accurately classifying cases of False Paraphrase.
A Joint Phrasal and Dependency Model for Paraphrase Alignment
Monolingual alignment is frequently required for natural language tasks that involve similar or comparable sentences. We present a new model for monolingual alignment in which the score of an alignment decomposes over both the set of aligned phrases as well as a set of aligned dependency arcs. Optimal alignments under this scoring function are decoded using integer linear programming while model parameters are learned using standard structured prediction approaches. We evaluate our joint aligner on the Edinburgh paraphrase corpus and show significant gains over a Meteor baseline and a state-of-the-art phrase-based aligner.

