A more precise analysis of punctuation for broad-coverage surface realization with CCG
- In Proc. of the Workshop on Grammar Engineering Across Frameworks (GEAF08), 2008
Cited by 5 (4 self)
This paper describes a more precise analysis of punctuation for a bi-directional, broad-coverage English grammar extracted from the CCGbank (Hockenmaier and Steedman, 2007). We discuss various approaches which have been proposed in the literature to constrain overgeneration with punctuation, and illustrate how aspects of Briscoe’s (1994) influential approach, which relies on syntactic features to constrain the appearance of balanced and unbalanced commas and dashes to appropriate sentential contexts, are unattractive for CCG. As an interim solution to constrain overgeneration, we propose a rule-based filter which bars illicit sequences of punctuation and cases of improperly unbalanced apposition. Using the OpenCCG toolkit, we demonstrate that our punctuation-augmented grammar yields substantial increases in surface realization coverage and quality, helping to achieve state-of-the-art BLEU scores.
Extraction of Entailed Semantic Relations Through Syntax-based Comma Resolution
Cited by 3 (0 self)
This paper studies textual inference by investigating comma structures, which are highly frequent elements whose major role in the extraction of semantic relations has not hitherto been recognized. We introduce the problem of comma resolution, defined as understanding the role of commas and extracting the relations they imply. We show the importance of the problem using examples from Textual Entailment tasks, and present A Sentence Transformation Rule Learner (ASTRL), a machine learning algorithm that uses a syntactic analysis of the sentence to learn sentence transformation rules that can then be used to extract relations. We have manually annotated a corpus identifying comma structures and the relations they entail, and experimented with both gold-standard parses and parses created by a leading statistical parser, obtaining F-scores of 80.2% and 70.4% respectively.
Syntactically-Informed Models for . . .
Providing punctuation in speech transcripts not only improves readability, but also helps downstream text processing such as information extraction or machine translation. In this paper, we improve the accuracy of comma prediction in English broadcast news by 7% by introducing syntactic features inspired by the role of commas as described in linguistic studies. We analyze the impact of those features on other subsets of features (prosody, words, etc.) when combined through CRFs. The syntactic cues can help characterize large syntactic patterns, such as appositions and lists, which are not necessarily marked by prosody.

