MaltParser: A language-independent system for data-driven dependency parsing
- In Proc. of the Fourth Workshop on Treebanks and Linguistic Theories, 2005
Explorations in sentence fusion
- In Proceedings of the 10th European Workshop on Natural Language Generation, 2005
Abstract - Cited by 16 (1 self)
Sentence fusion is a text-to-text (revision-like) generation task that takes related sentences as input and merges them into a single output sentence. In this paper we describe our ongoing work on developing a sentence fusion module for Dutch. We propose a generalized version of alignment that not only indicates which words and phrases should be aligned but also labels them with a small set of primitive semantic relations, indicating how words and phrases from the two input sentences relate to each other. We show that human labelers can perform this task with high agreement (F-score of .95). We then describe and evaluate our adaptation of an existing automatic alignment algorithm, and use the resulting alignments, plus the semantic labels, in a generalized fusion and generation algorithm. A small-scale evaluation study reveals that most of the resulting sentences are adequate to good.
A corpus of Dutch aphasic speech: Sketching the design and performing a pilot study
Abstract - Cited by 1 (1 self)
In this thesis, a pilot study for the development of a corpus of Dutch aphasic speech (CoDAS) is presented. Given the lack of resources of this kind, not only for Dutch but also for other languages, CoDAS will be able to set standards and will contribute to future research in this area. A corpus of Dutch aphasic speech should fulfill at least three requirements. First, it should encode a plausible sample of contemporary Dutch as spoken by aphasic patients. That is, it should include speech representing different types of aphasia as well as various communication settings. Secondly, the speech fragments should be documented with the relevant metadata, which should include information about the speaker and the aphasia. Thirdly, the corpus should be enriched with various kinds of linguistic information. Given the special character of the speech contained in CoDAS, we cannot simply carry over the design and the annotation protocols of existing corpora, such as SDC or CHILDES. However, they have been taken as a starting point. In our pilot study, we have established the basic requirements with respect to text types, metadata, and annotation levels that CoDAS should fulfill. In this respect, we have investigated whether and how the procedures and protocols for

