Automatic Detection of Text Genre, 1997
Cited by 112 (0 self)
As the text databases available to users become larger and more heterogeneous, genre becomes increasingly important for computational linguistics as a complement to topical and structural principles of classification. We propose a th
Stochastic Text Generation
Cited by 11 (0 self)
This paper suggests why the statistical revolution has left Natural Language Generation (NLG) largely untouched, but identifies areas where interacting constraints are hard to model by traditional methods, and where probabilistic models have something to offer. An interesting discourse phenomenon is discussed, to show how it might be dealt with using one of the handful of existing statistical approaches to NLG, and to identify difficulties with their evaluation functions. The paper then argues that the maximum entropy framework offers a better approach, and suggests how it could be applied to problems in generation, using two heuristics to help an evaluator exploit features which would improve the quality of output language.
Question answering techniques for the world wide web - In Tutorial presentation at EACL, 2003
Cited by 4 (0 self)
Question answering systems have become increasingly popular because they deliver users short, succinct answers instead of overloading them with a large number of irrelevant documents. The vast amount of information readily available on the World Wide Web presents new opportunities and challenges for question answering. In order for question answering systems to benefit from this vast store of useful knowledge, they must cope with large volumes of useless data. Many characteristics of the World Wide Web distinguish Web-based question answering from question answering on closed corpora such as newspaper texts. The Web is vastly larger in size and boasts incredible “data redundancy,” which renders it amenable to statistical techniques for answer extraction. A data-driven approach can yield high levels of performance and nicely complements traditional question answering techniques driven by information extraction. In addition to enormous amounts of unstructured text, the Web also contains pockets of structured and semistructured knowledge that can serve as a valuable resource for question answering. By organizing these resources and annotating them with natural language, we can successfully incorporate Web knowledge into question answering systems. This tutorial surveys recent Web-based question answering technology, focusing on two separate paradigms: knowledge mining using statistical tools and knowledge annotation using database concepts. Both approaches can employ a wide spectrum of techniques ranging in linguistic sophistication from simple “bag-of-words” treatments to full syntactic parsing.
Syntactic form and discourse function in natural language generation, 2003
Cited by 3 (2 self)
To all writers of unfinished, unread, and unstarted dissertations everywhere. Acknowledgements: Many thanks to my advisors and my committee: Ellen Prince, to whom I and this project owe an immeasurable intellectual debt; Aravind Joshi, whose subtle yet firm nudgings in the right direction make him an invaluable mentor; Robin Clark, for comments, skepticism, and high scientific standards; and Matthew Stone, for useful and prompt feedback and encouragement at all points during this project and for more than once lending an organization to my thoughts which was sorely needed. Numerous others have been kind enough to discuss the many issues, problems, and questions I have run into during the course of this work. Their comments have enriched the content and scope of this dissertation. In particular, thanks go to Bonnie Webber, Ivana
E-mail and word processing in the ESL classroom: how the medium affects the message - Language Learning & Technology, 2001
Cited by 2 (0 self)
Computer-based media place new demands on language which can promote variations in language use (cf. Halliday, 1990). Electronic mail has assumed functions and formal features associated with spoken language as well as formal writing (Davis & Brewer, 1997; Maynor, 1994; Murray, 1996). This has implications for language instructors: if electronic mail does engender features of both written and spoken language, it is questionable whether electronic mail writing will improve academic writing abilities. The present study attempts to provide insights into this issue. Non-native students in an intermediate pre-academic ESL course responded to writing prompts using electronic mail and word processing. Their writing was examined for (1) differences in use of cohesive features (Halliday, 1967; Halliday & Hasan, 1976), (2) length of text produced in each medium, and (3) differences in text-initial contextualization. Results indicate no obvious differences between students' electronic mail and word-processed writing. However, the electronic mail texts were significantly shorter than the word-processed texts, and text-initial contextualization was more prominent in the word-processed than in the electronic mail texts. The findings raise the question of whether electronic mail benefits students in terms of academic writing development.
Genre distinctions for Discourse in the Penn TreeBank
Cited by 2 (0 self)
Articles in the Penn TreeBank were identified as being reviews, summaries, letters to the editor, news reportage, corrections, wit and short verse, or quarterly profit reports. All but the last three were then characterised in terms of features manually annotated in the Penn Discourse TreeBank: discourse connectives and their senses. Summaries turned out to display very different discourse features than the other three genres. Letters also appeared to have some different features. The two main findings involve (1) differences between genres in the senses associated with intra-sentential discourse connectives, inter-sentential discourse connectives and inter-sentential discourse relations that are not lexically marked; and (2) differences within all four genres between the senses of discourse relations not lexically marked and those that are marked. The first finding means that genre should be made a factor in automated sense labelling of non-lexically marked discourse relations. The second means that lexically marked relations provide a poor model for automated sense labelling of relations that are not lexically marked.
Chapter ADVANCES IN COMPUTERS 10/6/2003
Cognitive Hacking. In this chapter, we define and propose countermeasures for a category of computer security exploits which we call "cognitive hacking." Cognitive hacking refers to a computer or information system attack that relies on changing human users' perceptions and corresponding behaviors in order to be successful. This is in contrast to denial of service (DOS) and other kinds of well-known attacks that operate solely within the computer and network infrastructure. Examples are given of several cognitive hacking techniques, and a taxonomy for these types of attacks is developed. Legal, economic, and
Introduction Film in interlanguage pragmatics research
Inter-language pragmatics (ILP) research is still in its infancy, particularly where research methodology is concerned. The overwhelming
Substitution Of Paraverbal and Nonverbal . . .
For a long time, dialogue analysis has separated spoken language from written language. This dichotomy is valid in many cases as it is mostly parallel to the distinction of synchronous and asynchronous communication with the associated affinity to graphic and acoustic media, respectively. During the

