Tagging English Text with a Probabilistic Model
, 1994
Cited by 212 (0 self)
In this paper we present some experiments on the use of a probabilistic model to tag English text, i.e. to assign to each word the correct tag (part of speech) in the context of the sentence. The main novelty of these experiments is the use of untagged text in the training of the model. We have used a simple triclass Markov model and are looking for the best way to estimate the parameters of this model, depending on the kind and amount of training data provided. Two approaches in particular are compared and combined: using text that has been tagged by hand and computing relative frequency counts, and using text without tags and training the model as a hidden Markov process, according to a Maximum Likelihood principle.
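The first of the two approaches named in this abstract, relative frequency counts over hand-tagged text, can be illustrated with a minimal sketch. It is not the paper's implementation: the function name estimate_triclass, the padding symbol "<s>", the toy corpus and the DET/NOUN/VERB tagset are all assumptions made for illustration, and no smoothing is applied.

```python
from collections import Counter


def estimate_triclass(tagged_sentences):
    """Relative-frequency estimates from hand-tagged sentences.

    Returns (transition, emission), where
    transition[(t1, t2, t3)] approximates P(t3 | t1, t2) and
    emission[(tag, word)] approximates P(word | tag).
    """
    trigram_counts, context_counts = Counter(), Counter()
    emission_counts, tag_counts = Counter(), Counter()
    for sentence in tagged_sentences:
        # Pad with two hypothetical start symbols so the first real tag
        # also has a two-tag history.
        tags = ["<s>", "<s>"] + [tag for _, tag in sentence]
        for t1, t2, t3 in zip(tags, tags[1:], tags[2:]):
            trigram_counts[(t1, t2, t3)] += 1
            context_counts[(t1, t2)] += 1
        for word, tag in sentence:
            emission_counts[(tag, word)] += 1
            tag_counts[tag] += 1
    transition = {k: c / context_counts[k[:2]] for k, c in trigram_counts.items()}
    emission = {k: c / tag_counts[k[0]] for k, c in emission_counts.items()}
    return transition, emission


# Toy hand-tagged corpus (invented for this example).
corpus = [
    [("the", "DET"), ("dog", "NOUN"), ("barks", "VERB")],
    [("the", "DET"), ("cat", "NOUN"), ("sleeps", "VERB")],
    [("a", "DET"), ("dog", "NOUN"), ("sleeps", "VERB")],
]
transition, emission = estimate_triclass(corpus)
```

The second approach in the abstract, training on untagged text as a hidden Markov process, would re-estimate these same two tables iteratively rather than count them directly; that step is not shown here.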
Part-of-Speech Tagging and Partial Parsing
- Corpus-Based Methods in Language and Speech
, 1996
Cited by 85 (0 self)
m we can carve off next. `Partial parsing' is a cover term for a range of different techniques for recovering some but not all of the information contained in a traditional syntactic analysis. Partial parsing techniques, like tagging techniques, aim for reliability and robustness in the face of the vagaries of natural text, by sacrificing completeness of analysis and accepting a low but non-zero error rate. 1 Tagging The earliest taggers [35, 51] had large sets of hand-constructed rules for assigning tags on the basis of words' character patterns and on the basis of the tags assigned to preceding or following words, but they had only small lexica, primarily for exceptions to the rules. TAGGIT [35] was used to generate an initial tagging of the Brown corpus, which was then hand-edited. (Thus it provided the data that has since been used to train other taggers [20].) The tagger described by Garside [56, 34], CLAWS, was a probabilistic version of TAGGIT, and the DeRose tagger improved on
Connectionist, Statistical and Symbolic Approaches to Learning for Natural Language Processing
, 1996
Cited by 18 (10 self)
The purpose of this book is to present a collection of papers that represents a broad spectrum of current research in learning methods for natural language processing, and to advance the state of the art in language learning and artificial intelligence. The book should bridge a gap between several areas that are usually discussed separately, including connectionist, statistical, and symbolic methods. In order to bring together new and different language learning approaches, we held a workshop at the International Joint Conference on Artificial Intelligence in Montreal in August 1995. Paper contributions were selected and revised after having been reviewed by at least two members of the international program committee as well as additional reviewers. This book contains the revised workshop papers and additional papers by members of the program committee. In particular this book focuses on current issues such as: -- How can we apply existing learning methods to language processing? -- What new learning methods are needed for language processing and why? -- What language knowledge should be learned and why?
Combining Corpus and Machine-Readable Dictionary Data for Building Bilingual Lexicons
, 1996
Cited by 13 (0 self)
This paper describes and discusses some theoretical and practical problems arising from developing a system to combine the structured but incomplete information from machine readable dictionaries (MRDs) with the unstructured but more complete information available in corpora for the creation of a bilingual lexical data base, presenting a methodology to integrate information from both sources into a single lexical data structure. The BICORD system (BIlingual CORpus-enhanced Dictionaries) involves linking entries in Collins English-French and French-English bilingual dictionary with a large English-French and French-English bilingual corpus. We have concentrated on the class of action verbs of movement, building on earlier work on lexical correspondences specific to this verb class between languages (Klavans and Tzoukermann, 1989), (Klavans and Tzoukermann, 1990a), (Klavans and Tzoukermann, 1990b). We first examine the way prototypical verbs of movement are translated in the Collin...
Unsupervised Lexical Learning as Inductive Inference
, 2000
Cited by 8 (4 self)
To learn a language, the learners must first learn its words, the essential building blocks for utterances. The difficulty in learning words lies in the unavailability of explicit word boundaries in speech input. The learners have to infer lexical items with some innately endowed learning mechanism(s) for regularity detection: regularities in the speech normally indicate word patterns. With respect to Zipf's least-effort principle and Chomsky's thoughts on the minimality of grammar for human language, we hypothesise a cognitive mechanism underlying language learning that seeks the least-effort representation for input data. Accordingly, lexical learning is to infer the minimal-cost representation for the input under the constraint of permissible representation for lexical items. The main theme of this thesis is to examine how far this learning mechanism can go in unsupervised lexical learning from real language data without any pre-defined (e.g., prosodic and phonotactic) cues, but entirely resting on statistical induction of structural patterns for the most economic representation for the data. We first review
A Stochastic Model Of Intonation For Text-To-Speech Synthesis
- Proceedings Eurospeech '97 (Rhodes
, 1998
Cited by 5 (2 self)
This paper presents a stochastic model of intonation contours for use in text-to-speech synthesis. The model has two modules, a linguistic module that generates abstract prosodic labels from text, and a phonetic module that generates an F0 curve from the abstract prosodic labels. This model differs from previous work in the abstract prosodic labels used, which can be automatically derived from the training corpus. This feature makes it possible to use large corpora or several corpora of different speech styles, in addition to making it easy to adapt to new languages. [Footnote: This paper is based on a communication presented at Eurospeech'97 (Véronis et al. 1997) and has been recommended by the Editorial Board of Speech Communication.] The present paper focuses on the linguistic module, which does not require full syntactic analysis of the text but simply relies on part-of-speech tagging. The results were validated on French by means of a perception test. Listeners did not perceive a signif...
Visualisation of Long Distance Grammatical Collocation Patterns in Language
- In IV2001: 5th International Conference on Information Visualisation
, 2001
Cited by 4 (2 self)
Research in generic unsupervised learning of language structure applied to the Search for Extra-Terrestrial Intelligence (SETI) and decipherment of unknown languages has sought to build up a generic picture of lexical and structural patterns characteristic of natural language. As part of this toolkit a generic system is required to facilitate the analysis of behavioural trends amongst selected pairs of terminals and non-terminals alike, regardless of which target natural language was selected. Such a tool may be useful in other areas, such as lexico-grammatical analysis or tagging of corpora. Data-oriented approaches to corpus annotation use statistical n-grams and/or constraint-based models; n-grams or constraints with wider windows can improve error-rates, by examining the topology of the annotation-combination space. We present a visualisation tool to help linguists find "useful" PoS-tag combinations, and cohesion between linguistic annotations at other levels; and suggest some possible applications.
Automatic Acquisition of Noun and Verb Meanings
, 1995
Cited by 3 (0 self)
A robust Natural Language Processing (NLP) system must be able to automatically acquire the syntax and semantics of unknown words that it encounters during processing. It is inevitable that a real-world NLP system will encounter unknown words since the human lexicon continues to grow. This paper describes XXXXX, a system that automatically acquires the meanings of unknown nouns and verbs. XXXXX represents the semantics of nouns and verbs in terms of taxonomies since there is considerable evidence that the human lexicon is largely organized as a taxonomy. The use of taxonomies is common in NLP thus making these methods generally useful. When an unknown word is encountered XXXXX attempts to place the unknown word in a taxonomy. In XXXXX the acquisition of semantics is defined as locating an existing concept node in a concept hierarchy that defines an unknown word. If there is no such node then a node must be created and placed into the concept hierarchy. The former is referred to as the...
Use of weighted finite state transducers in part of speech tagging
, 1997
Cited by 2 (1 self)
This paper addresses issues in part of speech disambiguation using finite-state transducers and presents two main contributions to the field. One of them is the use of finite-state machines for part of speech tagging. Linguistic and statistical information is represented in terms of weights on transitions in weighted finite-state transducers. Another contribution is the successful combination of techniques – linguistic and statistical – for word disambiguation, compounded with the notion of word classes.
Tagging French Without Lexical Probabilities - Combining Linguistic Knowledge And Statistical Learning
Cited by 1 (0 self)
This paper explores morpho-syntactic ambiguities for French to develop a strategy for part-of-speech disambiguation that a) reflects the complexity of French as an inflected language, b) optimizes the estimation of probabilities, c) allows the user flexibility in choosing a tagset. The problem in extracting lexical probabilities from a limited training corpus is that the statistical model may not necessarily represent the use of a particular word in a particular context. In a highly morphologically inflected language, this argument is particularly serious since a word can be tagged with a large number of parts of speech. Due to the lack of sufficient training data, we argue against estimating lexical probabilities to disambiguate parts of speech in unrestricted texts. [Author note, Evelyne Tzoukermann et al.: The work was achieved while the author was at AT&T Bell Laboratories, 600 Mountain Avenue, Murray Hill, NJ 07974-0636.] Instead, we use the strength of contextual probabilities along wi...

