Stochastic Inversion Transduction Grammars and Bilingual Parsing of Parallel Corpora
, 1997
Some Chart-Based Techniques For Parsing Ill-Formed Input
, 1989
Cited by 55 (0 self)
We argue for the usefulness of an active chart as the basis of a system that searches for the globally most plausible explanation of failure to syntactically parse a given input. We suggest semantics-free, grammar-independent techniques for parsing inputs displaying simple kinds of ill-formedness and discuss the search issues involved.
Language Modeling With Sentence-Level Mixtures
, 1994
Cited by 23 (1 self)
Language models play an important role in improving the accuracy of a continuous speech recognizer. In this thesis, we introduce a new statistical language model which captures long term topic dependencies of words within and across sentences. The model includes two main contributions. First, we develop a topic-dependent sentence-level mixture language model which takes advantage of the topic constraints in a sentence or a paragraph. Since this language model is not Markov and has a large search space, it is used only in the last stage of a multi-pass search strategy in the recognizer. Second, we introduce topic-dependent dynamic adaptation techniques in the framework of the mixture model. During the course of this thesis, we also investigate robust parameter estimation techniques, which are extremely important in light of the sparse data problems in language modeling. The model is implemented in the BU speech recognition system and provides a significant improvement in recognition accuracy. An important advantage of the framework of our model is that it is a simple extension of existing language modeling techniques that can easily be integrated with other language modeling advances.
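As a rough illustration of what a sentence-level mixture can look like, the sketch below scores a whole sentence under each topic-specific bigram model and only then mixes the per-topic scores, so the sentence probability is a weighted sum over topics of a product over words. This is an assumption about the general form, not the thesis's implementation: the function name, the dictionary-based bigram tables, the `<s>`/`</s>` boundary tokens, and the `floor` value for unseen bigrams are all hypothetical.

```python
import math


def sentence_mixture_logprob(sentence, topic_bigrams, topic_weights, floor=1e-6):
    """Log-probability of a token list under a sentence-level mixture.

    topic_bigrams: one dict per topic mapping (previous, word) -> probability
                   (hypothetical representation chosen for this sketch).
    topic_weights: mixture weights, one per topic, summing to 1.
    floor:         stand-in probability for unseen bigrams (an assumption,
                   not a real smoothing scheme).
    """
    tokens = ["<s>"] + list(sentence) + ["</s>"]
    per_topic = []
    for bigram, weight in zip(topic_bigrams, topic_weights):
        # The topic is chosen once per sentence, so the weight enters once.
        logp = math.log(weight)
        for prev, word in zip(tokens, tokens[1:]):
            logp += math.log(bigram.get((prev, word), floor))
        per_topic.append(logp)
    # Sum the per-topic sentence probabilities in log space (log-sum-exp).
    top = max(per_topic)
    return top + math.log(sum(math.exp(x - top) for x in per_topic))


# Two toy topics that disagree about how likely the one-word sentence "a" is.
topic_a = {("<s>", "a"): 0.5, ("a", "</s>"): 0.5}
topic_b = {("<s>", "a"): 0.1, ("a", "</s>"): 0.1}
logp = sentence_mixture_logprob(["a"], [topic_a, topic_b], [0.5, 0.5])
```

Because the mixing happens outside the product over words, the sentence score does not factor word by word, which is consistent with the abstract's remark that the model is not Markov and is applied only in a late rescoring pass.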
Integrating Natural Language Generation and Hypertext to Produce Dynamic Documents
, 1998
Cited by 14 (5 self)
We discuss a task requiring the coherent presentation of heterogeneous information about objects recorded in electronic catalogues. We consider the advantages of combining hypermedia delivery with natural language generation technology, so as to allow us to view a session with such a system as a coherent conversation or dialogue. We describe two prototype systems we have built which make use of these combined techniques, and focus on those aspects of the systems which attempt to provide coherence. Although the techniques themselves are not novel, their combination is relatively recent, and promises to help forge useful tools for accomplishing our specific information retrieval task. Keywords: hypermedia, natural language generation, information presentation, discourse coherence, adaptive hypertext. The support of the Economic and Social Research Council for HCRC is gratefully acknowledged. The ILEX project is funded by the UK Engineering and Physical Sciences Research Council, through gra...
Lexicalist Machine Translation of Spatial Prepositions
, 1995
Cited by 13 (1 self)
This thesis proposes a strongly lexicalist approach to machine translation and applies it to the translation of spatial prepositions and prepositional expressions between English and Spanish. Bilingual contrastive knowledge resides solely in the bilingual lexicon and is structured in the form of correspondences between sets of source and target language lexemes related through indices. The resulting architecture maximizes the independence of the monolingual and bilingual components. This independence is demonstrated by developing a grammar of Spanish which is significantly different in its constructions from its analogous English grammar. In particular, relative clauses are analysed through a single rule that allows gaps in subject position, while clitic climbing and doubling are handled through mechanisms not normally found in grammatical descriptions of English. Bilingual lexical rules, in conjunction with the bilingual lexicon, constitute a single, motivated and well defined mechani...
Improving And Predicting Performance Of Statistical Language Models In Sparse Domains
, 1998
Cited by 7 (1 self)
Standard statistical language models, or n-gram models, which represent the probability of word sequences, suffer from sparse-data problems in tasks where large amounts of domain-specific text are not available. This thesis focuses on improving the estimation of domain-dependent n-gram models by using out-of-domain text data. Previous approaches for estimating language models from multi-domain data have not accounted for the characteristic variations of style and content across domains. In contrast, this thesis introduces two approaches that compensate for multi-domain differences, both representing "style" by part-of-speech (POS) sequences and "content" by the particular choice of words. First, data from multiple domains is combined using similarity weighting schemes that discriminate for content and style relevance prior to pooling multi-domain text. Second, n-gram distributions from multiple domains are combined via a POS-dependent n-gram framework that separately compensates for word and POS usage differences. Two variations are explored: explicitly transforming the out-of-domain distribution before combining with an in-domain model, and separately estimating components of the POS-dependent n-gram model using multi-domain data. Finally, measures to analyze and predict recognition performance of language models are also investigated, resulting in an algorithm for predicting performance differences associated with localized changes in language models given a recognition system.
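For orientation only, the sketch below shows plain linear interpolation of an in-domain and an out-of-domain word distribution for a single fixed history. This is a generic baseline for combining distributions, not the POS-dependent framework or the similarity weighting described in the abstract; the function name, the dictionary representation, and the weight `lam` are assumptions made for the example.

```python
def interpolate(in_domain, out_domain, lam):
    """Linearly interpolate two word distributions for one fixed history.

    in_domain, out_domain: dicts mapping word -> probability (hypothetical
                           representation; each is assumed to sum to 1).
    lam:                   weight on the in-domain estimate, between 0 and 1.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    vocab = set(in_domain) | set(out_domain)
    return {
        word: lam * in_domain.get(word, 0.0)
        + (1.0 - lam) * out_domain.get(word, 0.0)
        for word in vocab
    }


# Sparse in-domain estimate backed by a broader out-of-domain one.
in_dist = {"a": 0.5, "b": 0.5}
out_dist = {"a": 0.2, "c": 0.8}
mixed = interpolate(in_dist, out_dist, lam=0.75)
```

The word "c" never occurs in the in-domain data yet receives non-zero probability after mixing, which is the basic benefit of borrowing out-of-domain text when in-domain data is sparse.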
Some applications of natural language processing to the field of augmentative and alternative communication
- Proceedings of the IJCAI-95 Workshop on Developing AI Applications for People with Disabilities
, 1995
Bracketing and aligning words and constituents in parallel text using stochastic inversion transduction grammars
- in Parallel Text Processing: Alignment and Use of Translation Corpora
, 2000
Cited by 2 (0 self)
We introduce (1) a novel stochastic inversion transduction grammar formalism for bilingual language modeling of sentence-pairs, and (2) the concept of bilingual parsing with a variety of parallel corpus analysis applications. Aside from the bilingual orientation, three major features distinguish the formalism from the finite-state transducers more traditionally found in computational linguistics: it skips directly to a context-free rather than finite-state base, it permits a minimal extra degree of ordering flexibility, and its probabilistic formulation admits an efficient maximum-likelihood bilingual parsing algorithm. A convenient normal form is shown to exist. Analysis of the formalism's expressiveness suggests that it is particularly well-suited to model ordering shifts between languages, balancing needed flexibility against complexity constraints. We discuss a number of examples of how stochastic inversion transduction grammars bring bilingual constraints to bear upon problematic corpus analysis tasks such as segmentation, bracketing, phrasal alignment, and parsing.
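The "minimal extra degree of ordering flexibility" can be pictured with a small sketch: in an inversion transduction grammar, each binary node emits its two children in the same order in both languages (straight) or in reversed order in the second language (inverted). The tuple-based tree encoding, the orientation labels, and the word pairs below are assumptions made for illustration; probabilities and the parsing algorithm itself are omitted.

```python
def yields(node):
    """Return the (language-1, language-2) token lists generated by a tree.

    Hypothetical node encodings used in this sketch:
      ("leaf", word1, word2)        a word pair; either side may be None
      ("straight", left, right)     children keep their order in both languages
      ("inverted", left, right)     children are swapped in language 2 only
    """
    kind = node[0]
    if kind == "leaf":
        _, word1, word2 = node
        return ([word1] if word1 else []), ([word2] if word2 else [])
    _, left, right = node
    left1, left2 = yields(left)
    right1, right2 = yields(right)
    if kind == "straight":
        return left1 + right1, left2 + right2
    if kind == "inverted":
        return left1 + right1, right2 + left2
    raise ValueError(f"unknown node type: {kind!r}")


# An adjective-noun pair whose order flips between the two languages.
phrase = ("inverted", ("leaf", "red", "rojo"), ("leaf", "car", "coche"))
tree = ("straight", ("leaf", "the", "el"), phrase)
lang1, lang2 = yields(tree)
```

Here `lang1` reads the leaves left to right, while `lang2` reverses only the subtree marked as inverted, so a single bracketing accounts for both word orders.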
Continuous Understanding: A First Look at CAFE
, 2001
Cited by 1 (0 self)
Contents
1 Introduction: Conversational Agents
2 Continuous Understanding
2.1 The Incremental Processing Alternative
2.2 People Understand Continuously
2.3 Weak vs. Strong AI in Human-Computer-Interaction
2.4 Practical Value of Continuous Understanding
2.5 Some Nascent attempts at Continuous Understanding
3 Intention Recognition Reigns Supreme
3.1 As a Probability Maximization
3.2 What If the Intention is Wrong?
3.3 A Productive Interaction
4 CAFE: In Search of an Architecture
4.1 Traditional Architectures
4.2 Some Non-Traditional Alternatives
4.2.1 Giant BlackBoard System
Markup and the GOLD Ontology
, 2003
Cited by 1 (0 self)
Figure 4. Upper taxonomy for the ontology.
For linguistics we are concerned with the grammatical qualities of instances of LINGUISTICUNIT, or those qualities which determine how instances of LINGUISTICUNIT behave in the grammar of a language. Instances of MORPHOSYNTACTICFEATURE include: TENSE, ASPECT, MOOD, NUMBER, PERSON, PARTOFSPEECH, etc. A MORPHOSYNTACTICUNIT is said to stand in a HASGRAMINFO relationship to particular instances of MORPHOSYNTACTICFEATURE.

