Grammatical Category Disambiguation by Statistical Optimization
- COMPUTATIONAL LINGUISTICS, 1988
Cited by 148 (0 self)
[This paper focuses on the]... task of [part-of-speech] disambiguation, and particularly on a new algorithm called VOLSUNGA, which avoids syntactic-level analysis, yields about 96% accuracy, and runs in far less time and space than previous attempts. The most recent previous algorithm runs in NP (Non-Polynomial) time, while VOLSUNGA runs in linear time. This is provably optimal; no improvements in the order of its execution time and space are possible. VOLSUNGA is also robust in cases of ungrammaticality. Improvements to this accuracy may be made, perhaps the most potentially significant being to include some higher-level information. With such additions, the accuracy of statistically-based algorithms will approach 100%; and the few remaining cases may be largely those with which humans also find difficulty. In subsequent sections we examine several disambiguation algorithms. Their techniques, accuracies, and efficiencies are analyzed. After presenting the research carried out to date, a discussion of VOLSUNGA's application to the Brown Corpus...
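The abstract does not spell out how VOLSUNGA works, so the following is only a generic Viterbi-style bigram tagger, included to illustrate why a tagger that keeps one best path per candidate tag can run in time linear in sentence length. The function name `viterbi_tag`, the toy tables `LEXICON` and `TRANS`, and the smoothing floor are all assumptions made for this sketch, not details taken from the paper.

```python
from math import log

# Hypothetical toy model: P(tag | word) and P(tag | previous tag).
LEXICON = {
    "the": {"DET": 1.0},
    "can": {"NOUN": 0.3, "VERB": 0.7},
    "rusts": {"NOUN": 0.2, "VERB": 0.8},
}
TRANS = {
    ("<s>", "DET"): 0.5,
    ("DET", "NOUN"): 0.9, ("DET", "VERB"): 0.01,
    ("NOUN", "VERB"): 0.5, ("NOUN", "NOUN"): 0.2,
    ("VERB", "VERB"): 0.1, ("VERB", "NOUN"): 0.3,
}


def viterbi_tag(words, lexicon, trans, start="<s>", floor=1e-6):
    """Return the highest-scoring tag sequence under a bigram tag model."""
    # best[tag] = (log score of the best path ending in tag, that path)
    best = {start: (0.0, [])}
    for word in words:
        step = {}
        for tag, p_lex in lexicon[word].items():
            # Keep only the best predecessor for each candidate tag, so the
            # work per word is bounded by (number of tags) squared.
            score, path = max(
                (prev_score + log(trans.get((prev, tag), floor)) + log(p_lex),
                 prev_path)
                for prev, (prev_score, prev_path) in best.items()
            )
            step[tag] = (score, path + [tag])
        best = step
    return max(best.values())[1]


tags = viterbi_tag(["the", "can", "rusts"], LEXICON, TRANS)
print(tags)
```

Because the number of candidate tags per word is bounded by the tagset, the total work grows in proportion to the number of words, which is the sense in which such a tagger is linear.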
Subcategorization Acquisition
2002
Cited by 64 (13 self)
Manual development of large subcategorised lexicons has proved difficult because predicates change behaviour between sublanguages, domains and over time. Yet access to a comprehensive subcategorization lexicon is vital for successful parsing capable of recovering predicate-argument relations, and probabilistic parsers would greatly benefit from accurate information concerning the relative likelihood of different subcategorisation frames (SCFs) of a given predicate. Acquisition of subcategorization lexicons from textual corpora has recently become increasingly popular. Although this work has met with some success, resulting lexicons indicate a need for greater accuracy. One significant source of error lies in the statistical filtering used for hypothesis selection, i.e. for removing noise from automatically acquired SCFs. This thesis builds on earlier work in verbal subcategorization acquisition, taking as a starting point the problem with statistical filtering. Our investigation shows that statistical filters tend to work poorly because not only is the underlying distribution Zipfian, but there is also very little correlation between conditional distribution of
Reranking and self-training for parser adaptation
- ACL-COLING, 2006
Cited by 42 (0 self)
Statistical parsers trained and tested on the Penn Wall Street Journal (WSJ) treebank have shown vast improvements over the last 10 years. Much of this improvement, however, is based upon an ever-increasing number of features to be trained on (typically) the WSJ treebank data. This has led to concern that such parsers may be too finely tuned to this corpus at the expense of portability to other genres. Such worries have merit. The standard “Charniak parser” checks in at a labeled precision-recall f-measure of 89.7% on the Penn WSJ test set, but only 82.9% on the test set from the Brown treebank corpus. This paper should allay these fears. In particular, we show that the reranking parser described in Charniak and Johnson (2005) improves performance of the parser on Brown to 85.2%. Furthermore, use of the self-training techniques described in (McClosky et al., 2006) raises this to 87.8% (an error reduction of 28%), again without any use of labeled Brown data. This is remarkable since training the parser and reranker on labeled Brown data achieves only 88.4%.
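The error-reduction figure in this abstract can be checked from the rounded f-measures it quotes. The sketch below assumes the usual definition, where error is 100 minus the f-measure and the reduction is taken relative to the baseline error; the function name `error_reduction` is introduced here for illustration only.

```python
def error_reduction(baseline_f, improved_f):
    """Relative drop in error, treating error as 100 minus the f-measure."""
    baseline_err = 100.0 - baseline_f
    improved_err = 100.0 - improved_f
    return (baseline_err - improved_err) / baseline_err


# Brown test set: 82.9 for the baseline parser, 87.8 with reranking and
# self-training (both figures as quoted in the abstract).
reduction = error_reduction(82.9, 87.8)
print(f"{reduction:.1%}")
```

With the rounded inputs the result lands between 28% and 29%, in line with the abstract's stated 28%; the small gap is presumably due to the paper working from unrounded scores.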
A Connectionist Model of Sentence Comprehension and Production. Unpublished
2002
Cited by 30 (3 self)
The most predominant language processing theories have, for some time, been based largely on structured knowledge and relatively simple rules. These symbolic models intentionally segregate syntactic information processing from statistical information as well as semantic, pragmatic, and discourse influences, thereby minimizing the importance of these potential constraints in learning and processing language. While such models have the advantage of being relatively simple and explicit, they are inadequate to account for learning and validated ambiguity resolution phenomena. In recent years, interactive constraint-based theories of sentence processing have gained increasing support, as a growing body of empirical evidence demonstrates early influences of various factors on comprehension performance. Connectionist networks are one form of model that naturally reflect many properties of constraint-based theories, and thus provide a form in which those theories may be instantiated. Unfortunately, most of the connectionist language models implemented until now have involved severe limitations, restricting the phenomena they could address. Comprehension and production models have, by and large, been limited to simple sentences with small vocabularies (cf. St. John & McClelland, 1990). Most models that have addressed the problem of complex, multi-clausal sentence processing have been prediction networks (cf. Elman, 1991; Christiansen & Chater, 1999a). Although a useful component of a language processing system, prediction does not get at the heart of language: the interface between syntax and semantics.
Exploring the Statistical Derivation of Transformational Rule Sequences for Part-of-Speech Tagging
- MEXICO STATE UNIVERSITY, 1994
From Baby Steps to Leapfrog: How “Less is More” in unsupervised dependency parsing
- NAACL-HLT
Cited by 19 (5 self)
We present three approaches for unsupervised grammar induction that are sensitive to data complexity and apply them to Klein and Manning’s Dependency Model with Valence. The first, Baby Steps, bootstraps itself via iterated learning of increasingly longer sentences and requires no initialization. This method substantially exceeds Klein and Manning’s published scores and achieves 39.4% accuracy on Section 23 (all sentences) of the Wall Street Journal corpus. The second, Less is More, uses a low-complexity subset of the available data: sentences up to length 15. Focusing on fewer but simpler examples trades off quantity against ambiguity; it attains 44.1% accuracy, using the standard linguistically-informed prior and batch training, beating state-of-the-art. Leapfrog, our third heuristic, combines Less is More with Baby Steps by mixing their models of shorter sentences, then rapidly ramping up exposure to the full training set, driving up accuracy to 45.0%. These trends generalize to the Brown corpus; awareness of data complexity may improve other parsing models and unsupervised algorithms.
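The two data schedules described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `train_step` is a hypothetical callback standing in for whatever grammar-induction update is used, sentences are assumed to be token lists, and the Leapfrog mixing step is left out because the abstract does not specify it.

```python
def less_is_more(sentences, cutoff=15):
    """Keep only the low-complexity subset: sentences up to `cutoff` tokens."""
    return [s for s in sentences if len(s) <= cutoff]


def baby_steps(sentences, train_step, max_len=None):
    """Train on increasingly longer sentences, seeding each round with the
    model from the previous round.

    `train_step(model, batch)` is a hypothetical callback that returns an
    updated model; the first call receives `None` because the schedule
    needs no separate initialization.
    """
    if max_len is None:
        max_len = max(len(s) for s in sentences)
    model = None
    for k in range(1, max_len + 1):
        # Round k sees every sentence of length at most k.
        batch = [s for s in sentences if len(s) <= k]
        model = train_step(model, batch)
    return model
```

A caller would pass its own training routine as `train_step`; the schedule itself only decides which sentences each round is allowed to see.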
Connectionist, Statistical and Symbolic Approaches to Learning for Natural Language Processing
1996
Cited by 18 (10 self)
The purpose of this book is to present a collection of papers that represents a broad spectrum of current research in learning methods for natural language processing, and to advance the state of the art in language learning and artificial intelligence. The book should bridge a gap between several areas that are usually discussed separately, including connectionist, statistical, and symbolic methods. In order to bring together new and different language learning approaches, we held a workshop at the International Joint Conference on Artificial Intelligence in Montreal in August 1995. Paper contributions were selected and revised after having been reviewed by at least two members of the international program committee as well as additional reviewers. This book contains the revised workshop papers and additional papers by members of the program committee. In particular this book focuses on current issues such as:
-- How can we apply existing learning methods to language processing?
-- What new learning methods are needed for language processing and why?
-- What language knowledge should be learned and why?
A Layered Approach To NLP-Based Information Retrieval
1998
Cited by 17 (0 self)
A layered approach to information retrieval permits the inclusion of multiple search engines as well as multiple databases, with a natural language layer to convert English queries for use by the various search engines. The NLP layer incorporates morphological analysis, noun phrase syntax, and semantic expansion based on WordNet.
1 Introduction
This paper describes a layered approach to information retrieval, and the natural language component that is a major element in that approach. The layered approach, packaged as Intermezzo™, was deployed in a pre-product form at a government site. The NLP component has been installed, with a proprietary IR engine, PhotoFile (Flank, Martin, Balogh and Rothey, 1995), (Flank, Garfield, and Norkin, 1995), at several commercial sites, including Picture Network International (PNI), Simon and Schuster, and John Deere. Intermezzo employs an abstraction layer to permit simultaneous querying of multiple databases. A user enters a query into a clien...
Lexical Acquisition at the Syntax-Semantics Interface: Diathesis Alternations, Subcategorization Frames and Selectional Preferences.
2001
Cited by 17 (1 self)
[Figure 2.4: LDOCE semantic space. Node labels: concrete, inanimate, animate, liquid, gas, plant, animal, human, solid, moveable, not-moveable]
... space by keeping to a simple hierarchy. However, it seems likely that a lot of specific predicates will not be adequately catered for. For example, given the 16 core categories depicted in figure 2.4 the direct object slot of sail would have to be accounted for by the movable class, when a more specific classification would be useful to distinguish, for example, cars, stones and ships.
There are now WordNet versions for some European languages other than English (Vossen, 1999). For other languages, producing a new man-made hierarchy is not an easy alternative. The coverage needed for even a restricted domain requires considerable human effort.
The noun hyponym hierarchy of WordNet is used as the representation medium for the preferences within this thesis. This makes our preferences prone to the human error inherent in the hierarchy and characteristic of any manmade resource. However, this is to some extent outweighed by the rigorous human effort that has gone into creating this useful taxonomy. WordNet has in excess of 60,000 classes in the hyponym hierarchy with over 88,000 word forms (version 1.5). Using current automatic classification methods for building a hierarchy of reasonable size would require considerable effort in post-editing to avoid incongruous classes and considerable processing time in the first place (Resnik, 1993a). The preferences we obtain are limited to the distinctions made within WordNet. Using corpus data does, to some extent, allow us to obtain preferences for the sublanguage of the corpus, since areas of WordNet that are not relevant to the domain have negligible frequency counts.
2.3 The WordNet Approaches
There is a...

