Part-of-speech tagging and partial parsing (1996)

by S Abney

Results 1 - 10 of 61

Robust Accurate Statistical Annotation of General Text

by Ted Briscoe, John Carroll, 2002
Cited by 146 (11 self)
We describe a robust, accurate, domain-independent approach to statistical parsing incorporated into the new release of the ANLT toolkit, and publicly available as a research tool. The system has been used to parse many well-known corpora in order to produce data for lexical acquisition efforts; it has also been used as a component in an open-domain question answering project. The performance of the system is competitive with that of statistical parsers using highly lexicalised parse selection models. However, we plan to extend the system to improve parse coverage, depth and accuracy.

Multimodal Video Indexing: A Review of the State-of-the-art

by Cees G.M. Snoek, Marcel Worring - Multimedia Tools and Applications, 2003
Cited by 103 (18 self)
Efficient and effective handling of video documents depends on the availability of indexes. Manual indexing is unfeasible for large video collections. In this paper we survey several methods that aim to automate this time- and resource-consuming process. Good reviews of single-modality video indexing have appeared in the literature. Effective indexing, however, requires a multimodal approach in which either the most appropriate modality is selected or the different modalities are used in a collaborative fashion. Therefore, instead of separately treating the different information sources involved and their specific algorithms, we focus on the similarities and differences between the modalities. To that end we put forward a unifying and multimodal framework, which views a video document from the perspective of its author. This framework forms the guiding principle for identifying index types, for which automatic methods are found in the literature. It furthermore forms the basis for categorizing these different methods.

Grounded semantic composition for visual scenes

by Peter Gorniak, Deb Roy - Journal of Artificial Intelligence Research, 2004
Cited by 70 (21 self)
We present a visually-grounded language understanding model based on a study of how people verbally describe objects in scenes. The emphasis of the model is on the combination of individual word meanings to produce meanings for complex referring expressions. The model has been implemented, and it is able to understand a broad range of spatial referring expressions. We describe our implementation of word-level visually-grounded semantics and their embedding in a compositional parsing framework. The implemented system selects the correct referents in response to natural language expressions for a large percentage of test cases. In an analysis of the system’s successes and failures we reveal how visual context influences the semantics of utterances and propose future extensions to the model that take such context into account.

Part-of-Speech Tagging Using Progol

by James Cussens - In Inductive Logic Programming: Proceedings of the 7th International Workshop (ILP-97). LNAI 1297, 1997
Cited by 43 (4 self)
A system for 'tagging' words with their part-of-speech (POS) tags is constructed. The system has two components: a lexicon containing the set of possible POS tags for a given word, and rules which use a word's context to eliminate possible tags for a word. The Inductive Logic Programming (ILP) system Progol is used to induce these rules in the form of definite clauses. The final theory contained 885 clauses. For background knowledge, Progol uses a simple grammar, where the tags are terminals and predicates such as nounp (noun phrase) are nonterminals. Progol was altered to allow the caching of information about clauses generated during the induction process, which greatly increased efficiency. The system achieved a per-word accuracy of 96.4% on known words drawn from sentences without quotation marks. This is on a par with other tagging systems induced from the same data [5, 2, 4], which all have accuracies in the range 96-97%. The per-sentence accuracy was 49.5%.
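
The two-component design described in this abstract (a lexicon of candidate tags plus contextual rules that eliminate candidates) can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the toy lexicon, the tag names, and the two hand-written rules are hypothetical, and nothing is induced by Progol. The `accuracies` helper shows how per-word and per-sentence accuracy are computed, and why the second is much lower: a sentence counts as correct only if every tag in it is right.

```python
from typing import Callable, Dict, List, Sequence, Set, Tuple

# Hypothetical toy lexicon: every known word maps to its set of possible POS tags.
LEXICON: Dict[str, Set[str]] = {
    "the": {"DT"},
    "can": {"MD", "NN", "VB"},
    "rusts": {"NNS", "VBZ"},
}

# A rule inspects the candidate sets around position i and returns tags to eliminate at i.
Rule = Callable[[List[Set[str]], int], Set[str]]


def no_verb_after_determiner(cands: List[Set[str]], i: int) -> Set[str]:
    """Hand-written stand-in for an induced clause: no verb or modal right after a determiner."""
    if i > 0 and cands[i - 1] == {"DT"}:
        return {"MD", "VB", "VBZ"}
    return set()


def no_plural_noun_after_singular_noun(cands: List[Set[str]], i: int) -> Set[str]:
    """Second illustrative rule: rule out a plural noun directly after an unambiguous singular noun."""
    if i > 0 and cands[i - 1] == {"NN"}:
        return {"NNS"}
    return set()


RULES: List[Rule] = [no_verb_after_determiner, no_plural_noun_after_singular_noun]


def tag(words: Sequence[str], rules: Sequence[Rule] = RULES) -> List[Set[str]]:
    """Start from the lexicon's candidates and apply elimination rules until nothing changes."""
    cands = [set(LEXICON[w.lower()]) for w in words]
    changed = True
    while changed:
        changed = False
        for i in range(len(cands)):
            for rule in rules:
                remaining = cands[i] - rule(cands, i)
                # Never eliminate the last remaining candidate for a word.
                if remaining and remaining != cands[i]:
                    cands[i] = remaining
                    changed = True
    return cands


def accuracies(predicted: List[List[str]], gold: List[List[str]]) -> Tuple[float, float]:
    """Return (per-word accuracy, per-sentence accuracy) for aligned tag sequences."""
    total_words = sum(len(sentence) for sentence in gold)
    correct_words = sum(p == g for ps, gs in zip(predicted, gold) for p, g in zip(ps, gs))
    correct_sentences = sum(ps == gs for ps, gs in zip(predicted, gold))
    return correct_words / total_words, correct_sentences / len(gold)


if __name__ == "__main__":
    print(tag(["The", "can", "rusts"]))
```

The loop terminates because each change strictly shrinks a candidate set. A real system of this kind would also need a strategy for unknown words and for words left ambiguous after all rules have fired, neither of which this sketch attempts.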

Tagging Romanian Texts: a Case Study for QTAG, a Language Independent Probabilistic Tagger

by Dan Tufis, Oliver Mason - Proceedings of the First International Conference on Language Resources and Evaluation (LREC), 1998
Cited by 30 (3 self)
This paper describes an experiment on tagging Romanian using QTAG, a part-of-speech tagger that was originally developed for English, but with a clear separation between the (probabilistic) processing engine and the (language-specific) resource data. This way, the tagger is usable across various languages, as shown by successful experiments on three quite different languages: English, Swedish and Romanian. After a brief presentation of the QTAG tagger, the paper dwells on language resources for Romanian and the evaluation of the results. A complexity metric for tagging experiments is proposed which considers the performance of a tagger with respect to the "difficulty" of a text. Introduction: Lexical ambiguity resolution is a key task in natural language processing (Baayen & Sproat, 1996). It can be regarded as a classification problem: an ambiguous lexical item is one that in different contexts can be classified differently and given a specified context the disambiguator/classi...

A Cascaded Finite-State Parser for Syntactic Analysis of Swedish

by Dimitrios Kokkinakis, Sofie Johansson Kokkinakis - In Proceedings of the 9th EACL
Cited by 13 (5 self)
This report describes the development of a parsing system for written Swedish and is focused on a grammar, the main component of the system, semi-automatically extracted from corpora. A cascaded, finite-state algorithm is applied to the grammar in which the input contains coarse-grained semantic class information, and the output produced reflects not only the syntactic structure of the input, but grammatical functions as well. The grammar has been tested on a variety of random samples of different text genres, achieving precision and recall of 94.62% and 91.92% respectively, and an average crossing rate of 0.04, when evaluated against manually disambiguated, annotated texts. Introduction: This report describes a parsing system for fast and accurate analysis of large bodies of written Swedish. The grammar has been implemented in a modular fashion as finite-state, cascaded machines, henceforth called Cass-SWE, a name adopted from the parser used, Cascaded analysis of synt...
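
As a rough illustration of what a cascaded finite-state analysis looks like, the following Python sketch applies a sequence of pattern levels to a tagged sentence, with each level grouping the output of the previous one into larger units. It is only a sketch under stated assumptions: the English tags, the two levels (`NP`, then `PP`), and the function names are hypothetical and are not taken from the Swedish grammar or the parser described in the abstract.

```python
import re
from typing import List, Tuple

# A node is (label, word) for a token or (label, children) for a recognised phrase.
Node = Tuple[str, object]

# Hypothetical cascade: each level lists (phrase label, regex over space-terminated labels).
LEVELS: List[List[Tuple[str, str]]] = [
    [("NP", r"(?:DT )?(?:JJ )*(?:NN )+")],  # level 1: simple noun phrases
    [("PP", r"IN NP ")],                    # level 2: preposition + noun phrase
]


def apply_level(nodes: List[Node], patterns: List[Tuple[str, str]]) -> List[Node]:
    """Replace every non-overlapping match of each pattern by a single phrase node."""
    for label, pattern in patterns:
        # Encode the current node sequence as a string such as "DT JJ NN VBD ".
        labels = "".join(node[0] + " " for node in nodes)
        out: List[Node] = []
        done = 0
        # The lookbehind forces matches to start at a label boundary.
        for match in re.finditer(r"(?<!\S)" + pattern, labels):
            # The number of spaces before an offset equals the number of nodes before it.
            start = labels.count(" ", 0, match.start())
            end = labels.count(" ", 0, match.end())
            out.extend(nodes[done:start])
            out.append((label, nodes[start:end]))
            done = end
        out.extend(nodes[done:])
        nodes = out
    return nodes


def parse(tagged: List[Tuple[str, str]]) -> List[Node]:
    """Run the levels in order; later levels see the phrases built by earlier ones."""
    nodes: List[Node] = list(tagged)
    for level in LEVELS:
        nodes = apply_level(nodes, level)
    return nodes


if __name__ == "__main__":
    sentence = [("DT", "the"), ("JJ", "old"), ("NN", "man"), ("VBD", "sat"),
                ("IN", "on"), ("DT", "the"), ("NN", "bench")]
    for node in parse(sentence):
        print(node)
```

Because each level only rewrites a flat sequence and never backtracks into earlier decisions, the cascade stays fast and always produces some analysis, which is the usual motivation for this style of partial parsing.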

Verb Frame Frequency as a Predictor of Verb Bias

by Maria Lapata, Frank Keller, Sabine Schulte im Walde - Journal of Psycholinguistic Research, 2001
Cited by 12 (2 self)
There is considerable evidence showing that the human sentence processor is guided by lexical preferences in resolving syntactic ambiguities. Several types of preferences have been identified, including morphological, syntactic, and semantic ones. However, the literature fails to provide a uniform account of what lexical preferences are and how they should be measured. The present paper provides evidence for the view that lexical preferences are records of prior linguistic experience. We show that a type of lexical syntactic preference, viz., verb biases as measured by norming experiments, can be approximated by verb frame frequencies extracted from a large, balanced corpus by using computational learning techniques.

Automatic learning of textual entailments with cross-pair similarities

by Fabio Massimo Zanzotto - Proceedings of the 21st Coling and 44th ACL, 2006
Cited by 12 (8 self)
In this paper we define a novel similarity measure between examples of textual entailments and we use it as a kernel function in Support Vector Machines (SVMs). This allows us to automatically learn the rewrite rules that describe a non-trivial set of entailment cases. The experiments with the data sets of the RTE 2005 challenge show an improvement of 4.4% over the state-of-the-art methods.

Customizable Modular Lexicalized Parsing

by R. Basili, M. T. Pazienza, F.M. Zanzotto - In Proc. of the 6th International Workshop on Parsing Technology, IWPT2000, 2000
Cited by 11 (9 self)
Different NLP applications have different efficiency constraints (i.e. quality of the results and throughput) that reflect on each core linguistic component. Syntactic processors are basic modules in some NLP applications. A customization that permits the performance control of these components enables their reuse in different application scenarios. Throughput has been commonly improved using partial syntactic processors. On the other hand, specialized lexicons are generally employed to improve the quality of the syntactic material produced by specific parsing (sub)processes (e.g. verb argument detection or PP-attachment disambiguation). Building upon the idea of grammar stratification, this paper presents a method to push modularity and lexical sensitivity in parsing, in view of customizable syntactic analysers. A framework for modular parser design is proposed and its main properties are discussed.

Language-Processing Strategies and Mixed-Initiative Dialogues

by Johan Boye, Mats Wirén, Manny Rayner, Ian Lewin, David Carter, Ralph Becket, 1999
Cited by 11 (2 self)
We describe an implemented spoken-language dialogue system for a travel-planning domain, which accesses a commercially available travel-information web-server and supports a flexible mixed-initiative dialogue strategy. We argue, based on data from initial Wizard-of-Oz experiments, that mixed-initiative strategies are appropriate for many types of user, but require more sophisticated architectures for processing of language and dialogue; we then use these observations to motivate an architecture which combines parallel deep and shallow natural language analysis engines and an agenda-driven dialogue manager. We outline the top-level processing strategy used by the dialogue manager, and also a novel formalism, which we call Flat Utterance Description, that allows us to reduce the output of the deep and shallow language-processing engines to a common representation.