Representing Text Chunks
1999
Cited by 62 (3 self)
Dividing sentences into chunks of words is a useful preprocessing step for parsing, information extraction and information retrieval. Ramshaw and Marcus (1995) introduced a "convenient" data representation for chunking by converting it to a tagging task. In this paper we examine seven different data representations for the problem of recognizing noun phrase chunks. We show that the data representation choice has a minor influence on chunking performance. However,
Cascaded Grammatical Relation Assignment
1999
Cited by 49 (14 self)
In this paper we discuss cascaded Memory-Based grammatical relations assignment. In the first stages of the cascade, we find chunks of several types (NP, VP, ADJP, ADVP, PP) and label them with their adverbial function (e.g. local, temporal). In the last stage, we assign grammatical relations to pairs of chunks. We studied the effect of adding several levels to this cascaded classifier and found that even the less performing chunkers enhanced the performance of the relation finder.
Formal Ontology Engineering in the DOGMA Approach
2002
Cited by 27 (10 self)
This paper presents a specifically database-inspired approach (called DOGMA) for engineering formal ontologies, implemented as shared resources used to express agreed formal semantics for a real-world domain. We address several related key issues, such as knowledge reusability and shareability, scalability of the ontology engineering process and methodology, efficient and effective ontology storage and management, and coexistence of heterogeneous rule systems that surround an ontology, mediating between it and application agents. Ontologies should represent a domain's semantics independently of "language", while any process that creates elements of such an ontology must be entirely rooted in some (natural) language, and any use of it will necessarily be through a language (in general, an agent's computer language). To achieve the claims stated, we explicitly decompose ontological resources into ontology bases, in the form of simple binary facts called lexons, and into so-called ontological commitments, in the form of description rules and constraints. Ontology bases, in a logic sense, become "representationless" mathematical objects which constitute the range of a classical interpretation mapping from a first-order language, assumed to lexically represent the commitment or binding of an application or task to such an ontology base. Implementations of ontologies become database-like on-line resources in the model-theoretic sense. The resulting architecture makes it possible to materialize the (crucial) notion of commitment as a separate layer of (software agent) services, mediating between the ontology base and those application instances that commit to the ontology. We claim it also leads to methodological approaches that naturally extend key aspects of database modeling theory and practice. We discuss examples of the prototype DOGMA implementation of the ontology base server and commitment server.
Shallow Parsing Using Specialized HMMs
Journal of Machine Learning Research, 2002
Cited by 26 (5 self)
We present a unified technique to solve different shallow parsing tasks as a tagging problem using a Hidden Markov Model-based approach (HMM). This technique consists of the incorporation of the relevant information for each task into the models. To do this, the training corpus is transformed to take into account this information. In this way, no change is necessary for either the training or tagging process, so it allows for the use of a standard HMM approach. Taking into account this information, we construct a Specialized HMM which gives more complete contextual models. We have tested our system on chunking and clause identification tasks using different specialization criteria. The results obtained are in line with the results reported for most of the relevant state-of-the-art approaches.
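The corpus transformation mentioned in this abstract can be illustrated with a small sketch. This is not the authors' implementation: the function names `specialize` and `despecialize`, the `·` separator, and the choice to enrich tags with the part-of-speech tag (and optionally with selected words) are all assumptions made for the example. It only shows the general idea that rewriting the training pairs lets an unmodified HMM tagger work with richer states.

```python
from typing import FrozenSet, List, Tuple

Token = Tuple[str, str, str]  # (word, POS tag, chunk tag); hypothetical layout


def specialize(sentence: List[Token],
               lexicalized: FrozenSet[str] = frozenset()) -> List[Tuple[str, str]]:
    """Rewrite one training sentence into (input symbol, output tag) pairs.

    Assumption: the input symbol is the POS tag and the output tag is the
    chunk tag enriched with that POS tag. Words listed in `lexicalized`
    are additionally added to both sides.
    """
    pairs = []
    for word, pos, chunk in sentence:
        if word.lower() in lexicalized:
            symbol = f"{word.lower()}·{pos}"
        else:
            symbol = pos
        pairs.append((symbol, f"{symbol}·{chunk}"))
    return pairs


def despecialize(tag: str) -> str:
    """Map an enriched output tag back to the plain chunk tag."""
    return tag.rsplit("·", 1)[-1]


# Made-up example sentence with IOB-style chunk tags.
sentence = [
    ("He", "PRP", "B-NP"),
    ("talked", "VBD", "B-VP"),
    ("of", "IN", "B-PP"),
    ("chunks", "NNS", "B-NP"),
]
training_pairs = specialize(sentence, lexicalized=frozenset({"of"}))
```

Because only the training data changes, any standard HMM tagger can be trained on `training_pairs`; its predictions are mapped back to ordinary chunk tags with `despecialize`.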
Incrementality in deterministic dependency parsing
In Proceedings of the Workshop on Incremental Parsing (ACL, 2004
Cited by 25 (6 self)
Deterministic dependency parsing is a robust and efficient approach to syntactic parsing of unrestricted natural language text. In this paper, we analyze its potential for incremental processing and conclude that strict incrementality is not achievable within this framework. However, we also show that it is possible to minimize the number of structures that require nonincremental processing by choosing an optimal parsing algorithm. This claim is substantiated with experimental evidence showing that the algorithm achieves incremental parsing for 68.9% of the input when tested on a random sample of Swedish text. When restricted to sentences that are accepted by the parser, the degree of incrementality increases to 87.9%.
Conceptual graph matching for semantic search
In ICCS, 2002
Cited by 24 (0 self)
Semantic search has become a research hotspot. The combined use of linguistic ontologies and structured semantic matching is one of the promising ways to improve both recall and precision. In this paper, we propose an approach for semantic search by matching conceptual graphs. Detailed definitions of semantic similarity between concepts, relations and conceptual graphs are given. Based on these definitions, we propose a conceptual graph matching algorithm that calculates the semantic similarity. The computational complexity of this algorithm is constrained to be polynomial. A prototype of our approach is currently under development with IBM China Research Lab.
Ontologies and Databases: More than a Fleeting Resemblance
2001
Cited by 23 (5 self)
Formal ontologies can be seen as mathematical objects that form the range of a classical "Tarskian" semantics interpretation mapping of first-order language constructs that could represent situations, functions or procedures related to a given domain. Some design methods and techniques such as view integration that were originally developed for large databases, where the "data models" and their semantics typically are limited to a particular application, could be relevant for this purpose, and we analyze parallelisms and fundamental differences between databases and ontologies. In particular the ORM (Object-Role Modeling) method, or rather its precursor NIAM, with its rigorous distinction and handling of so-called lexical and non-lexical knowledge proved to be an interesting candidate to help identify and clarify a number of these issues, and a number of examples are given. We report on research within STARLab's DOGMA Project that indicates how such methods may or must be adapted to be usable in the context of ontologies, and how they may then help to define ontology updates, the role of domain constraints, and future tools that assist in e.g. the alignment of ontologies.
Mining for Lexons: Applying Unsupervised Learning Methods to Create Ontology Bases
In Proceedings of the International Conference on Ontologies, Databases and Applications of Semantics (ODBASE, 2003
Cited by 14 (4 self)
Ontologies in current computer science parlance are computer-based resources that represent agreed domain semantics. This paper first introduces ontologies in general and then briefly outlines the DOGMA ontology engineering approach, which separates "atomic" conceptual relations from "predicative" domain rules. In the main part of the paper, we describe and experimentally evaluate work in progress on a potential method to automatically derive the atomic conceptual relations mentioned above from a corpus of English medical texts. Preliminary outcomes are presented based on the clustering of nouns and compound nouns according to co-occurrence frequencies in the subject-verb-object syntactic context.
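As a rough illustration of what comparing nouns by their co-occurrence frequencies in subject-verb-object contexts can look like, here is a minimal sketch. The triples are invented, the helper names `context_vectors` and `cosine` are hypothetical, and cosine similarity is a stand-in chosen for the example; the abstract does not say which similarity measure or clustering algorithm the authors used.

```python
from collections import Counter, defaultdict
from math import sqrt
from typing import Dict, Iterable, Tuple

Triple = Tuple[str, str, str]  # (subject, verb, object)


def context_vectors(triples: Iterable[Triple]) -> Dict[str, Counter]:
    """Count, for every noun, how often it occurs as subject or object of each verb."""
    vectors: Dict[str, Counter] = defaultdict(Counter)
    for subj, verb, obj in triples:
        vectors[subj][("subj_of", verb)] += 1
        vectors[obj][("obj_of", verb)] += 1
    return vectors


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


# Invented triples standing in for parser output on medical text.
triples = [
    ("doctor", "prescribes", "antibiotic"),
    ("physician", "prescribes", "antibiotic"),
    ("doctor", "examines", "patient"),
    ("physician", "examines", "patient"),
    ("nurse", "examines", "patient"),
]
vectors = context_vectors(triples)
similar = cosine(vectors["doctor"], vectors["physician"])
dissimilar = cosine(vectors["doctor"], vectors["antibiotic"])
```

Nouns that share many verb contexts (here `doctor` and `physician`) score higher than nouns that share none, and a clustering step could then group nouns by these scores.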
FLaVoR: a Flexible Architecture for LVCSR
In Proc. European Conference on Speech Communication and Technology, 2003
Cited by 14 (11 self)
This paper describes a new architecture for large vocabulary continuous speech recognition (LVCSR), which will be developed within the project FLaVoR (Flexible Large Vocabulary Recognition). The proposed architecture abandons the standard all-in-one search strategy with integrated acoustic, lexical and language model information. Instead, a modular framework is proposed which allows for the integration of more complex linguistic components. The search process consists of two layers. First, a pure acoustic-phonemic search generates a dense phoneme network enriched with meta-data. Then, the output of the first layer is used by sophisticated language technology components for word decoding in the second layer. Preliminary experiments prove the feasibility of the approach.

