Natural Language Processing for Information Assurance and Security: An Overview and Implementations
- In Proceedings 9th ACM/SIGSAC New Security Paradigms Workshop
, 2000
Cited by 27 (10 self)
This research paper explores a promising interface between natural language processing (NLP) and information assurance and security (IAS). More specifically, it is devoted to possible applications to, and further dedicated development of, the accumulated considerable resources in NLP for, IAS. The expected and partially accomplished result is in harnessing the weird, illogical ways natural languages encode meaning, the very ways that defy all the usual combinatorial approaches to mathematical--and computational--complexity and make NLP so hard, to enhance information security. The paper is of a mixed theoretical and empirical nature. Of the four possible venues of
Universal Grammar and Lexis for Quick Ramp-Up of MT Systems
, 1998
Cited by 21 (9 self)
This paper introduces Boas, a semi-automatic knowledge elicitation system that guides a team of two people through the process of developing the static knowledge sources for a moderate-quality, broad-coverage MT system from any "low-density" language into English in about six months. The paper focuses on some issues in the elicitation of descriptive knowledge in Boas and also the issue of the principled reuse of pre-existing resources, such as a lexicon, an ontology, and an English generation module, among others, made possible by the fact that the client MT system is developed for a single target language.
Inducing criteria for mass noun lexical mappings using the Cyc KB, and its extension to WordNet
- In Proc. of the Fifth International Workshop on Computational Semantics (IWCS-5)
, 2003
Cited by 12 (4 self)
This paper presents an automatic approach for learning semantic criteria for the mass versus count noun distinction by induction over the lexical mappings contained in the Cyc knowledge base. This produces accurate results (89.5%) using a decision tree that only incorporates semantic features (i.e., Cyc ontological types). Comparable results (86.9%) are obtained using OpenCyc, the publicly available version of Cyc. For broader applicability, the mass noun criteria using Cyc are converted into criteria using WordNet, preserving the general accuracy (86.3%).
If you have it, flaunt it: Using full ontological knowledge for word sense disambiguation
- In Proceedings of the 7th International Conference on Theoretical and Methodological Issues in Machine Translation
, 1997
Cited by 8 (2 self)
Word sense disambiguation continues to be a difficult problem in natural language processing. Current methods, such as marker passing and spreading activation, for applying world knowledge in the form of selectional preferences to solve this problem do not make effective use of available knowledge. Moreover, their effectiveness decreases as the knowledge is made richer by acquiring more and more conceptual relationships. Effective resolution of word sense ambiguities requires inferring the dynamic context in processing a sentence in order to find the right selectional preferences to be applied. In this article, we propose such an inference operator and show how it finds the most specific context to resolve word sense ambiguities in the Mikrokosmos semantic analyzer. Our method retains its effectiveness even in a rich, large-scale knowledge base with a high degree of connectivity among its concepts.
1. Disambiguation in Context
Word sense disambiguation continues to be a difficult problem for programs that process natural language. The goals of word sense resolution methods are: (a) to select as small a subset of possible senses of a word as possible, ideally just one sense, and (b) to select the best sense(s) given all the knowledge available to the system, including the dynamic context in processing the text. The most common methods for resolving word sense ambiguities are based on statistical collocations or selectional preferences (for a recent survey, see Guthrie et al., 1996) between pairs of word senses. Often, individual selectional preferences applicable to a word are not strong enough to exclude all but one sense of the word. The real power of word sense selection seems to lie in the ability to constrain the possible senses of a word based on selections made for other words in the dynamic context.
Although it is a truism that context plays a significant role in sense disambiguation, computational models have not demonstrated the effectiveness of modeling context for resolving word senses in a large-scale NLP system.
Word Sense Disambiguation: Why Statistics When We Have These Numbers?
- Proceedings of the 7th International Conference on Theoretical and Methodological Issues in Machine Translation
, 1997
Cited by 7 (1 self)
Word sense disambiguation continues to be a difficult problem in machine translation (MT). Current methods either demand large amounts of corpus data and training or rely on knowledge of hard selectional constraints. In either case, the methods have been demonstrated only on a small scale and mostly in isolation, where disambiguation is a task by itself. It is not clear that the methods can be scaled up and integrated with other components of analysis and generation that constitute an end-to-end MT system. In this paper, we illustrate how the Mikrokosmos Knowledge-Based MT system disambiguates word senses in real-world texts with a very high degree of correctness. Disambiguation in Mikrokosmos is achieved by a combination of (i) a broad-coverage ontology with many selectional constraints per concept, (ii) a large computational-semantic lexicon grounded in the ontology, (iii) an optimized search algorithm for checking selectional constraints in the ontology, and (iv) an efficie...
Two principles and six techniques for rapid MT development
- Proc. of AMTA-96
, 1996
Cited by 6 (5 self)
In this paper we describe a range of techniques used at NMSU CRL for accelerating the development of MT systems. These techniques enable semi-automatic development of a number of components of a multilingual MT system, thereby enabling rapid deployment of MT capabilities in a new language. First, we describe the core multi-engine, multilingual architecture that enables the different techniques to be rapidly integrated to build an MT system. We show how off-the-shelf components were used in this architecture for fast development. Then we illustrate a set of techniques for semi-automatic acquisition of static resources: (a) automatic induction of grammars, (b) corpus-based acquisition of bilingual glossaries, and automatic acquisition of semantic lexicons through (c) lexical rules and (d) reversal of analysis lexicons to generation lexicons. Finally we describe an automatic testing environment that enables rapid validation of automatically acquired resources.
1 Rapid Development Techniques
Static knowledge sources (grammars, lexicons, world knowledge bases) are the most time-consuming concerns in any rule-based machine translation system. It is, therefore, imperative to find ways of speeding up the creation and updating of high-quality, useful static knowledge sources. It is equally imperative to
Incremental learning of transfer rules for customized machine translation
- Proc. of the 15th Intl. Conf. on Applications of Declarative Programming and Knowledge Management
, 2005
Cited by 4 (4 self)
In this paper we present a machine translation system, which translates Japanese into German. We have developed a transfer-based architecture in which the transfer rules are learnt incrementally from translation examples provided by a user. This means that there are no handcrafted rules, but, on the contrary, the user can customize the system according to his own preferences. The translation system has been implemented by using Amzi! Prolog. This programming environment had the big advantage of offering sufficient scalability even for large lexicons and rule bases, powerful unification operations for the application of transfer rules, and full Unicode support for Japanese characters. Finally, the application programming interface to Visual Basic made it possible to design an embedded translation environment so that the user can use Microsoft Word to work with the Japanese text and invoke the translation features directly from within the text editor. We have integrated the machine translation system into a language learning environment for German-speaking language students to create a Personal Embedded Translation and Reading Assistant (PETRA).
Indirect anaphora resolution as semantic path search
- In K-CAP ’05: Proceedings of the 3rd international conference on Knowledge capture
, 2005
Cited by 4 (1 self)
Anaphora occur commonly in natural language text, and resolving them is essential for capturing the knowledge encoded in text. Indirect anaphora are especially challenging to resolve because the referring expression and the antecedent are related by unstated background knowledge. Such anaphora need to be resolved properly in order to automatically capture the knowledge expressed in natural language. Resolving indirect anaphora has been treated as a unique problem that requires special-purpose methods, and these methods have had limited success in precision and recall. In this study, we used a generic tool for finding semantic paths between two concepts to resolve these anaphora, and it achieved approximately twice the recall of the best previous system without loss of precision. A series of ablation studies showed that the biggest increase in recall came from an abductive stopping criterion of the search.
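The idea of finding a semantic path between two concepts can be illustrated with a minimal sketch. It is not the tool described in the abstract: the toy knowledge base `TOY_KB`, the relation names, the function `find_semantic_path`, and the plain depth limit `max_depth` are all assumptions made for illustration, and the depth limit is a simple stand-in rather than the abductive stopping criterion the paper reports.

```python
from collections import deque

# Hypothetical toy knowledge base: concept -> list of (relation, concept) edges.
TOY_KB = {
    "house": [("has-part", "door"), ("has-part", "roof")],
    "door": [("has-part", "handle")],
    "roof": [("made-of", "tile")],
    "handle": [],
    "tile": [],
}

def find_semantic_path(kb, source, target, max_depth=3):
    """Breadth-first search for a shortest relation path from source to target.

    Returns a list of (concept, relation, concept) steps, or None if no path
    exists within max_depth steps.
    """
    queue = deque([(source, [])])
    visited = {source}
    while queue:
        concept, path = queue.popleft()
        if concept == target:
            return path
        if len(path) >= max_depth:
            continue  # do not expand beyond the depth limit
        for relation, neighbor in kb.get(concept, []):
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append((neighbor, path + [(concept, relation, neighbor)]))
    return None

# "I entered the house. The handle was broken.": link "handle" back to "house".
path = find_semantic_path(TOY_KB, "house", "handle")
```

In this toy setting, a candidate antecedent counts as plausible when some path to the referring expression exists within the depth limit; a real system would also need to rank competing antecedents.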
An Ontological-Semantic Framework for Text Analysis
, 1997
Cited by 3 (0 self)
The Knowledge-Based Machine Translation paradigm requires a comprehensive analysis of input texts into an unambiguous machine-tractable representation of the propositional and meta-propositional meaning of that text, for which we use a particular framework referred to as ontological semantics. The work presented here begins with a definition of a representation language for lexical semantic specification (and syntax/semantics interface) to support such an analysis, as well as a generalized algorithm for building the meaning representation from these lexical semantic specifications, utilizing the ontology and a syntactic parse as knowledge sources. The core of the algorithm is an algorithm for semantic constraint satisfaction and relaxation, involving finding the best path over the ontology between a candidate filler of a relation and semantic constraints on that relation. The ontology is viewed as a multi-dimensional graph, with distinct topologies in each dimension reflecting specific semantic relations between nodes (representing concepts), where weights or arc distance reflects strength of semantic relatedness in context (where the path-so-far context is maintained in a state transition table).
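The best-path idea in this abstract can be sketched as a weighted shortest-path search. The sketch below uses Dijkstra's algorithm as a stand-in and is not the authors' algorithm: the ontology `TOY_ONTOLOGY`, its arc weights, and the functions `path_cost` and `best_filler` are hypothetical, and the context-dependent weighting via a state transition table is not modeled.

```python
import heapq

# Hypothetical toy ontology: concept -> {neighbor: arc weight}.
# A lower weight is assumed to mean a closer semantic relation.
TOY_ONTOLOGY = {
    "bank-institution": {"organization": 1.0},
    "bank-river": {"landform": 1.0},
    "organization": {"social-object": 1.0},
    "landform": {"physical-object": 1.0},
    "social-object": {"object": 2.0},
    "physical-object": {"object": 2.0},
    "object": {},
}

def path_cost(ontology, start, goal):
    """Dijkstra's algorithm: cheapest cumulative arc weight from start to goal.

    Returns float('inf') when goal is unreachable from start.
    """
    best = {start: 0.0}
    heap = [(0.0, start)]
    while heap:
        cost, node = heapq.heappop(heap)
        if node == goal:
            return cost
        if cost > best.get(node, float("inf")):
            continue  # stale heap entry
        for neighbor, weight in ontology.get(node, {}).items():
            new_cost = cost + weight
            if new_cost < best.get(neighbor, float("inf")):
                best[neighbor] = new_cost
                heapq.heappush(heap, (new_cost, neighbor))
    return float("inf")

def best_filler(ontology, candidates, constraint):
    """Pick the candidate sense with the cheapest path to the constraint concept."""
    return min(candidates, key=lambda c: path_cost(ontology, c, constraint))

# A relation whose filler is constrained to be a social-object prefers the
# institution sense of "bank" over the river sense.
chosen = best_filler(TOY_ONTOLOGY, ["bank-river", "bank-institution"], "social-object")
```

Choosing the lowest-cost candidate, instead of rejecting every candidate that fails a hard constraint, is one simple way to picture constraint relaxation: a filler that only loosely matches still receives a finite score as long as some path exists.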
Empirical Acquisition of Conceptual Distinctions via Dictionary Definitions
, 2004
Cited by 2 (0 self)
This thesis discusses the automatic acquisition of conceptual distinctions using empirical methods, with an emphasis on semantic relations. The goal is to improve semantic lexicons for computational linguistics, but the work can be applied to general-purpose knowledge bases as well.

