Acquiring Lexical Knowledge For Anaphora Resolution
- In Proceedings of the 3rd Conference on Language Resources and Evaluation (LREC
, 2002
Cited by 43 (8 self)
The lack of adequate bases of commonsense or even lexical knowledge is perhaps the main obstacle to the development of high-performance, robust tools for semantic interpretation. It is also generally accepted that, notwithstanding the increasing availability in recent years of substantial hand-coded lexical resources such as WordNet and EuroWordNet, addressing the commonsense knowledge bottleneck will eventually require the development of effective techniques for acquiring such information automatically, e.g., from corpora. We discuss research aimed at improving the performance of anaphora resolution systems by acquiring the commonsense knowledge required to resolve the more complex cases of anaphora, such as bridging references. We focus in particular on the problem of acquiring information about part-of relations.
Evaluation Techniques for Automatic Semantic Extraction: Comparing Syntactic and Window Based Approaches
, 1993
Cited by 42 (0 self)
As large on-line corpora become more prevalent, a number of attempts have been made to automatically extract thesaurus-like relations directly from text using knowledge-poor methods. In the absence of any specific application, comparing the results of these attempts is difficult. Here we propose an evaluation method using gold standards, i.e., pre-existing hand-compiled resources, as a means of comparing extraction techniques. Using this evaluation method, we compare two semantic extraction techniques which produce similar word lists, one using syntactic context of words, and the other using windows of heuristically tagged words. The two techniques are very similar except that in one case selective natural language processing, a partial syntactic analysis, is performed. On a 4 megabyte corpus, syntactic contexts produce significantly better results against the gold standards for the most characteristic words in the corpus, while windows produce better results for rare words.
Determining the Specificity of Nouns From Text
, 1999
Cited by 26 (0 self)
In this work, we use a large text corpus to order nouns by their level of specificity. This semantic information can for most nouns be determined with over 80% accuracy using simple statistics from a text corpus without using any additional sources of semantic knowledge. This kind of semantic information can be used to help in automatically constructing or augmenting a lexical database such as WordNet.
Attribute-based and value-based clustering: An evaluation
- In: EMNLP ’04, ACL
, 2004
Cited by 22 (4 self)
In most research on concept acquisition from corpora, concepts are modeled as vectors of relations extracted from syntactic structures. In the case of modifiers, these relations often specify values of attributes, as in (attr red); this is unlike what is typically proposed in theories of knowledge representation, where concepts are typically defined in terms of their attributes (e.g., color). We compared models of concepts based on values with models based on attributes, using lexical clustering as the basis for comparison. We find that attribute-based models work better than value-based ones, and result in shorter descriptions; but that mixed models including both the best attributes and the best values work best of all.
Acquiring word-meaning mappings for natural language interfaces
- Journal of Artificial Intelligence Research
, 2003
Cited by 21 (7 self)
This paper focuses on a system, Wolfie (WOrd Learning From Interpreted Examples), that acquires a semantic lexicon from a corpus of sentences paired with semantic representations. The lexicon learned consists of phrases paired with meaning representations. Wolfie is part of an integrated system that learns to parse representations such as logical database queries. Experimental results are presented demonstrating Wolfie’s ability to learn useful lexicons for a database interface in four different natural languages. The usefulness of the lexicons learned by Wolfie is compared to that of lexicons acquired by a similar system developed by Siskind (1996), with results favorable to Wolfie. A second set of experiments demonstrates Wolfie’s ability to scale to larger and more difficult, albeit artificially generated, corpora. In natural language acquisition, it is difficult to gather the annotated data needed for supervised learning; however, unannotated data is fairly plentiful. Active learning methods (Cohn, Atlas, & Ladner, 1994) attempt to select for annotation and training only the most informative examples, and therefore are potentially very useful in natural language applications. However, most results to date for active learning have only considered standard classification tasks. To reduce annotation effort while maintaining accuracy, we apply active learning to semantic lexicons. We show that active learning can significantly reduce the number of annotated examples required to achieve a given level of performance.
Sextant: Exploring Unexplored Contexts For Semantic Extraction from Syntactic Analysis
- In Proceedings of the 30th Annual Meeting of the Association for Computational Linguistics, ACL
, 1992
Cited by 19 (1 self)
For a very long time, it has been considered that the only way of automatically extracting similar groups of words from a text collection for which no semantic information exists is to use document co-occurrence data. But, with robust syntactic parsers that are becoming more frequently available, syntactically recognizable phenomena about word usage can be confidently noted in large collections of texts. We present here a new system called SEXTANT which uses these parsers and the finer-grained contexts they produce to judge word similarity.
A Method for Refining Automatically-Discovered Lexical Relations: Combining Weak Techniques for Stronger Results
- Statistically-based natural language programming techniques, Proc. AAAI Workshop, AAAI Press, Menlo Park, CA
, 1992
Cited by 10 (0 self)
Knowledge-poor corpus-based approaches to natural language processing are attractive in that they do not incur the difficulties associated with complex knowledge bases and real-world inferences. However, these kinds of language processing techniques in isolation often do not suffice for a particular task; for this reason we are interested in finding ways to combine various techniques and improve their results. Accordingly, we conducted experiments to refine the results of an automatic lexical discovery technique by making use of a statistically-based syntactic similarity measure. The discovery program uses lexico-syntactic patterns to find instances of the hyponymy relation in large text bases. Once relations of this sort are found, they should be inserted into an existing lexicon or thesaurus. However, the terms in the relation may have multiple senses, thus hampering automatic placement. In order to address this problem we tried to make a term-similarity determination technique choos...
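As a rough illustration of what a lexico-syntactic pattern for hyponymy can look like, the sketch below matches one widely used pattern of this kind, "X such as A, B, and C", with a regular expression. It is not taken from the paper: the pattern, the function name `extract_hyponyms`, and the restriction to single-word terms are all assumptions made for brevity, and a real system would match noun phrases produced by a tagger or chunker.

```python
import re

# Separators inside the hyponym list; longer alternatives come first so that
# ", and " is not consumed as ", " followed by the word "and".
SEP = r"(?:, and |, or |, | and | or )"

# Assumed pattern: "<hypernym> such as <hyponym>[, <hyponym>]* [and|or <hyponym>]"
# Terms are single words here, which is a simplification.
SUCH_AS = re.compile(rf"\b(\w+),? such as ((?:\w+{SEP})*\w+)")


def extract_hyponyms(sentence):
    """Return (hyponym, hypernym) pairs matched by the 'such as' pattern."""
    pairs = []
    for match in SUCH_AS.finditer(sentence):
        hypernym = match.group(1)
        hyponyms = re.split(SEP, match.group(2))
        pairs.extend((hyponym, hypernym) for hyponym in hyponyms)
    return pairs


if __name__ == "__main__":
    print(extract_hyponyms("fruits such as apples, pears, and plums"))
```

Pairs found this way still carry the sense ambiguity the abstract mentions; the pattern only proposes candidate relations and does nothing to decide where they belong in an existing lexicon.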
Finding Attributes in the Web Using a Parser
- In Proceedings of Corpus Linguistics
, 2005
Cited by 2 (1 self)
In previous work, we found that a great deal of information about noun attributes can be extracted from the Web using simple text patterns, and that enriching vector-based models of concepts with this information about attributes led to drastic improvements in noun categorization. We extend this previous work by comparing concept descriptions extracted using patterns with descriptions extracted with a parser. Our results show that it is computationally more efficient to use simple text patterns than to parse text.
Domain modelling and NLP: Formal Ontologies? Lexica? Or a Bit of Both?
, 2005
Cited by 1 (0 self)
There are a number of genuinely open questions concerning the use of domain models in NLP. It would be great if contributors to Applied Ontology could help address them rather than adding to an already long polemical literature... 1 Empiricists vs. Formalists All Over Again? In virtually every workshop on ontologies, terminology, or lexical acquisition I attended in recent years at NLP events, I found myself watching (or getting involved in) fierce debates as to which approach to domain categorization is ‘best’: designing a clean, elegant ontology with a clear semantics and based on sound philosophical principles and / or scientific evidence; or relying on evidence from psychology and corpora, and on machine learning techniques, to acquire (automatically, as far as possible) a domain structure that in most cases will be rather messy. (The opposite sides of the argument are presented in (Wilks, 2002) and (Smith, 2004).) Are we back to the bad old days of the ‘neat’

