Syntactic contexts for finding semantically related words
- In CLIN
Cited by 1 (0 self)
Finding semantically related words is a first step in the direction of automatic ontology building. Guided by the view that similar words occur in similar contexts, we looked at the syntactic context of words to measure their semantic similarity. Words that occur in a direct object relation with the verb drink, for instance, have something in common (liquidity, ...). Co-occurrence data for common nouns and proper names, for several syntactic relations, was collected from an automatically parsed corpus of 78 million words of newspaper text. We used several vector-based methods to compute the distributional similarity between words. Using Dutch EuroWordNet as evaluation standard, we investigated which vector-based method and which combination of syntactic relations is the strongest predictor of semantic similarity.
Mining Syntactically Annotated Corpora with XQuery
Cited by 1 (1 self)
This paper presents a uniform approach to data extraction from syntactically annotated corpora encoded in XML. XQuery, which incorporates XPath, has been designed as a query language for XML. The combination of XPath and XQuery offers flexibility and expressive power, while corpus-specific functions can be added to reduce the complexity of individual extraction tasks. We illustrate our approach using examples from dependency treebanks for Dutch.
dr. G.J.M. van Noord Alfa-informatica
A large corpus of written Dutch texts (1,000,000 words) is syntactically annotated (manually corrected), based on D-COI. In addition, the full D-COI corpus is syntactically annotated automatically. The project aims to extend the available syntactically annotated corpora for Dutch both in size and with respect to the various text genres and topical domains. In addition, various browse and search tools for syntactically annotated corpora will be further developed and made available. Their potential for applications in corpus linguistics and information extraction will be illustrated and evaluated.
Extraction of Hypernymy Information from Text
We present the results of three different studies in extracting hypernymy information from text. In the first, we compare a method based on a single extraction pattern applied to the web with a set of patterns applied to a big corpus. In the second study, we examine how relation extraction can be performed reliably from a text without having access to a word sense tagger. And in a third experiment, we check what the effect of elaborate syntactic information is on the extraction process. We find that both using more data and the removal of ambiguities from the training data are beneficial to the extraction process. But to our surprise we were unable to find a positive effect of additional syntactic information.

