Automatic Discovery of Similar Words
, 2003
Cited by 6 (0 self)
We deal with the issue of automatic discovery of similar words (synonyms and near-synonyms) from different kinds of sources: from a large corpus of documents, from the Web, and from monolingual dictionaries. We present in detail three algorithms that extract similar words from a large corpus of documents and consider the specific case of the World Wide Web. We then describe a recent method of automatic synonym extraction in a monolingual dictionary. The method is based on an algorithm that computes similarity measures between vertices in graphs. We use the 1913 Webster's Dictionary and apply the method to four synonym queries. The results obtained are analyzed and compared with those obtained with two other methods.
Grouping Synonyms by Definitions
- in "Recent Advances in Natural Language Processing (RANLP), Borovets, Bulgaria", R. MITKOV (editor), University of Wolverhampton, UK; Institute for Parallel Processing, BAS, Bulgaria; Incoma Ltd, Shoumen
Cited by 1 (0 self)
We present a method for grouping the synonyms of a lemma according to its dictionary senses. The senses are defined by a large machine-readable dictionary for French, the TLFi (Trésor de la langue française informatisé), and the synonyms are given by 5 synonym dictionaries (also for French). To evaluate the proposed method, we manually constructed a gold standard where, for each (word, definition) pair and given the set of synonyms defined for that word by the 5 synonym dictionaries, 4 lexicographers specified the set of synonyms they judge adequate. While inter-annotator agreement on that task ranges from 67% to at best 88%, depending on the annotator pair and on the synonym dictionary being considered, the automatic procedure we propose scores a precision of 67% and a recall of 71%. The proposed method is compared with related work, namely word sense disambiguation, synonym lexicon acquisition, and WordNet construction.
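As a reading aid for the precision and recall figures quoted in this abstract, here is a minimal sketch of how the two scores are conventionally computed for a single (word, definition) pair. The function name `precision_recall` and the example synonym sets are hypothetical, not taken from the paper, and the paper's own aggregation over pairs and annotators is not described here.

```python
def precision_recall(proposed, gold):
    """Return (precision, recall) of a proposed synonym set against a gold set."""
    proposed, gold = set(proposed), set(gold)
    true_positives = len(proposed & gold)
    # Precision: share of proposed synonyms that the gold standard accepts.
    precision = true_positives / len(proposed) if proposed else 0.0
    # Recall: share of gold-standard synonyms that were proposed.
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical data for one (word, definition) pair.
proposed = {"maison", "demeure", "logis", "foyer"}
gold = {"demeure", "logis", "habitation"}
precision, recall = precision_recall(proposed, gold)
```

Here two of the four proposed synonyms are in the gold set, and two of the three gold synonyms were proposed.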
Language Resources and Evaluation (LREC), 2002. H. Baayen. Word frequency distributions. Kluwer Academic Publishers, 2001.
"... Linguistics (ACL), 2005. ..."
Automatic Lexico-Semantic Acquisition for Question Answering
organisation for scientific research. The work in this thesis has been carried out under the auspices
Thesaurus Extension using Web Search Engines
Maintaining and extending large thesauri is an important challenge facing digital libraries and IT businesses alike. In this paper we describe a method building on and extending existing methods from the areas of thesaurus maintenance, natural language processing, and machine learning to (a) extract a set of novel candidate concepts from text corpora and (b) generate a small ranked list of suggestions for the position of these concepts in an existing thesaurus. Based on a modification of the standard tf-idf term weighting, we extract relevant concept candidates from a document corpus. We then apply a pattern-based machine learning approach to content extracted from web search engine snippets to determine the type of relation between the candidate terms and existing thesaurus concepts. The approach is evaluated with a large-scale experiment using the MeSH and WordNet thesauri as a testbed.
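The abstract mentions a modification of standard tf-idf but does not say what the modification is. For orientation only, the following is a minimal sketch of the standard, unmodified weighting (relative term frequency times the log of inverse document frequency). The function name `tf_idf` and the toy corpus are assumptions for illustration and do not reproduce the paper's method.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Standard tf-idf. `documents` is a list of token lists.

    Returns one dict per document mapping each term to its weight.
    """
    n_docs = len(documents)
    # Document frequency: number of documents containing each term.
    doc_freq = Counter()
    for tokens in documents:
        doc_freq.update(set(tokens))

    weights = []
    for tokens in documents:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical toy corpus.
docs = [["thesaurus", "concept", "concept"], ["thesaurus", "search"]]
weights = tf_idf(docs)
```

A term that occurs in every document, such as "thesaurus" in the toy corpus, gets weight zero under this formulation, which is why terms concentrated in few documents rank higher as candidates.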
Paraphrase Alignment for Synonym Evidence Discovery Gintarė Grigonytė
We describe a new unsupervised approach for synonymy discovery by aligning paraphrases in monolingual domain corpora. For that purpose, we identify phrasal terms that convey most of the concepts within domains and adapt a methodology for the automatic extraction and alignment of paraphrases to identify paraphrase casts from which valid synonyms are discovered. Results obtained on two different domain corpora show that general synonyms as well as synonymic expressions can be identified with 67.27% precision.

