Expansion of Multi-Word Terms for Indexing and Retrieval Using Morphology and Syntax
- In Proceedings of the 35th Annual Meeting of the ACL
, 1997
Cited by 33 (7 self)
A system for the automatic production of controlled index terms is presented using linguistically motivated techniques. This includes a finite-state part-of-speech tagger, a derivational morphological processor for analysis and generation, and a unification-based shallow-level parser using transformational rules over syntactic patterns. The contribution of this research is the successful combination of parsing over a seed term list coupled with derivational morphology to achieve greater coverage of multi-word terms for indexing and retrieval. Final results are evaluated for precision and recall, and implications for indexing and retrieval are discussed.
Empirical Observation of Term Variations and Principles for their Description
, 2000
Cited by 23 (0 self)
Contents
1 Introduction
1.1 Do terms vary?
1.2 A Symbolic Framework for the Study of Terminological Variation
2 The Most Common Types of English Two-word Terms
2.1 Adjective Noun (A N)
2.2 Noun Noun (N2 N1)
2.3 Noun Preposition Noun (N1 P N2)
3 Observing and Representing Term Variants
3.1 An Observation of Term Variants
3.2 A Two-level Lexico-syntactic Description of Terms
3.3 Two Families of Grammatical Rules ...
NLP for Term Variant Extraction: Synergy between Morphology, Lexicon, and Syntax
, 1999
Cited by 22 (1 self)
We present a natural language processing (NLP) approach to automatic indexing over controlled vocabulary which accounts for term variation. The approach combines a part-of-speech tagger, a generator of morphologically related forms, and a shallow transformational parser. The system is applied to the French language; it is trained on newspaper articles and tested on scientific literature. The precision of indexing on terms and variants is 97.2%, only slightly lower than that of indexing without accounting for term variation (99.7%). Recall of indexing on terms and variants (93.4%) is much higher than recall of indexing on term occurrences only (72.4%). Conflation of term variants increases indexing coverage up to 30%. The system is a convincing example of the potential synergy between full-fledged morphological analysis and local syntactic analysis. Many details are provided on the implementation of the system. Illustrative examples of syntactic transformations for the French language are given together with the theoretical and empirical methods for their formulation.
Authors: Christian Jacquemin and Evelyne Tzoukermann.
What Is The Tree That We See Through The Window: A Linguistic Approach To Windowing And Term Variation
Cited by 9 (4 self)
Windowing techniques play a key role in information retrieval. Previous works have suggested that the quality of access to information relies heavily on the characteristics of the windows. This study provides a linguistic approach to text windowing through an extraction of term variants with the help of a partial parser. The syntactic grounding of the method ensures that words observed within restricted spans are lexically related and that spurious word co-occurrences are ruled out with a good level of confidence. The system is computationally tractable on large corpora and large lists of terms. Illustrative examples of term variations from a large medical corpus are given. An experimental evaluation of the method shows that only a small proportion of co-occurring words are lexically related and motivates the call for natural language parsing techniques in text windowing.
1. INTRODUCTION The notion of text window -- a span of contiguous words within a document -- is crucial for severa...
A Symbolic and Surgical Acquisition of Terms Through Variation
- Connectionist, Statistical and Symbolic Approaches to Learning for Natural Language Processing
, 1996
Terminological acquisition is an important issue in learning for Natural Language Processing (NLP) due to the constant terminological renewal through technological changes. Terms play a key role in several NLP activities such as machine translation, automatic indexing or text understanding. In contrast to classical once-and-for-all approaches, this paper proposes an incremental process for terminological enrichment which operates on existing reference lists and large corpora. Candidate terms are acquired by extracting variants of reference terms through FASTR (FAst Syntactic Term Recognizer), a unification-based partial parser. As acquisition is performed within specific morphosyntactic contexts (coordinations, insertions or permutations of complex nominals), rich conceptual links are learned together with candidate terms. A clustering of terms related through coordinations yields classes of conceptually close terms while graphs resulting from insertions denote generic...
Dynamic Programming of Partial Parses
, 2001
Recent years have seen a renewal of interest in applying dynamic programming to natural language processing. The main advantage is the compactness of the representations, which is turning this paradigm into a common way of dealing with highly redundant computations related to phenomena such as non-determinism.
de la historia de revisiones de Wikipedia
In this article, we analyse the modifications available in the French Wikipedia revision history. We define a typology of modifications based on a detailed study of WiCoPaCo, a freely available resource built by automatically mining Wikipedia’s revision history. Based on this typology, we detail a manual annotation study of a subpart of the corpus aimed at assessing the difficulty of automatic paraphrase identification in such a corpus. Finally, we assess a rule-based paraphrase identification tool. Keywords: Wikipedia, revisions, paraphrase identification

