Learning Text Analysis Rules For Domain-Specific Natural Language Processing
, 1997
Cited by 32 (5 self)
An enormous amount of knowledge is needed to infer the meaning of unrestricted natural language. The problem can be reduced to a manageable size by restricting attention to a specific domain, which is a corpus of texts together with a predefined set of concepts that are of interest to that domain. Two widely different domains are used to illustrate this domain-specific approach. One domain is a collection of Wall Street Journal articles in which the target concept is management succession events: identifying persons moving into corporate management positions or moving out. A second domain is a collection of hospital discharge summaries in which the target concepts are various classes of diagnosis or symptom.
Using small random samples for the manual evaluation of statistical association measures
- COMPUTER SPEECH AND LANGUAGE
, 2004
Cited by 11 (1 self)
In this paper, we describe the empirical evaluation of statistical association measures for the extraction of lexical collocations from text corpora. We argue that the results of an evaluation experiment cannot easily be generalized to a different setting. Consequently, such experiments have to be carried out under conditions that are as similar as possible to the intended use of the measures. Finally, we show how an evaluation strategy based on random samples can reduce the amount of manual annotation work significantly, making it possible to perform many more evaluation experiments under specific conditions.
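The sampling idea in this abstract can be illustrated with a minimal sketch. It is not the authors' procedure, which the abstract does not spell out. It assumes a hypothetical helper `estimate_precision` that draws a simple random sample from a candidate list, asks an annotator function for a judgment on each sampled item, and reports the sample precision with a normal-approximation 95% confidence interval.

```python
import math
import random


def estimate_precision(candidates, is_true_positive, sample_size, seed=0):
    """Estimate the precision of a candidate list from a random sample.

    candidates: list of extracted items (e.g. collocation candidates).
    is_true_positive: callable standing in for the manual annotation step.
    Returns (estimate, lower bound, upper bound) of an approximate 95% interval.
    """
    if not candidates or sample_size <= 0:
        raise ValueError("need a non-empty candidate list and a positive sample size")
    rng = random.Random(seed)
    n = min(sample_size, len(candidates))
    sample = rng.sample(candidates, n)  # sampling without replacement
    hits = sum(1 for item in sample if is_true_positive(item))
    p = hits / n
    # Normal approximation to the binomial; a rough guide for small samples.
    half_width = 1.96 * math.sqrt(p * (1 - p) / n)
    return p, max(0.0, p - half_width), min(1.0, p + half_width)


if __name__ == "__main__":
    # Toy data: items 0..299 play the role of true collocations.
    toy_candidates = list(range(1000))
    print(estimate_precision(toy_candidates, lambda c: c < 300, sample_size=100))
```

Only the sampled items need manual annotation, which is where the saving in annotation effort comes from. The interval width shows how much precision is traded away for that saving.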
Covarying Collexemes in the Into-causative
- Empirical and Experimental Methods in Cognitive/Functional Research
, 2004
Cited by 6 (3 self)
In this paper we extend a 'single-slot' methodology developed in Stefanowitsch and Gries (2003) to the investigation of potential interactions between two slots and apply it to the into-causative. We show that such interactions exist, i.e. that cause and result predicates 'covary' systematically. We then consider two factors influencing this covariation: a cognitive ... (Footnote: The order of authors is arbitrary. The authors would like to thank Britta Mondorf and André Schäfer for supplying the raw data from The Guardian used in this study.)
CRYSTAL: Learning Domain-specific Text Analysis Rules
, 1996
Cited by 4 (0 self)
An enormous amount of knowledge is needed to infer the meaning of unrestricted natural language. The problem can be reduced to a manageable size by restricting attention to a predefined set of concepts in a specific domain. Two widely different domains are used to illustrate this domain-specific approach. One domain is a collection of Wall Street Journal articles in which the target concept is management succession events: identifying persons moving into and out of corporate management positions. A second domain is a collection of hospital discharge summaries in which the target concepts are various classes of diagnosis or symptom. The goal of an information extraction system is to identify references to the concept of interest for a particular domain. Each domain needs a set of text extraction rules based on the vocabulary, semantic classes, and writing style peculiar to the domain and the target concept. This paper presents CRYSTAL, an implemented system that automatically induces domain...
Personalized Web Search with Location Preferences
Cited by 3 (0 self)
As the amount of Web information grows rapidly, search engines must be able to retrieve information according to the user’s preference. In this paper, we propose a new web search personalization approach that captures the user’s interests and preferences in the form of concepts by mining search results and their clickthroughs. Due to the important role location information plays in mobile search, we separate concepts into content concepts and location concepts, and organize them into ontologies to create an ontology-based, multi-facet (OMF) profile to precisely capture the user’s content and location interests and hence improve the search accuracy. Moreover, recognizing the fact that different users and queries may have different emphases on content and location information, we introduce the notion of content and location entropies to measure the amount of content and location information associated with a query, and click content and location entropies to measure how much the user is interested in the content and location information in the results. Accordingly, we propose to define personalization effectiveness based on the entropies and use it to balance the weights between the content and location facets. Finally, based on the derived ontologies and personalization effectiveness, we train an SVM to adapt a personalized ranking function for re-ranking of future search. We conduct extensive experiments to compare the precision produced by our OMF profiles and that of a baseline method. Experimental results show that OMF improves the precision significantly compared to the baseline.
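The abstract does not give the exact definitions of its content and location entropies. As a rough illustration only, the sketch below assumes that each is a Shannon entropy over the concepts extracted for a query, computed separately for each facet. The function name `concept_entropy` and the toy concept lists are hypothetical.

```python
import math
from collections import Counter


def concept_entropy(concepts):
    """Shannon entropy (in bits) of the empirical distribution of concepts.

    concepts: iterable of concept labels extracted for one query, with
    repeats. A value of 0.0 means all mass sits on a single concept; larger
    values mean the query is spread over more concepts.
    """
    counts = Counter(concepts)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


if __name__ == "__main__":
    # Hypothetical concepts mined from the result snippets of one query.
    content_concepts = ["hotel", "hotel", "booking", "reviews"]
    location_concepts = ["paris", "paris", "paris", "paris"]
    print("content entropy:", concept_entropy(content_concepts))
    print("location entropy:", concept_entropy(location_concepts))
```

Under this assumption, a query whose results mention many different places would get a high location entropy, and one tied to a single place would get a low one.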
Personalized concept-based clustering of search engine queries
- IEEE Transactions on Knowledge and Data Engineering
, 2008
Cited by 3 (2 self)
A major problem of current Web search is that search queries are usually short and ambiguous, and thus are insufficient for specifying the precise user needs. To alleviate this problem, some search engines suggest terms that are semantically related to the submitted queries so that users can choose from the suggestions the ones that reflect their information needs. In this paper, we introduce an effective approach that captures the user’s conceptual preferences in order to provide personalized query suggestions. We achieve this goal with two new strategies. First, we develop online techniques that extract concepts from the web-snippets of the search result returned from a query and use the concepts to identify related queries for that query. Second, we propose a new two-phase personalized agglomerative clustering algorithm that is able to generate personalized query clusters. To the best of the authors’ knowledge, no previous work has addressed personalization for query suggestions. To evaluate the effectiveness of our technique, a Google middleware was developed for collecting clickthrough data to conduct experimental evaluation. Experimental results show that our approach has better precision and recall than the existing query clustering methods. Index Terms: Clickthrough, concept-based clustering, personalization, query clustering, search engine.
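The sketch below is a generic illustration of grouping queries by the concepts they share. It is not the two-phase personalized algorithm described in the abstract, whose details are not given here. It assumes each query already has a set of extracted concepts. Similarity is measured with the Jaccard coefficient, and a merged cluster is represented by the union of its concepts. The names `jaccard` and `cluster_queries` and the threshold value are hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity between two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_queries(query_concepts, threshold):
    """Greedy agglomerative clustering of queries by concept overlap.

    query_concepts: dict mapping a query string to its set of concepts.
    Repeatedly merges the most similar pair of clusters until the best
    similarity drops below `threshold`. Returns a list of sorted query lists.
    """
    clusters = [({q}, set(c)) for q, c in query_concepts.items()]
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                sim = jaccard(clusters[i][1], clusters[j][1])
                if best is None or sim > best[0]:
                    best = (sim, i, j)
        sim, i, j = best
        if sim < threshold:
            break
        merged = (clusters[i][0] | clusters[j][0], clusters[i][1] | clusters[j][1])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return [sorted(queries) for queries, _ in clusters]


if __name__ == "__main__":
    toy = {
        "apple pie": {"recipe", "dessert"},
        "apple tart": {"recipe", "dessert", "pastry"},
        "apple iphone": {"phone", "store"},
    }
    print(cluster_queries(toy, threshold=0.5))
```

A personalized variant would presumably weight concepts by the individual user's clickthroughs before computing similarity, but that step is left out here because the abstract does not describe it.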
E.E.: Supporting Temporal Question Answering: Strategies for Offline Data Collection
- International workshop on Inference in Computational Semantics
Extracting Case relations from Corpora
, 1997
Description of a system for the automatic acquisition of verbal case frames from corpora. The key target is to acquire domain-specific relations rather than the standard relationships found in general dictionaries. Results of experiments on Ecran (and other) corpora are reported. ECRAN LE 2110, Deliverable 2.4 (D-2.4), Restricted: Extracting Case Relations from Corpora. Authors: Roberto Basili, Maria-Teresa Pazienza, Paola Velardi (ANC and TV), Roberta Catizone, Robin Collier, Mark Stevenson, Yorick Wilks (SHE), Olivier Ansaldi, Alpha Luk, Barbara Vauthey (FRI), Jean-Michel Grandchamp (THO). 1 Introduction: In this document we describe a system for the automatic acquisition of verbal case frames from corpora. The lexicon is acknowledged as one of the major components of NLP and MT systems. It is broadly agreed that the most successful implementations of NLP-based systems so far have been those based on lex...

