Using Multiple Knowledge Sources for Word Sense Discrimination
- Computational Linguistics, 1992
Cited by 95 (1 self)
This paper addresses the problem of how to identify the intended meaning of individual words in unrestricted texts, without necessarily having access to complete representations of sentences. To discriminate senses, an understander can consider a diversity of information, including syntactic tags, word frequencies, collocations, semantic context, role-related expectations, and syntactic restrictions. However, current approaches make use of only small subsets of this information. Here we will describe how to use the whole range of information. Our discussion will include how the preference cues relate to general lexical and conceptual knowledge and to more specialized knowledge of collocations and contexts. We will describe a method of combining cues on the basis of their individual specificity, rather than a fixed ranking among cue-types. We will also discuss an application of the approach in a system that computes sense tags for arbitrary texts, even when it is unable to determine a single syntactic or semantic representation for some sentences.
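The idea of weighting cues by their individual specificity, rather than by a fixed ranking of cue types, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `Cue` type, the numeric `specificity` scores, and the additive combination in `pick_sense` are hypothetical and are not taken from the paper.

```python
from collections import defaultdict
from typing import NamedTuple


class Cue(NamedTuple):
    kind: str           # illustrative label, e.g. "collocation" or "frequency"
    sense: str          # the sense this cue prefers
    specificity: float  # hypothetical score: higher means more specific evidence


def pick_sense(cues: list[Cue]) -> str:
    """Sum specificity per sense and return the best-supported sense."""
    support: dict[str, float] = defaultdict(float)
    for cue in cues:
        support[cue.sense] += cue.specificity
    # Ties are broken alphabetically so the result is deterministic.
    return min(support, key=lambda s: (-support[s], s))


cues = [
    Cue("frequency", "bank/finance", 0.2),
    Cue("syntactic_tag", "bank/finance", 0.1),
    Cue("collocation", "bank/river", 0.9),  # e.g. the phrase "river bank"
]
chosen = pick_sense(cues)
```

In this toy input the single, highly specific collocation cue outweighs two weaker general cues, which a fixed ranking by cue type would not necessarily allow.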
Towards text knowledge engineering
- In AAAI/IAAI, 1998
Cited by 83 (10 self)
We introduce a methodology for automating the maintenance of domain-specific taxonomies based on natural language text understanding. A given ontology is incrementally updated as new concepts are acquired from real-world texts. The acquisition process is centered around the linguistic and conceptual “quality” of various forms of evidence underlying the generation and refinement of concept hypotheses. On the basis of the quality of evidence, concept hypotheses are ranked according to credibility and the most credible ones are selected for assimilation into the domain knowledge base.
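The ranking step described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the `Hypothesis` class, the numeric evidence scores, and the use of a mean as the credibility measure are hypothetical; the abstract does not say how credibility is actually computed.

```python
from dataclasses import dataclass, field


@dataclass
class Hypothesis:
    concept: str  # candidate concept for the newly encountered item
    evidence: list[float] = field(default_factory=list)  # hypothetical quality scores in [0, 1]

    @property
    def credibility(self) -> float:
        # Assumption: credibility is the mean evidence quality; 0.0 with no evidence.
        return sum(self.evidence) / len(self.evidence) if self.evidence else 0.0


def most_credible(hypotheses: list[Hypothesis], keep: int = 1) -> list[Hypothesis]:
    """Rank hypotheses by credibility (highest first) and keep the top `keep`."""
    ranked = sorted(hypotheses, key=lambda h: h.credibility, reverse=True)
    return ranked[:keep]


candidates = [
    Hypothesis("Printer", [0.9, 0.8]),
    Hypothesis("Notebook", [0.4]),
    Hypothesis("Company", []),
]
selected = most_credible(candidates)
```

Only the hypotheses returned by `most_credible` would be passed on for assimilation into the knowledge base; the rest are discarded or kept for later refinement.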
AbstFinder, A Prototype Natural Language Text Abstraction Finder for Use in Requirements Elicitation
- Automated Software Engineering, 1997
Cited by 42 (0 self)
Abstraction identification is named as a key problem in requirements analysis. Typically, the abstractions must be found among the large mass of natural language text collected from the clients and users. This paper motivates and describes a new approach, based on traditional signal processing methods, for finding abstractions in natural language text, and offers a new tool, AbstFinder, as an implementation of this approach. The advantages and disadvantages of the approach and the design of the tool are discussed in detail. Various scenarios for use of the tool are offered. Some of these scenarios were used in a case study of the effectiveness of the tool on an industrial-strength example of finding abstractions in a request for proposals.
Ultra-Summarization: A Statistical Approach to Generating Highly Condensed Non-Extractive Summaries
- In SIGIR99, 1999
Cited by 41 (0 self)
Using current extractive summarization techniques, it is impossible to produce a coherent document summary shorter than a single sentence, or to produce a summary that conforms to particular stylistic constraints. Ideally, one would prefer to understand the document, and to generate an appropriate summary directly from the results of that understanding. Absent a comprehensive natural language understanding system, an approximation must be used. This paper presents an alternative statistical model of a summarization process, which jointly applies statistical models of the term selection and term ordering process to produce brief coherent summaries in a style learned from a training corpus.
1 Introduction
Summarization is one of the most important capabilities required in writing. Effective summarization, like effective writing, is neither easy nor innate; rather, it is a skill that is developed through instruction and practice [Hidi and Anderson, 1986; Hooper et al., 1994]. Generating...
Generating Indicative-Informative Summaries with SumUM
- Computational Linguistics, 2002
Cited by 28 (7 self)
We present and evaluate SumUM, a text summarization system that takes a raw technical text as input and produces an indicative-informative summary. The indicative part of the summary identifies the topics of the document, and the informative part elaborates on some of these topics according to the reader’s interest. SumUM motivates the topics, describes entities, and defines concepts. It is a first step in exploring the issue of dynamic summarization. This is accomplished through a process of shallow syntactic and semantic analysis, concept identification, and text regeneration. Our method was developed through the study of a corpus of abstracts written by professional abstractors. Relying on human judgment, we have evaluated indicativeness, informativeness, and text acceptability of the automatic summaries. The results thus far indicate good performance when compared with other summarization technologies.
Automatically Generating Hypertext By Computing Semantic Similarity
- 1997
Cited by 26 (3 self)
We describe a novel method for automatically generating hypertext links within and between newspaper articles. The method is based on lexical chaining, a technique for extracting the sets of related words that occur in texts. Links between the paragraphs of a single article are built by considering the distribution of the lexical chains in that article. Links between articles are built by considering how the chains in the two articles are related. By using lexical chaining we mitigate the problems of synonymy and polysemy that plague traditional information retrieval approaches to automatic hypertext generation. In order to motivate our research, we discuss the results of a study that shows that humans are inconsistent when assigning hypertext links within newspaper articles. Even if humans were consistent, the time needed to build a large hypertext and the costs associated with the production of such a hypertext make relying on human linkers an untenable decision. Thus we are left to ...
Tagging for Learning: Collecting Thematic Relations from Corpus
- 1990
Cited by 20 (0 self)
Recent work in text analysis has suggested that data on words that frequently occur together reveal important information about text content. Co-occurrence relations can serve two main purposes in language processing. First, the statistics of co-occurrence have been shown to produce accurate results in syntactic analysis. Second, the way that words appear together can help in assigning thematic roles in semantic interpretation. This paper discusses a method for collecting co-occurrence data, acquiring lexical relations from the data, and applying these relations to semantic analysis.
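As a rough illustration of what collecting co-occurrence data can look like, the sketch below counts word pairs that appear within a small window of each other. It is a generic windowed count, not the paper's tagging-based method; the function name `cooccurrence_counts`, the window size, and the sample sentence are all assumptions made for this example.

```python
from collections import Counter


def cooccurrence_counts(tokens: list[str], window: int = 2) -> Counter:
    """Count unordered word pairs that appear within `window` tokens of each other."""
    counts: Counter = Counter()
    for i, left in enumerate(tokens):
        # Look ahead at most `window` tokens from position i.
        for right in tokens[i + 1 : i + 1 + window]:
            if left != right:
                # Sort the pair so (a, b) and (b, a) share one key.
                counts[tuple(sorted((left, right)))] += 1
    return counts


tokens = "the company acquired the firm and the company grew".split()
pairs = cooccurrence_counts(tokens, window=2)
```

Pairs with high counts, such as `("company", "the")` here, would then be filtered (for example, by dropping function words) before any lexical relation is inferred from them.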
Ontology Engineering Via Text Understanding
- In Proceedings of the 15th World Computer Congress ‘The Global Information Society on the Way to the Next Millennium’ (IFIP’98), 1998
Cited by 9 (0 self)
We introduce a methodology for automating the maintenance of domain-specific ontologies based on natural language text understanding. A given taxonomy is incrementally updated as new concepts are acquired from real-world texts. The acquisition process is centered around the linguistic and conceptual "quality" of various forms of evidence underlying the generation and refinement of concept hypotheses. On the basis of the quality of evidence, concept hypotheses are ranked according to credibility and the most credible ones are selected for assimilation into the domain knowledge base.

