Sentence Similarity Based on Semantic Nets and Corpus Statistics
- James O’Shea, Zuhair Bandar and Keeley Crockett
- Cited by 18 (2 self)
Sentence similarity measures play an increasingly important role in text-related research and applications in areas such as text mining, web page retrieval and dialogue systems. Existing methods for computing sentence similarity have been adopted from approaches used for long text documents. These methods process sentences in a very high dimensional space and are consequently inefficient, require human input and are not adaptable to some application domains. This paper focuses directly on computing the similarity between very short texts of sentence length. It presents an algorithm that takes account of semantic information and word order information implied in the sentences. The semantic similarity of two sentences is calculated using information from a structured lexical database and from corpus statistics. The use of a lexical database enables our method to model human common sense knowledge and the incorporation of corpus statistics allows our method to be adaptable to different domains. The proposed method can be used in a variety of applications that involve text knowledge representation and discovery. Experiments on two sets of selected sentence pairs demonstrate that the proposed method provides a similarity measure that shows a significant correlation to human intuition.
Combining Symbolic and Distributional Models of Meaning
- Proceedings of AAAI Spring Symposium on Quantum Interaction
, 2007
- Cited by 17 (2 self)
There are two main approaches to the representation of meaning in Computational Linguistics: a symbolic approach and a distributional approach. This paper considers the fundamental question of how these approaches might be combined. The proposal is to adapt a method from the Cognitive Science literature, in which symbolic and connectionist representations are combined using tensor products. Possible applications of this method for language processing are described. Finally, a potentially fruitful link between Quantum Mechanics, Computational Linguistics, and other related areas such as Information Retrieval and Machine Learning, is proposed.
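As a rough illustration of the tensor-product idea mentioned in this abstract, the sketch below binds symbolic role vectors to distributional filler vectors with an outer product and sums the results. It is a minimal sketch and not the paper's own formulation: the role names (`SUBJ`, `OBJ`), the helper functions, and the vector values are all hypothetical, and the roles are assumed to be orthonormal so that unbinding is exact.

```python
def outer(role, filler):
    """Tensor (outer) product binding a role vector to a filler vector."""
    return [[r * f for f in filler] for r in role]


def add(a, b):
    """Element-wise sum of two equally shaped matrices."""
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


def unbind(tensor, role):
    """Recover the filler bound to `role` (assumes orthonormal role vectors)."""
    width = len(tensor[0])
    return [sum(role[i] * tensor[i][j] for i in range(len(role)))
            for j in range(width)]


# Hypothetical symbolic roles, chosen to be orthonormal.
SUBJ = [1.0, 0.0]
OBJ = [0.0, 1.0]

# Hypothetical distributional word vectors (made-up numbers).
dog = [0.9, 0.1, 0.3]
cat = [0.2, 0.8, 0.4]

# One structure holding "dog" in subject position and "cat" in object position.
sentence = add(outer(SUBJ, dog), outer(OBJ, cat))

print(unbind(sentence, SUBJ))  # the vector that was bound to the subject role
print(unbind(sentence, OBJ))   # the vector that was bound to the object role
```

With orthonormal roles, unbinding is a projection of the combined tensor onto a role, which returns the filler that was bound to it. With non-orthogonal roles the recovered vectors would be mixtures of the fillers.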
Supporting Component-Based Software Development with Active Component Repository Systems
, 2001
An exploration of statistical models of automated test case generation
- In Proceedings of the Third International Workshop on Dynamic Analysis
, 2005
- Cited by 15 (1 self)
In this paper, we develop methods that use logged user data to build models of a web application. Logged user data captures dynamic behavior of an application that can be useful for addressing the challenging problems of testing web applications. Our approach automatically builds statistical models of user sessions and automatically derives test cases from these models. We provide several alternative modeling approaches based on statistical machine learning methods. We investigate the effectiveness of the test suites generated from our methods by performing a preliminary study that evaluates the generated test cases. The results of this study demonstrate that our techniques are able to generate test cases that achieve high coverage and accurately model user behavior. This study provides insights into improving our methods and motivates a larger study with a more diverse set of applications and testing metrics.
The Knowledge Required to Interpret Noun Compounds
- Cited by 12 (2 self)
Noun compound interpretation is the task of determining the semantic relations among the constituents of a noun compound. For example, “concrete floor” means a floor made of concrete, while “gymnasium floor” is the floor region of a gymnasium. We would like to enable knowledge acquisition systems to interpret noun compounds, as part of their overall task of translating imprecise and incomplete information into formal representations that support automated reasoning. However, if interpreting noun compounds requires detailed knowledge of the constituent nouns, then it may not be worth doing: the cost of acquiring this knowledge may outweigh the potential benefit. This paper describes an empirical investigation of the knowledge required to interpret noun compounds. It concludes that the axioms and ontological distinctions important for this task are derived from the top levels of a hierarchical knowledge base (KB); detailed knowledge of specific nouns is less important. This is good news, not only for our work on knowledge acquisition systems, but also for research on text understanding, where noun compound interpretation has a long history.
Summarizing emails with conversational cohesion and subjectivity
- In ACL-08: HLT: Proceedings of the 46th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies
, 2008
- Cited by 12 (1 self)
In this paper, we study the problem of summarizing email conversations. We first build a sentence quotation graph that captures the conversation structure among emails. We adopt three cohesion measures: clue words, semantic similarity and cosine similarity as the weight of the edges. Second, we use two graph-based summarization approaches, Generalized ClueWordSummarizer and Page-Rank, to extract sentences as summaries. Third, we propose a summarization approach based on subjective opinions and integrate it with the graph-based ones. The empirical evaluation shows that the basic clue words have the highest accuracy among the three cohesion measures. Moreover, subjective words can significantly improve accuracy.
Word sense disambiguation in queries
- In ACM Conference on Information and Knowledge Management (CIKM 2005)
, 2005
- Cited by 10 (0 self)
This paper presents a new approach to determine the senses of words in queries by using WordNet. In our approach, noun phrases in a query are determined first. For each word in the query, information associated with it, including its synonyms, hyponyms, hypernyms, definitions of its synonyms and hyponyms, and its domains, can be used for word sense disambiguation. By comparing these pieces of information associated with the words which form a phrase, it may be possible to assign senses to these words. If the above disambiguation fails, then other query words, if any exist, are used, by going through exactly the same process. If the sense of a query word cannot be determined in this manner, then a guess of the sense of the word is made, if the guess has at least a 50% chance of being correct. If no sense of the word has a 50% or higher chance of being used, then we apply a Web search to assist in the word sense disambiguation process. Experimental results show that our approach has 100% applicability and 90% accuracy on the most recent robust track of TREC collection of 250 queries. We combine this disambiguation algorithm with our retrieval system to examine the effect of word sense disambiguation in text retrieval. Experimental results show that the disambiguation algorithm together with other components of our retrieval system yields a result which is 13.7% above that produced by the same system but without the disambiguation, and 9.2% above that produced by using Lesk’s algorithm. Our retrieval effectiveness is 7% better than the best reported result in the literature.
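The fallback order described in this abstract (phrase-internal comparison, then other query words, then a guess at 50% or better, then Web search) can be sketched as a simple cascade. This is only an illustrative outline under stated assumptions: the function `disambiguate`, its parameter names, the sense labels, and the stub strategies are hypothetical, and the real comparison of WordNet information is not shown.

```python
def disambiguate(word, context, by_phrase, by_other_words, sense_odds, by_web,
                 threshold=0.5):
    """Try each strategy in order and return (sense, name_of_stage_that_decided)."""
    # Stages 1 and 2: compare lexical information within the phrase,
    # then against the remaining query words. Each returns a sense or None.
    for stage, strategy in (("phrase", by_phrase), ("other_words", by_other_words)):
        sense = strategy(word, context)
        if sense is not None:
            return sense, stage

    # Stage 3: guess the likeliest sense, but only if its estimated
    # chance of being correct reaches the threshold (50% in the abstract).
    odds = sense_odds(word)
    if odds:
        best = max(odds, key=odds.get)
        if odds[best] >= threshold:
            return best, "guess"

    # Stage 4: fall back to Web search.
    return by_web(word, context), "web"


# Hypothetical stubs standing in for the real strategies.
def no_answer(word, context):
    return None


def made_up_odds(word):
    return {"bank#1": 0.45, "bank#2": 0.55}


def web_stub(word, context):
    return "bank#web"


result = disambiguate("bank", ["river", "bank"],
                      no_answer, no_answer, made_up_odds, web_stub)
print(result)
```

Passing the strategies in as functions keeps the ordering logic separate from how each stage is actually computed, which makes the cascade easy to test with stubs.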
Pragmatics and Computational Linguistics
- Handbook of Pragmatics
, 2003
- Cited by 10 (1 self)
Introduction: These days there's a computational version of everything. Computational biology, computational musicology, computational archaeology, and so on, ad infinitum. Even movies are going digital. This chapter, as you might have guessed by now, thus explores the computational side of pragmatics. Computational pragmatics might be defined as the computational study of the relation between utterances and context. Like other kinds of pragmatics, this means that computational pragmatics is concerned with indexicality, with the relation between utterances and action, with the relation between utterances and discourse, and with the relationship between utterances and the place, time, and environmental context of their being uttered. As Bunt and Black (2000) point out, computational pragmatics, like pragmatics in general, is especially concerned with INFERENCE. Four core inferential problems in pragmatics have received the most attention in the computational com

