Dependency-based construction of semantic space models
Computational Linguistics, 2007
Cited by 79 (6 self)
Traditionally, vector-based semantic space models use word co-occurrence counts from large corpora to represent lexical meaning. In this article we present a novel framework for constructing semantic spaces that take syntactic relations into account. We introduce a formalization for this class of models which allows linguistic knowledge to guide the construction process. We evaluate our framework on a range of tasks relevant for cognitive science and natural language processing: semantic priming, synonymy detection and word sense disambiguation. In all cases, our framework obtains results that are comparable or superior to the state of the art.
Scientific Paper Summarization Using Citation Summary Networks
Cited by 26 (9 self)
Quickly moving to a new area of research is painful for researchers due to the vast amount of scientific literature in each field of study. One possible way to overcome this problem is to summarize a scientific topic. In this paper, we propose a model of summarizing a single article, which can be further used to summarize an entire topic. Our model is based on analyzing others’ viewpoint of the target article’s contributions and the study of its citation summary network using a clustering approach.
Learning to Detect Conversation Focus of Threaded Discussions
In Proceedings of HLT-NAACL 2006, 2006
Cited by 17 (2 self)
In this paper we present a novel feature-enriched approach that learns to detect the conversation focus of threaded discussions by combining NLP analysis and IR techniques. Using the graph-based algorithm HITS, we integrate different features such as lexical similarity, poster trustworthiness, and speech act analysis of human conversations with feature-oriented link generation functions. It is the first quantitative study to analyze human conversation focus in the context of online discussions that takes into account heterogeneous sources of evidence. Experimental results using a threaded discussion corpus from an undergraduate class show that it achieves significant performance improvements compared with the baseline system.
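The abstract above names HITS as its graph algorithm. As a rough illustration only, the following is a minimal sketch of plain, unweighted HITS on a reply graph. The message ids and the reply edges are hypothetical, and the paper's feature-based link generation functions are not reproduced here.

```python
import math


def _normalize(scores):
    """Scale scores to unit Euclidean length (no-op guard for all-zero input)."""
    norm = math.sqrt(sum(v * v for v in scores.values())) or 1.0
    return {node: v / norm for node, v in scores.items()}


def hits(edges, iterations=50):
    """Plain HITS on a directed graph given as (source, target) pairs."""
    nodes = sorted({node for edge in edges for node in edge})
    hub = {node: 1.0 for node in nodes}
    auth = {node: 1.0 for node in nodes}
    for _ in range(iterations):
        # Authority update: sum the hub scores of all nodes pointing here.
        auth = {node: 0.0 for node in nodes}
        for src, dst in edges:
            auth[dst] += hub[src]
        auth = _normalize(auth)
        # Hub update: sum the authority scores of all nodes pointed to.
        hub = {node: 0.0 for node in nodes}
        for src, dst in edges:
            hub[src] += auth[dst]
        hub = _normalize(hub)
    return hub, auth


# Hypothetical thread: m2, m3 and m4 reply to m1; m4 also replies to m3.
replies = [("m2", "m1"), ("m3", "m1"), ("m4", "m1"), ("m4", "m3")]
hub, auth = hits(replies)
focus = max(auth, key=auth.get)      # message with the highest authority
top_hub = max(hub, key=hub.get)      # message pointing at the best authorities
```

Treating the highest-authority message as the conversation focus is an assumption made for this sketch; a weighted variant would replace the unit edge contributions with feature-derived weights.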
Improving diversity in ranking using absorbing random walks
Physics Laboratory – University of Washington, 2007
Cited by 13 (1 self)
We introduce a novel ranking algorithm called GRASSHOPPER, which ranks items with an emphasis on diversity. That is, the top items should be different from each other in order to have a broad coverage of the whole item set. Many natural language processing tasks can benefit from such diversity ranking. Our algorithm is based on random walks in an absorbing Markov chain. We turn ranked items into absorbing states, which effectively prevents redundant items from receiving a high rank. We demonstrate GRASSHOPPER’s effectiveness on extractive text summarization: our algorithm ranks between the 1st and 2nd systems on DUC 2004 Task 2; and on a social network analysis task that identifies movie stars of the world.
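The idea of turning ranked items into absorbing states can be sketched as follows. This is a simplified illustration, not the GRASSHOPPER implementation: it assumes the first item is chosen by the stationary distribution of the walk and later items by the expected number of visits before absorption, and it omits any prior ranking or teleporting term. The similarity matrix `sim` is hypothetical.

```python
def normalize_rows(weights):
    """Turn a non-negative similarity matrix into a row-stochastic matrix."""
    return [[w / sum(row) for w in row] for row in weights]


def stationary(P, steps=200):
    """Approximate the stationary distribution by power iteration."""
    n = len(P)
    dist = [1.0 / n] * n
    for _ in range(steps):
        dist = [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]
    return dist


def expected_visits(P, absorbed, steps=500):
    """Expected visits to each non-absorbed state, starting uniformly among
    them, before the walk falls into an absorbed state (truncated sum)."""
    free = [i for i in range(len(P)) if i not in absorbed]
    dist = {i: 1.0 / len(free) for i in free}
    visits = {i: 0.0 for i in free}
    for _ in range(steps):
        for i in free:
            visits[i] += dist[i]
        # Mass that steps into an absorbed state is dropped from the walk.
        dist = {j: sum(dist[i] * P[i][j] for i in free) for j in free}
    return visits


def diverse_rank(weights, k):
    """Pick k items, making each picked item absorbing before the next pick."""
    P = normalize_rows(weights)
    pi = stationary(P)
    ranked = [max(range(len(P)), key=lambda i: pi[i])]
    while len(ranked) < k:
        visits = expected_visits(P, set(ranked))
        ranked.append(max(visits, key=visits.get))
    return ranked


# Hypothetical similarities: items 0-2 form one cluster, items 3-4 another.
sim = [
    [1.0, 0.9, 0.9, 0.1, 0.1],
    [0.9, 1.0, 0.8, 0.1, 0.1],
    [0.9, 0.8, 1.0, 0.1, 0.1],
    [0.1, 0.1, 0.1, 1.0, 0.9],
    [0.1, 0.1, 0.1, 0.9, 1.0],
]
ranked = diverse_rank(sim, k=2)
```

In this toy input the second pick comes from the smaller cluster, because walks starting near the already-ranked item are absorbed quickly and contribute few visits to its neighbours.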
A system for query-specific document summarization
In 15th ACM International Conference on Information and Knowledge Management (CIKM, 2006
Cited by 8 (1 self)
There has been a great amount of work on query-independent summarization of documents. However, due to the success of Web search engines, query-specific document summarization (query result snippets) has become an important problem, which has received little attention. We present a method to create query-specific summaries by identifying the most query-relevant fragments and combining them using the semantic associations within the document. In particular, we first add structure to the documents in the preprocessing stage and convert them to document graphs. Then, the best summaries are computed by calculating the top spanning trees on the document graphs. We present and experimentally evaluate efficient algorithms that support computing summaries in interactive time. Furthermore, the quality of our summarization method is compared to current approaches using a user survey.
Extending answers using discourse structure
In RANLP Workshop on Crossing Barriers in Text Summarization Research, 2005
Cited by 7 (4 self)
Research on Question Answering is focused mainly on classifying the question type and finding the answer, while presenting the answer in a way that suits the user’s needs has received little attention. This paper shows how existing question answering systems can be improved by exploiting Rhetorical Structure Theory-based summarization techniques in order to extract more than just the exact answer from the document in which the answer resides. The output is an extensive answer, which also provides additional information related to the question, and which may give the user an opportunity to assess the accuracy of the answer (is this what I am looking for?). A first experiment confirms that the proposed summarization method performs better than a baseline summarization method.
Using Citations to Generate Surveys of Scientific Paradigms
Cited by 7 (2 self)
The number of research publications in various disciplines is growing exponentially. Researchers and scientists are increasingly finding themselves in the position of having to quickly understand large amounts of technical material. In this paper we present the first steps in producing an automatically generated, readily consumable, technical survey. Specifically we explore the combination of citation information and summarization techniques. Even though prior work (Teufel et al., 2006) argues that citation text is unsuitable for summarization, we show that in the framework of multi-document survey creation, citation texts can play a crucial role.
Enhancing Single-document Summarization by Combining RankNet and Third-party Sources
Cited by 6 (1 self)
We present a new approach to automatic summarization based on neural nets, called NetSum. We extract a set of features from each sentence that helps identify its importance in the document. We apply novel features based on news search query logs and Wikipedia entities. Using the RankNet learning algorithm, we train a pair-based sentence ranker to score every sentence in the document and identify the most important sentences. We apply our system to documents gathered from CNN.com, where each document includes highlights and an article. Our system significantly outperforms the standard baseline in the ROUGE-1 measure on over 70% of our document set.
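A pair-based ranker of the kind mentioned above can be sketched with the RankNet-style pairwise logistic loss, log(1 + exp(-(s_better - s_worse))). This sketch swaps the neural net for a linear scorer to stay short; the two features and the preference pairs are hypothetical and are not taken from NetSum.

```python
import math


def score(weights, features):
    """Linear sentence score (a stand-in for the neural scorer)."""
    return sum(w * x for w, x in zip(weights, features))


def train_pairwise(pairs, n_features, lr=0.1, epochs=200):
    """Stochastic gradient descent on the pairwise logistic loss.

    pairs: list of (better_features, worse_features) tuples.
    """
    weights = [0.0] * n_features
    for _ in range(epochs):
        for better, worse in pairs:
            diff = [b - c for b, c in zip(better, worse)]
            margin = score(weights, diff)
            # d/dw log(1 + exp(-margin)) = -diff / (1 + exp(margin))
            step = lr / (1.0 + math.exp(margin))
            weights = [w + step * d for w, d in zip(weights, diff)]
    return weights


# Hypothetical features per sentence: [position score, query-log overlap].
sentences = {"s0": [1.0, 0.8], "s1": [0.5, 0.1], "s2": [0.2, 0.6]}
# Hypothetical preferences: s0 over s1, s0 over s2, s2 over s1.
pairs = [
    (sentences["s0"], sentences["s1"]),
    (sentences["s0"], sentences["s2"]),
    (sentences["s2"], sentences["s1"]),
]
weights = train_pairwise(pairs, n_features=2)
ranking = sorted(sentences, key=lambda s: score(weights, sentences[s]), reverse=True)
```

Training on pairs rather than absolute labels means only the relative order of sentences within a document has to be annotated.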
Discovering overlapping communities of named entities
Knowledge Discovery in Databases: PKDD 2006 (LNCS 4213, 2006
Cited by 5 (0 self)
Although community discovery based on social network analysis has been studied extensively in the Web hyperlink environment, limited research has been done in the case of named entities in text documents. The co-occurrence of entities in documents usually implies some connections among them. Investigating such connections can reveal important patterns. In this paper, we mine communities among named entities in Web documents and text corpora. Most existing work on community discovery generates a partition of the entity network, assuming each entity belongs to one community. However, in the scenario of named entities, an entity may participate in several communities. For example, a person is in the communities of his/her family, colleagues, and friends. In this paper, we propose a novel technique to mine overlapping communities of named entities. This technique is based on triangle formation, expansion, and clustering with content similarity. Our experimental results show that the proposed technique is highly effective.
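The triangle formation step can be illustrated with a small sketch that builds an entity co-occurrence graph and enumerates its triangles as candidate community seeds. The documents and entity names are hypothetical, and the expansion and content-similarity clustering steps named in the abstract are not reproduced.

```python
from itertools import combinations


def cooccurrence_graph(docs):
    """Link two entities whenever they appear in the same document."""
    adj = {}
    for entities in docs:
        for a, b in combinations(sorted(set(entities)), 2):
            adj.setdefault(a, set()).add(b)
            adj.setdefault(b, set()).add(a)
    return adj


def triangles(adj):
    """Return every triangle once, as a sorted tuple of entity names."""
    found = set()
    for a in adj:
        for b, c in combinations(sorted(adj[a]), 2):
            if c in adj[b]:
                found.add(tuple(sorted((a, b, c))))
    return sorted(found)


# Hypothetical entity lists extracted from four documents.
docs = [
    ["Alice", "Bob", "Carol"],
    ["Alice", "Dave"],
    ["Dave", "Erin"],
    ["Alice", "Erin"],
]
seeds = triangles(cooccurrence_graph(docs))
shared = set(seeds[0]) & set(seeds[1])
```

Because the seeds are triangles rather than a partition of the nodes, a single entity (here "Alice") can belong to more than one candidate community.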
Mining Community Structure of Named Entities from Free Text
In Conference on Information and Knowledge Management, 2005
Cited by 4 (0 self)
Although community discovery based on social networks has been studied extensively in the Web hyperlink environment, limited research has been done in the case of Web documents. The co-occurrence of words and entities in sentences and documents usually implies some connections among them. Studying such connections may reveal important relationships. In this paper, we investigate the co-occurrences of named entities in Web pages and blogs, and mine communities among those entities. We show that identifying communities in such an environment can be transformed into a graph clustering problem. A hierarchical clustering algorithm is then proposed, which exploits triangle structures within the graph and the mutual information between vertices. Our empirical study shows that the proposed algorithm is promising in discovering communities from Web documents.

