Results 1 - 4 of 4
ExtMiner: Combining Multiple Ranking and Clustering Algorithms for Structured Document Retrieval
- In: Proceedings of the International Workshop on Integrating Data Mining, Databases and Information Retrieval (IDDI’05), 16th International Workshop on Database and Expert Systems Applications, 2005
Abstract - Cited by 2 (1 self)
This paper introduces ExtMiner, a platform and potential tool for information management in small and medium-sized enterprises (SMEs) or organizational workgroups. ExtMiner supports interactive and iterative clustering of documents. It provides users with visual cluster and list views at the same time, supporting an iterative search process. ExtMiner may also be applied as a platform for research on retrieval fusion, since it combines search, clustering and visualization algorithms. ExtMiner was evaluated with three document collections. Although the findings were encouraging, the user interface and performance with large document repositories need further development.
Dynamic Tuning for Fusion: Harnessing Human Intelligence to Optimize System Performance
- Proceedings of the 9th World Multi-Conference on Systemics, Cybernetics and Informatics, 2005
Abstract - Cited by 1 (1 self)
This paper describes a Web search optimization study that investigates both static and dynamic tuning methods for optimizing system performance. The study shows that effectively combining multiple sources of evidence, via static and dynamic tuning of the fusion formula, enhances retrieval performance on the Web.
Fusion Approach to Finding Opinions in Blogosphere
Abstract - Cited by 1 (0 self)
In this paper, we describe a fusion approach to finding opinions about a given target in blog postings. We tackled the opinion blog retrieval task by breaking it down into two sequential subtasks: on-topic retrieval followed by opinion classification. Our opinion retrieval approach was to first apply traditional IR methods to retrieve on-topic blogs, and then boost the ranks of opinionated blogs using combined opinion scores generated by four opinion assessment methods. Our opinion module consists of four components: the Opinion Term Module, which identifies opinions based on the frequency of opinion terms (i.e., terms that occur frequently only in opinion blogs); the Rare Term Module, which uses uncommon/rare terms (e.g., “sooo good”) for opinion classification; the IU Module, which uses IU (I and you) collocations; and the Adjective-Verb Module, which uses computational linguistics' distribution similarity approach to learn subjective language from training data.
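The two-stage idea in this abstract (retrieve on-topic posts, then boost opinionated ones using a combined opinion score) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' method: the abstract does not give the fusion formula, so the weighted average, the multiplicative boost, the module weights, and all names (Post, WEIGHTS, combined_opinion, rerank) are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Post:
    doc_id: str
    topic_score: float               # score from the on-topic retrieval stage
    opinion_scores: Dict[str, float]  # per-module opinion scores, assumed in [0, 1]


# Hypothetical weights for the four opinion modules named in the abstract.
WEIGHTS = {"opinion_term": 0.4, "rare_term": 0.2, "iu": 0.2, "adj_verb": 0.2}


def combined_opinion(post: Post, weights: Dict[str, float] = WEIGHTS) -> float:
    """Weighted average of the module scores (an assumed fusion formula)."""
    total = sum(weights.values())
    weighted = sum(w * post.opinion_scores.get(name, 0.0)
                   for name, w in weights.items())
    return weighted / total


def rerank(posts: List[Post], boost: float = 0.5) -> List[Post]:
    """Boost each on-topic score by the combined opinion score, then sort."""
    return sorted(
        posts,
        key=lambda p: p.topic_score * (1.0 + boost * combined_opinion(p)),
        reverse=True,
    )


if __name__ == "__main__":
    neutral = {name: 0.0 for name in WEIGHTS}
    opinionated = {name: 1.0 for name in WEIGHTS}
    posts = [
        Post("a", 1.0, neutral),
        Post("b", 0.9, opinionated),
        Post("c", 0.5, opinionated),
    ]
    print([p.doc_id for p in rerank(posts)])  # ['b', 'a', 'c']
```

With boost set to 0, the ranking reduces to the on-topic ordering, so the parameter controls how strongly opinion evidence can reorder the first-stage results.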
Using Document Dimensions for Enhanced Information Retrieval
Abstract
Conventional document search techniques are constrained by attempting to match individual keywords or phrases to source documents. Thus, these techniques miss documents that contain semantically similar terms, thereby achieving a relatively low degree of recall. At the same time, processing capabilities and tools for syntactic and semantic analysis of language have advanced to the point where an index-time linguistic analysis of source documents is both feasible and realistic. In this paper, we introduce document dimensions, a means of classifying or grouping terms discovered in documents. Using an enhanced version of Jakarta Lucene [1], we demonstrate that supplementing keyword analysis with some syntactic and semantic information can indeed enhance the quality of information retrieval results.

