The SOMLib Digital Library System
- In Proc. Europ. Conf. on Research and Advanced Technology for Digital Libraries (ECDL99), 1999
Cited by 35 (16 self)
Digital libraries have gained tremendous interest, with several research projects addressing the wealth of challenges in this field. While computational intelligence systems are being used for specific tasks in this arena, the majority of projects rely on conventional techniques for the basic structure of the library itself. With the SOMLib project we created a digital library system that uses a neural network-based core for the representation of the library. The self-organizing map, a popular unsupervised neural network model, is used to topically structure a document collection similar to the organization of real-world libraries. Based on this core, additional modules provide information retrieval features, integrate distributed libraries, and automatically label the various topical sections in the document collection. A metaphor-graphics-based interface further helps the user understand the library intuitively by providing an instant overview. Keywords: Self-Organizing Map ...
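The abstract above describes using a self-organizing map to arrange documents by topic. The following is a minimal sketch of generic SOM training on small term-weight vectors, not the SOMLib implementation; the function names, grid size, learning-rate schedule, and Gaussian neighborhood are all assumptions chosen for illustration.

```python
import math
import random


def best_matching_unit(weights, vector):
    """Return the grid coordinate whose weight vector is closest to `vector`."""
    return min(
        weights,
        key=lambda unit: sum((a - b) ** 2 for a, b in zip(weights[unit], vector)),
    )


def train_som(vectors, rows, cols, epochs=50, lr0=0.5, radius0=None, seed=0):
    """Train a rows x cols map on `vectors` (hypothetical helper, not SOMLib's API)."""
    rng = random.Random(seed)
    dim = len(vectors[0])
    radius0 = radius0 or max(rows, cols) / 2.0
    # One randomly initialised weight vector per map unit.
    weights = {
        (r, c): [rng.random() for _ in range(dim)]
        for r in range(rows)
        for c in range(cols)
    }
    total_steps = epochs * len(vectors)
    step = 0
    for _ in range(epochs):
        order = vectors[:]
        rng.shuffle(order)
        for v in order:
            frac = step / total_steps
            lr = lr0 * (1.0 - frac)                     # learning rate decays linearly
            radius = max(radius0 * (1.0 - frac), 0.5)   # neighbourhood shrinks over time
            bmu = best_matching_unit(weights, v)
            for unit, w in weights.items():
                grid_d2 = (unit[0] - bmu[0]) ** 2 + (unit[1] - bmu[1]) ** 2
                h = math.exp(-grid_d2 / (2.0 * radius * radius))
                for i in range(dim):
                    # Pull the unit toward the input, scaled by map distance to the BMU.
                    w[i] += lr * h * (v[i] - w[i])
            step += 1
    return weights


# Toy "documents" as term-weight vectors: two about one topic, two about another.
docs = [[1.0, 0.9, 0.0, 0.0], [0.9, 1.0, 0.0, 0.1],
        [0.0, 0.1, 1.0, 0.9], [0.0, 0.0, 0.9, 1.0]]
som = train_som(docs, rows=3, cols=3)
for d in docs:
    print(d, "->", best_matching_unit(som, d))
```

Each document is shelved at its best-matching unit, so documents with similar term weights tend to land on the same or neighboring units of the map.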
Learning Implicit User Interest Hierarchy for Context in Personalization
- In Proc. of International Conference on Intelligent User Interface (IUI), 2003
Cited by 32 (4 self)
To provide a more robust context for personalization, we desire to extract a continuum of general (long-term) to specific (short-term) interests of a user. Our proposed approach is to learn a user interest hierarchy (UIH) from a set of web pages visited by a user. We devise a divisive hierarchical clustering (DHC) algorithm to group words (topics) into a hierarchy where more general interests are represented by a larger set of words. Each web page can then be assigned to nodes in the hierarchy for further processing in learning and predicting interests. This approach is analogous to building a subject taxonomy for a library catalog system and assigning books to the taxonomy. Our approach does not need user involvement and learns the UIH "implicitly." Furthermore, it allows the original objects, web pages, to be assigned to multiple topics (nodes in the hierarchy). In this paper, we focus on learning the UIH from a set of visited pages. We propose a few similarity functions and dynamic threshold-finding methods, and evaluate the resulting hierarchies according to their meaningfulness and shape.
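The abstract above outlines a divisive clustering of words with a similarity function and a dynamic threshold. The sketch below illustrates that general idea only; it is not the authors' DHC algorithm. The Jaccard-style page co-occurrence similarity, the mean-edge-weight threshold, and all function names are assumptions.

```python
from itertools import combinations


def cooccurrence_similarity(pages):
    """Similarity of two words = shared pages / pages containing either word (assumed measure)."""
    pages_with = {}
    for i, page in enumerate(pages):
        for word in set(page):
            pages_with.setdefault(word, set()).add(i)
    sim = {}
    for a, b in combinations(sorted(pages_with), 2):
        shared = len(pages_with[a] & pages_with[b])
        if shared:
            sim[(a, b)] = shared / len(pages_with[a] | pages_with[b])
    return sim


def components(words, edges):
    """Connected components of the graph on `words` with the given edges."""
    adj = {w: set() for w in words}
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    seen, comps = set(), []
    for start in sorted(words):
        if start in seen:
            continue
        stack, comp = [start], set()
        while stack:
            w = stack.pop()
            if w not in comp:
                comp.add(w)
                stack.extend(adj[w] - comp)
        seen |= comp
        comps.append(comp)
    return comps


def build_hierarchy(words, sim):
    """Recursively split a word cluster by dropping its weaker-than-average edges."""
    words = set(words)
    node = {"words": sorted(words), "children": []}
    edges = {p: s for p, s in sim.items() if p[0] in words and p[1] in words}
    if len(words) < 2 or not edges:
        return node
    # Dynamic threshold (assumption): the mean similarity inside this cluster.
    threshold = sum(edges.values()) / len(edges)
    strong = [p for p, s in edges.items() if s > threshold]
    comps = [c for c in components(words, strong) if len(c) > 1]
    if len(comps) == 1 and comps[0] == words:
        return node  # the cluster did not split, so stop here
    node["children"] = [build_hierarchy(c, sim) for c in comps]
    return node


pages = [["machine", "learning", "neural"], ["machine", "learning", "kernel"],
         ["jazz", "guitar", "music"], ["jazz", "music", "piano"],
         ["machine", "music"]]
root = build_hierarchy({w for p in pages for w in p}, cooccurrence_similarity(pages))
print([child["words"] for child in root["children"]])
```

The root node holds every word (the most general interest) and each child holds a smaller, more tightly related word set, which mirrors the general-to-specific continuum the abstract describes.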
Text Mining with Information Extraction
- AAAI 2002 Spring Symposium on Mining Answers from Texts and Knowledge Bases, 2002
Cited by 21 (0 self)
The popularity of the Web and the large number of documents available in electronic form have motivated the search for hidden knowledge in text collections. Consequently, there is growing research interest in the general topic of text mining. In this paper, we develop a text-mining system by integrating methods from Information Extraction (IE) and Data Mining (Knowledge Discovery from Databases or KDD). By utilizing existing IE and KDD techniques, text-mining systems can be developed relatively rapidly and evaluated on existing text corpora for testing IE systems. We present a general text-mining framework called DiscoTEX which employs an IE module for transforming natural-language documents into structured data and a KDD module for discovering prediction rules from the extracted data. When discovering patterns in extracted text, strict matching of strings is inadequate because textual database entries generally exhibit variations due to typographical errors, misspellings, abbreviations, and other
Identifying variable-length meaningful phrases with correlation functions
- IEEE International Conference on Tools with Artificial Intelligence, IEEE
Cited by 2 (2 self)
Finding meaningful phrases in a document has been studied in various information retrieval systems in order to improve their performance. Many previous statistical phrase-finding methods had a different aim, such as document classification. Some are hybridized with statistical and syntactic grammatical methods; others use correlation heuristics between words. We propose a new phrase-finding algorithm that adds correlated words one by one to the phrases found in the previous stage, maintaining high correlation within a phrase. Our results indicate that our algorithm finds more meaningful phrases than an existing algorithm. Furthermore, the previous algorithm could be improved by applying different correlation functions.
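The stage-wise idea in the abstract above, extending phrases from the previous stage by one correlated word at a time, can be sketched as follows. This is an illustration under stated assumptions rather than the paper's algorithm: the correlation function (n-gram count divided by the larger of the prefix count and the word count), the thresholds, and the function names are all hypothetical.

```python
from collections import Counter


def ngram_counts(tokens, n):
    """Count every run of n consecutive tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def find_phrases(tokens, threshold=0.6, min_count=2, max_len=4):
    """Grow phrases one word at a time, keeping only highly correlated extensions."""
    unigrams = ngram_counts(tokens, 1)
    current = dict(unigrams)  # stage 1: every single word is a candidate
    found = []
    for n in range(2, max_len + 1):
        next_stage = {}
        for gram, count in ngram_counts(tokens, n).items():
            prefix, word = gram[:-1], gram[-1:]
            # Only extend phrases that survived the previous stage.
            if count < min_count or prefix not in current:
                continue
            # Placeholder correlation: how often prefix and word occur together
            # relative to the more frequent of the two.
            corr = count / max(current[prefix], unigrams[word])
            if corr >= threshold:
                next_stage[gram] = count
        if not next_stage:
            break
        found.extend(" ".join(gram) for gram in next_stage)
        current = next_stage
    return found


text = ("support vector machine learns ; support vector machine wins ; "
        "a vector space ; we support it")
print(find_phrases(text.split()))
```

Swapping the line that computes `corr` for another measure is the natural place to try different correlation functions, which the abstract suggests can change the quality of the phrases found.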
Automated Alignment and Extraction of Bilingual Domain Ontology for Medical Domain Web Search
This paper proposes an approach to automated ontology alignment and domain ontology extraction from two knowledge bases. First, the WordNet and HowNet knowledge bases are aligned to construct a bilingual universal ontology based on the co-occurrence of words in a parallel corpus. The bilingual universal ontology has the merit that it contains more structural and semantic information, drawn from two complementary knowledge bases, WordNet and HowNet. For domain-specific applications, a medical domain ontology is further extracted from the universal ontology using the island-driven algorithm and a medical domain corpus. Finally, the domain-dependent terms and some axioms between medical terms based on a medical encyclopaedia are added into the ontology. For ontology evaluation, experiments on web search were conducted using the constructed ontology. The experimental results show that the proposed approach can automatically align and extract the domain-specific ontology. In addition, the extracted ontology also shows promise for medical web search.
Workshop Co-Chairs:
- 2004
Cover art production by Myra Spiliopoulou and MDM/KDD 2004 logo by Valery A.

