Machine learning for information retrieval: neural networks, symbolic learning, and genetic algorithms
- Journal of the American Society for Information Science
, 1995
Cited by 56 (9 self)
Information retrieval using probabilistic techniques has attracted significant attention on the part of researchers in information and computer science over the past few decades. In the 1980s, knowledge-based techniques also made an impressive contribution to “intelligent” information retrieval and indexing. More recently, information science researchers have turned to other newer artificial-intelligence-based inductive learning techniques including neural networks, symbolic learning, and genetic algorithms. These newer techniques, which are grounded on diverse paradigms, have provided great opportunities for researchers to enhance the information processing and retrieval capabilities of current information storage and retrieval systems. In this article, we first provide an overview of these newer techniques and their use in information science research. To familiarize readers with these techniques, we present three popular methods: the connectionist Hopfield network; the symbolic ID3/ID5R; and evolution-based genetic algorithms. We discuss their knowledge representations and algorithms in the context of information retrieval. Sample implementation and testing results from our own research are also provided for each technique. We believe these techniques are promising in their ability to analyze user queries, identify users’ information needs, and suggest alternatives for search. With proper user-system interactions, these methods can greatly complement the prevailing full-text, keyword-based, probabilistic, and knowledge-based techniques.
A Scalable Self-organizing Map Algorithm for Textual Classification: A Neural Network Approach to Thesaurus Generation
- Communication Cognition and Artificial Intelligence, Spring
, 1998
Cited by 23 (5 self)
The rapid proliferation of textual and multimedia online databases, digital libraries, Internet servers, and intranet services has turned researchers' and practitioners' dream of creating an information-rich society into a nightmare of info-gluts. Many researchers believe that turning an info-glut into a useful digital library requires automated techniques for organizing and categorizing large-scale information. This paper presents research in which we sought to develop a scalable textual classification and categorization system based on Kohonen's self-organizing feature map (SOM) algorithm. In our paper, we show how self-organization can be used for automatic thesaurus generation. Our proposed data structure and algorithm took advantage of the sparsity of coordinates in the document input vectors and reduced the SOM computational complexity by several orders of magnitude. The proposed Scaleable SOM (SSOM) algorithm makes large-scale textual categorization tasks a possibility. A...
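The abstract above is truncated in this listing and does not spell out the SSOM data structure, so the following is only a generic illustration of how sparsity in document vectors can cut the cost of the SOM winner search. It is not the authors' algorithm. It assumes each document is stored as a mapping from coordinate index to nonzero value and that the squared norm of every node's weight vector is cached; all function and variable names are hypothetical.

```python
# Hypothetical sketch of a sparsity-aware winner search; not the SSOM algorithm itself.

def sparse_squared_distance(weights, weights_sq_norm, doc):
    """Squared Euclidean distance between a dense weight vector and a sparse document.

    doc maps coordinate index -> nonzero value. weights_sq_norm is the cached
    squared norm of weights, so only the nonzero coordinates of doc are visited:
    ||w - x||^2 = ||w||^2 - 2 * (w . x) + ||x||^2.
    """
    dot = sum(weights[i] * v for i, v in doc.items())
    doc_sq_norm = sum(v * v for v in doc.values())
    return weights_sq_norm - 2.0 * dot + doc_sq_norm


def best_matching_unit(nodes, sq_norms, doc):
    """Index of the map node whose weight vector is closest to doc."""
    return min(
        range(len(nodes)),
        key=lambda k: sparse_squared_distance(nodes[k], sq_norms[k], doc),
    )


# Two map nodes over a four-term vocabulary (toy values).
nodes = [[0.0, 0.0, 0.0, 0.0], [1.0, 0.0, 0.0, 1.0]]
# Cached squared norms; they must be refreshed whenever the weights change.
sq_norms = [sum(w * w for w in node) for node in nodes]
# Sparse document vector: only terms 0 and 3 occur.
doc = {0: 1.0, 3: 1.0}
winner = best_matching_unit(nodes, sq_norms, doc)
```

With the norms cached, the per-node cost of the search depends on the number of nonzero terms in the document rather than on the vocabulary size.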
Spatialization methods: a cartographic research agenda for non-geographic information visualization
- Cartography and Geographic Information Science
, 2003
Cited by 17 (1 self)
Information visualization is an interdisciplinary research area in which cartographic efforts have mostly addressed the handling of geographic information. Some cartographers have recently become involved in attempts to extend geographic principles and cartographic techniques to the visualization of non-geographic information. This paper reports on current progress and future opportunities in this emerging research field commonly known as spatialization. The discussion is mainly devoted to the computational techniques that turn high-dimensional data into visualizations via processes of projection and transformation. It is argued that cartographically informed engagement of computationally intensive techniques can help to provide richer and less opaque information visualizations. The discussion of spatialization methods is linked to another priority area of cartographic involvement, the development of theory and principles for cognitively plausible spatialization. The paper distinguishes two equally important sets of challenges for cartographic success in spatialization research. One is the recognition that there are distinct advantages to applying a cartographic perspective in information visualization. This requires our community to more thoroughly understand the essence of cartographic activity and to explore the implications of its metaphoric transfer to non-geographic domains. Another challenge lies in cartographers becoming a more integral part of the information visualization community and actively engaging its constituent research fields.
Research Commentary: Introducing a Third Dimension in Information Systems Designs - The Case for Incentive Alignment
- Information Systems Research
, 2001
Cited by 15 (1 self)
In this paper we outline why incentives are important in each of these areas and specify requirements for designing incentive-aligned information systems. We identify and define important unresolved problems along the incentive-alignment dimension of information systems and present a research agenda to address them.
Collaborative Information Retrieval Environment: Integration of Information Retrieval with Group Support Systems
- Proceedings of the 32nd Hawaii International Conference on System Sciences, Maui
, 1999
Cited by 14 (2 self)
Observations of Information Retrieval (IR) system user experiences reveal a strong desire for collaborative search while at the same time suggesting that collaborative capabilities are rarely, and then only in a limited fashion, supported by current searching and visualization tools. Equally interesting is the fact that observations of user experiences with Group Support Systems (GSS) reveal that although access to external information and the ability to search for relevant material are often vital to the progress of GSS sessions, integrated support for collaborative searching and visualization of results is lacking in GSS systems. After reviewing user experiences described in the IR and GSS literature and observing and interviewing users of existing IR and GSS commercial and prototype systems, the authors conclude that there is an obvious demand for systems supporting multi-user IR. It is surprising to the authors that very little attention has been given to the common ground shared by these two important research domains. With this in mind, our paper describes how user experiences with IR and GSS systems have shed light on a promising new area of collaborative research and led to the development of a prototype that merges the two paradigms into a Collaborative Information Retrieval Environment (CIRE). Finally, the paper presents theory developed from initial user experiences with our prototype and describes plans to test the efficacy of this new paradigm empirically through controlled experimentation.
Methodological Approaches to Online Scoring of Essays
- ERIC DOCUMENT REPRODUCTION SERVICE, NO. ED
, 1997
Information Retrieval Through Hybrid Navigation of Lattice Representations
, 1996
Cited by 13 (4 self)
In this paper we present a comprehensive approach to automatic organization and hybrid navigation of text databases. An organizing stage first builds a particular lattice representation of the data, through text indexing followed by lattice clustering of the indexed texts. The lattice representation then supports the navigation stage of the system, a visual retrieval interface that combines three main retrieval strategies: browsing, querying, and bounding. Browsing and querying are used to search the retrieval space; bounding is used to restrict it based on the information that users have or acquire during their interaction with the system. We show that such a hybrid paradigm permits high flexibility in trading off information exploration and retrieval and, in addition, has good retrieval performance. We compared information retrieval using lattice-based hybrid navigation with conventional Boolean querying. The results of an experiment conducted on two medium-sized bibliographic databases showed that the performance of lattice retrieval was comparable to or better than Boolean retrieval.
Medical Data Mining on the Internet: Research on a Cancer Information System
, 1999
Cited by 6 (1 self)
This paper discusses several data mining algorithms and techniques that we have developed at the University of Arizona Artificial Intelligence Lab. We have implemented these algorithms and techniques into several prototypes, one of which focuses on medical information developed in cooperation with the National Cancer Institute (NCI) and the University of Illinois at Urbana-Champaign. We propose an architecture for medical knowledge information systems that will permit data mining across several medical information sources and discuss a suite of data mining tools that we are developing to assist NCI in improving public access to and use of their existing vast cancer information collections.
Automatic Construction of Navigable Concept Networks Characterizing Text Databases
- Topics in Artificial Intelligence, LNAI 992-Springer
, 1995
Cited by 6 (0 self)
In this paper we present a comprehensive approach to conceptual structuring and intelligent navigation of text databases. Given any collection of texts, we first automatically extract a set of index terms describing each text. Next, we use a particular lattice conceptual clustering method to build a network of clustered texts whose nodes are described using the index terms. We argue that the resulting network supports a hybrid navigational approach to text retrieval (implemented in an actual user interface) that combines browsing potentials with good retrieval performance. We present the results of an experiment on subject searching where this approach outperformed a conventional Boolean retrieval system.
A comparison of feature selection methods for an evolving RSS feed corpus
- Information Processing & Management
, 2006
Cited by 3 (0 self)
Previous researchers have attempted to detect significant topics in news stories and blogs through the use of word frequency-based methods applied to RSS feeds. In this paper, three statistical feature selection methods, χ², Mutual Information (MI), and Information Gain (I), are proposed as alternative approaches for ranking term significance in an evolving RSS feed corpus. The extent to which the three methods agree with each other in determining the degree of significance of a term on a certain date is investigated, as well as the assumption that larger values tend to indicate more significant terms. An experimental evaluation was carried out with 39 different levels of data reduction to evaluate the three methods for differing degrees of significance. The three methods showed a significant degree of disagreement for a number of terms assigned an extremely large value. Hence, the assumption that the larger a value, the higher the degree of significance of a term should be treated cautiously. Moreover, MI and I show significant disagreement. This suggests that MI differs in the way it ranks significant terms, as MI does not take the absence of a term into account, whereas I does. I, however, has a higher degree of term reduction than MI and χ². This can result in losing some significant terms. In summary, χ² seems to be the best method to determine term significance for RSS feeds, as χ² identifies both types of significant behavior. The χ² method, however, is far from perfect, as an extremely high value can be assigned to relatively insignificant terms.
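The abstract does not give the formulas used, so the sketch below falls back on the common textbook definitions of the three scores over a 2x2 contingency table. It may differ from the paper's exact formulation. The counts `a`, `b`, `c`, `d` and the function name `term_scores` are assumptions made for illustration: `a` is the number of items containing the term on the date of interest, `b` the items containing it on other dates, `c` the items on that date without the term, and `d` the items on other dates without it.

```python
import math


def term_scores(a, b, c, d):
    """Return (chi2, mi, ig) for one term from a 2x2 contingency table.

    Hypothetical layout (not taken from the paper):
      a: term present, target date    b: term present, other dates
      c: term absent,  target date    d: term absent,  other dates
    """
    n = a + b + c + d

    # Chi-square statistic for the 2x2 table.
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    chi2 = n * (a * d - c * b) ** 2 / denom if denom else 0.0

    # Pointwise mutual information: uses only the "term present, target date" cell.
    mi = math.log2(a * n / ((a + c) * (a + b))) if a else float("-inf")

    # Information gain: sums over all four cells, so term absence also contributes.
    ig = 0.0
    cells = (
        (a, a + b, a + c),
        (b, a + b, b + d),
        (c, c + d, a + c),
        (d, c + d, b + d),
    )
    for joint, row_total, col_total in cells:
        if joint:
            ig += (joint / n) * math.log2(joint * n / (row_total * col_total))
    return chi2, mi, ig


independent = term_scores(10, 10, 10, 10)
perfect = term_scores(10, 0, 0, 10)
```

In this formulation MI reads only the cell where the term is present on the target date, while information gain also weighs the cells where the term is absent, which is the difference between the two that the abstract points to.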

