Results 1 - 9 of 9
Data Structures and Algorithms for Nearest Neighbor Search in General Metric Spaces
, 1993
Cited by 225 (4 self)
We consider the computational problem of finding nearest neighbors in general metric spaces. Of particular interest are spaces that may not be conveniently embedded or approximated in Euclidian space, or where the dimensionality of a Euclidian representation is very high. Also relevant are high-dimensional Euclidian settings in which the distribution of data is in some sense of lower dimension and embedded in the space. The vp-tree (vantage point tree) is introduced in several forms, together with associated algorithms, as an improved method for these difficult search problems. Tree construction executes in O(n log(n)) time, and search is under certain circumstances and in the limit, O(log(n)) expected time. The theoretical basis for this approach is developed and the results of several experiments are reported. In Euclidian cases, kd-tree performance is compared.
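The vantage point idea can be illustrated with a minimal sketch. It is not the paper's implementation: the names `VPNode`, `build_vp_tree`, and `nearest` are hypothetical, the vantage point is chosen at random, and the median is found by sorting, so construction here costs more than the O(n log(n)) quoted above. The sketch assumes only that `dist` is a metric (it satisfies the triangle inequality), which is what allows whole subtrees to be skipped during search.

```python
import math
import random
from dataclasses import dataclass
from typing import Any, Callable, Optional, Sequence, Tuple

Metric = Callable[[Any, Any], float]


@dataclass
class VPNode:
    point: Any                          # the vantage point stored at this node
    radius: float = 0.0                 # median distance from the vantage point
    inside: Optional["VPNode"] = None   # points with distance < radius
    outside: Optional["VPNode"] = None  # points with distance >= radius


def build_vp_tree(points: Sequence[Any], dist: Metric,
                  rng: Optional[random.Random] = None) -> Optional[VPNode]:
    """Recursively split the points around a randomly chosen vantage point."""
    rng = rng or random.Random(0)
    pts = list(points)
    if not pts:
        return None
    node = VPNode(pts.pop(rng.randrange(len(pts))))
    if not pts:
        return node
    dists = [dist(node.point, p) for p in pts]
    node.radius = sorted(dists)[len(dists) // 2]  # median by sorting (assumption)
    inner = [p for p, d in zip(pts, dists) if d < node.radius]
    outer = [p for p, d in zip(pts, dists) if d >= node.radius]
    node.inside = build_vp_tree(inner, dist, rng)
    node.outside = build_vp_tree(outer, dist, rng)
    return node


def nearest(node: Optional[VPNode], query: Any, dist: Metric,
            best: Optional[Tuple[float, Any]] = None) -> Optional[Tuple[float, Any]]:
    """Return (distance, point) of the nearest stored point to the query."""
    if node is None:
        return best
    d = dist(query, node.point)
    if best is None or d < best[0]:
        best = (d, node.point)
    if d < node.radius:
        # Query lies inside the ball: search there first.
        best = nearest(node.inside, query, dist, best)
        # The outside can hold a closer point only if the ball of radius
        # best[0] around the query reaches the boundary.
        if d + best[0] >= node.radius:
            best = nearest(node.outside, query, dist, best)
    else:
        best = nearest(node.outside, query, dist, best)
        if d - best[0] < node.radius:
            best = nearest(node.inside, query, dist, best)
    return best


# Usage with Euclidean distance in the plane; any metric can be substituted.
sample = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0), (5.0, 5.0)]
tree = build_vp_tree(sample, math.dist)
best_distance, best_point = nearest(tree, (1.2, 0.9), math.dist)
```

Because the pruning tests rely only on the triangle inequality, the same code works for non-Euclidean metrics such as edit distance, which is the setting the abstract emphasizes.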
Query optimization in database systems
- ACM Computing Surveys
, 1984
Cited by 194 (0 self)
Efficient methods of processing unanticipated queries are a crucial prerequisite for the success of generalized database management systems. A wide variety of approaches to improve the performance of query evaluation algorithms have been proposed: logic-based and semantic transformations, fast implementations of basic operations, and combinatorial or heuristic algorithms for generating alternative access plans and choosing among them. These methods are presented in the framework of a general query evaluation procedure using the relational calculus representation of queries. In addition, nonstandard query optimization issues such as higher level query evaluation, query optimization in distributed databases, and use of database machines are addressed. The focus, however, is on query optimization in centralized database systems.
The TV-tree -- an index structure for high-dimensional data
- VLDB Journal
, 1994
Cited by 177 (7 self)
We propose a file structure to index high-dimensionality data, typically, points in some feature space. The idea is to use only a few of the features, utilizing additional features whenever the additional discriminatory power is absolutely necessary. We present in detail the design of our tree structure and the associated algorithms that handle such 'varying length' feature vectors. Finally we report simulation results, comparing the proposed structure with the R*-tree, which is one of the most successful methods for low-dimensionality spaces. The results illustrate the superiority of our method, with up to 80% savings in disk accesses.
Type of Contribution: New Index Structure, for high-dimensionality feature spaces. Algorithms and performance measurements.
Keywords: Spatial Index, Similarity Retrieval, Query by Content
1 Introduction
Many applications require enhanced indexing, capable of performing similarity searching on several, non-traditional ('exotic') data types. The targ...
A survey of information retrieval and filtering methods
, 1995
Cited by 82 (0 self)
We survey the major techniques for information retrieval. In the first part, we provide an overview of the traditional ones (full text scanning, inversion, signature files and clustering). In the second part we discuss attempts to include semantic information (natural language processing, latent semantic indexing and neural networks).
Machine Discovery Of Protein Motifs
- MACHINE LEARNING
, 1995
Cited by 17 (2 self)
The investigation of relations between protein tertiary structure and amino acid sequence is a topic of tremendous importance in molecular biology. The automated discovery of recurrent patterns of structure and sequence is an essential part of this investigation. These patterns, known as protein motifs, are abstractions of fragments drawn from proteins of known sequence and tertiary structure. This paper has two objectives. The first is to introduce and define protein motifs, and provide a survey of previous research on protein motif discovery. The second is to present and apply a novel approach to protein motif representation and discovery, which is based on a spatial description logic and the symbolic machine learning paradigm of structured concept formation. A large database of protein fragments is processed using this approach, and several interesting and significant protein motifs are discovered.
Dynamic Clustering Procedures For Bibliographic Data
, 1981
Cited by 1 (0 self)
Clustering is an important tool for efficient retrieval of documents in bibliographic database systems. It can also be used to find research trends from a set of research papers. This paper discusses new clustering procedures, called dynamic ones, which seem to be suitable for bibliographic data handling. These procedures are developed to solve the following problems.
(1) Depending on the characteristics of data, several different clustering procedures are required to obtain good results.
(2) Large clusters tend to be generated.
Dynamic clustering procedures are defined to be procedures which change parameter values according to the characteristics of data. Similarity values and threshold values are dynamically modified to handle the above problems. Furthermore, to treat the latter problem, data duplication is considered.
Cluster-Based Adaptive Information Retrieval
This paper discusses the issues involved in the design of a complete information retrieval system based on user-oriented clustering schemes. Clusters are constructed taking into account the users' perception of similarity between documents. The system accumulates feedback from the users and employs it to construct user-oriented clusters. An optimization function to improve the effectiveness of the clustering process is developed. A retrieval process based on the clustering scheme is described. The system developed is experimentally validated and compared with existing systems.
1 Introduction
An information retrieval (IR) system is characterized by a collection of documents and a set of users who perform queries on the collection to fulfill their information needs. To improve the efficiency of retrieval, it has been proposed that the documents which are generally retrieved together in response to some query should be kept close together within the system in the form of clusters [28, 30]...
A Multi-Agent View Of strategic planning using . . .
- GROUP DECISION AND NEGOTIATION, 5:37--59 (1996)
, 1996
The strategic planning process is dynamic and complex. Including a Group Support System (GSS) in the problem-solving process can improve the content quality of the strategic plan by allowing increased participation by more members of the organization. However, it can also add to the complexity of the problem by increasing the quantity of textual information that can result from group activity. Added complexity increases cognitive overload and frustrations of those participants negotiating the contents of the strategic plan. This article takes a multi-agent view of the strategic planning process. It considers group participants as multiple agents concerned with the content quality of the strategic plan. The facilitator agent is responsible for guiding groups in the strategic plan construction process as well as for solving process problems such as cognitive overload. We introduce an AI Concept Categorizer agent, a software tool that supports the facilitator in addressing the process problem of cognitive overload associated with convergent group activities by synthesizing group textual output into conceptual clusters. The implementation of this tool reduces frustrations which groups encounter in the process of classifying textual output and provides more time for discussion of the concepts themselves. Because of the large amount of convergent activity necessary for strategic planning, the addition of the AI Concept Categorizer to the strategic planning process should increase the quality of the strategic plan and the buy-in of the participants in the strategic planning process.

