Results 1 - 10
of
288
Database Mining: A Performance Perspective
- IEEE Transactions on Knowledge and Data Engineering
, 1993
"... We present our perspective of database mining as the confluence of machine learning techniques and the performance emphasis of database technology. We describe three classes of database mining problems involving classification, associations, and sequences, and argue that these problems can be unifor ..."
Abstract
-
Cited by 247 (12 self)
- Add to MetaCart
We present our perspective of database mining as the confluence of machine learning techniques and the performance emphasis of database technology. We describe three classes of database mining problems involving classification, associations, and sequences, and argue that these problems can be uniformly viewed as requiring discovery of rules embedded in massive data. We describe a model and some basic operations for the process of rule discovery. We show how the database mining problems we consider map to this model and how they can be solved by using the basic operations we propose. We give an example of an algorithm for classification obtained by combining the basic rule discovery operations. This algorithm not only is efficient in discovering classification rules but also has accuracy comparable to ID3, one of the current best classifiers. Index Terms. database mining, knowledge discovery, classification, associations, sequences, decision trees Current address: Computer Science De...
SPRINT: A scalable parallel classifier for data mining
, 1996
"... Classification is an important data mining problem. Although classification is a well-studied problem, most of the current classi-fication algorithms require that all or a por-tion of the the entire dataset remain perma-nently in memory. This limits their suitability for mining over large databases. ..."
Abstract
-
Cited by 228 (7 self)
- Add to MetaCart
Classification is an important data mining problem. Although classification is a well-studied problem, most of the current classi-fication algorithms require that all or a por-tion of the the entire dataset remain perma-nently in memory. This limits their suitability for mining over large databases. We present a new decision-tree-based classification algo-rithm, called SPRINT that removes all of the memory restrictions, and is fast and scalable. The algorithm has also been designed to be easily parallelized, allowing many processors to work together to build a single consistent model. This parallelization, also presented here, exhibits excellent scalability as well. The combination of these characteristics makes the proposed algorithm an ideal tool for data min-ing. 1
Optimal Unsupervised Learning in a Single-Layer Linear Feedforward Neural Network
, 1989
"... A new approach to unsupervised learning in a single-layer linear feedforward neural network is discussed. An optimality principle is proposed which is based upon preserving maximal information in the output units. An algorithm for unsupervised learning based upon a Hebbian learning rule, which achie ..."
Abstract
-
Cited by 189 (0 self)
- Add to MetaCart
A new approach to unsupervised learning in a single-layer linear feedforward neural network is discussed. An optimality principle is proposed which is based upon preserving maximal information in the output units. An algorithm for unsupervised learning based upon a Hebbian learning rule, which achieves the desired optimality is presented, The algorithm finds the eigenvectors of the input correlation matrix, and it is proven to converge with probability one. An implementation which can train neural networks using only local "synaptic" modification rules is described. It is shown that the algorithm is closely related to algorithms in statistics (Factor Analysis and Principal Components Analysis) and neural networks (Self-supervised Backpropagation, or the "encoder" problem). It thus provides an explanation of certain neural network behavior in terms of classical statistical techniques. Examples of the use of a linear network for solving image coding and texture segmentation problems are presented. Also, it is shown that the algorithm can be used to find "visual receptive fields" which are qualitatively similar to those found in primate retina and visual cortex.
On The Computational Power Of Neural Nets
- JOURNAL OF COMPUTER AND SYSTEM SCIENCES
, 1995
"... This paper deals with finite size networks which consist of interconnections of synchronously evolving processors. Each processor updates its state by applying a "sigmoidal" function to a linear combination of the previous states of all units. We prove that one may simulate all Turing Machines by su ..."
Abstract
-
Cited by 139 (23 self)
- Add to MetaCart
This paper deals with finite size networks which consist of interconnections of synchronously evolving processors. Each processor updates its state by applying a "sigmoidal" function to a linear combination of the previous states of all units. We prove that one may simulate all Turing Machines by such nets. In particular, one can simulate any multi-stack Turing Machine in real time, and there is a net made up of 886 processors which computes a universal partial-recursive function. Products (high order nets) are not required, contrary to what had been stated in the literature. Non-deterministic Turing Machines can be simulated by non-deterministic rational nets, also in real time. The simulation result has many consequences regarding the decidability, or more generally the complexity, of questions about recursive nets.
Algorithms for the Satisfiability (SAT) Problem: A Survey
- DIMACS Series in Discrete Mathematics and Theoretical Computer Science
, 1996
"... . The satisfiability (SAT) problem is a core problem in mathematical logic and computing theory. In practice, SAT is fundamental in solving many problems in automated reasoning, computer-aided design, computeraided manufacturing, machine vision, database, robotics, integrated circuit design, compute ..."
Abstract
-
Cited by 107 (3 self)
- Add to MetaCart
. The satisfiability (SAT) problem is a core problem in mathematical logic and computing theory. In practice, SAT is fundamental in solving many problems in automated reasoning, computer-aided design, computeraided manufacturing, machine vision, database, robotics, integrated circuit design, computer architecture design, and computer network design. Traditional methods treat SAT as a discrete, constrained decision problem. In recent years, many optimization methods, parallel algorithms, and practical techniques have been developed for solving SAT. In this survey, we present a general framework (an algorithm space) that integrates existing SAT algorithms into a unified perspective. We describe sequential and parallel SAT algorithms including variable splitting, resolution, local search, global optimization, mathematical programming, and practical SAT algorithms. We give performance evaluation of some existing SAT algorithms. Finally, we provide a set of practical applications of the sat...
Rewriting Logic as a Semantic Framework for Concurrency: a Progress Report
, 1996
"... . This paper surveys the work of many researchers on rewriting logic since it was first introduced in 1990. The main emphasis is on the use of rewriting logic as a semantic framework for concurrency. The goal in this regard is to express as faithfully as possible a very wide range of concurrency mod ..."
Abstract
-
Cited by 78 (22 self)
- Add to MetaCart
. This paper surveys the work of many researchers on rewriting logic since it was first introduced in 1990. The main emphasis is on the use of rewriting logic as a semantic framework for concurrency. The goal in this regard is to express as faithfully as possible a very wide range of concurrency models, each on its own terms, avoiding any encodings or translations. Bringing very different models under a common semantic framework makes easier to understand what different models have in common and how they differ, to find deep connections between them, and to reason across their different formalisms. It becomes also much easier to achieve in a rigorous way the integration and interoperation of different models and languages whose combination offers attractive advantages. The logic and model theory of rewriting logic are also summarized, a number of current research directions are surveyed, and some concluding remarks about future directions are made. Table of Contents 1 In...
Meta-Learning in Distributed Data Mining Systems: Issues and Approaches
- Advances of Distributed Data Mining
, 2000
"... Data mining systems aim to discover patterns and extract useful information from facts recorded in databases. A widely adopted approach to this objective is to apply various machine learning algorithms to compute descriptive models of the available data. Here, we explore one of the main challeng ..."
Abstract
-
Cited by 71 (0 self)
- Add to MetaCart
Data mining systems aim to discover patterns and extract useful information from facts recorded in databases. A widely adopted approach to this objective is to apply various machine learning algorithms to compute descriptive models of the available data. Here, we explore one of the main challenges in this research area, the development of techniques that scale up to large and possibly physically distributed databases. Meta-learning is a technique that seeks to compute higher-level classifiers (or classification models), called meta-classifiers, that integrate in some principled fashion multiple classifiers computed separately over different databases. This study, describes meta-learning and presents the JAM system (Java Agents for Meta-learning), an agent-based meta-learning system for large-scale data mining applications. Specifically, it identifies and addresses several important desiderata for distributed data mining systems that stem from their additional complexity co...
Bounds for the Computational Power and Learning Complexity of Analog Neural Nets
- Proc. of the 25th ACM Symp. Theory of Computing
, 1993
"... . It is shown that high order feedforward neural nets of constant depth with piecewise polynomial activation functions and arbitrary real weights can be simulated for boolean inputs and outputs by neural nets of a somewhat larger size and depth with heaviside gates and weights from f\Gamma1; 0; 1g. ..."
Abstract
-
Cited by 59 (12 self)
- Add to MetaCart
. It is shown that high order feedforward neural nets of constant depth with piecewise polynomial activation functions and arbitrary real weights can be simulated for boolean inputs and outputs by neural nets of a somewhat larger size and depth with heaviside gates and weights from f\Gamma1; 0; 1g. This provides the first known upper bound for the computational power of the former type of neural nets. It is also shown that in the case of first order nets with piecewise linear activation functions one can replace arbitrary real weights by rational numbers with polynomially many bits, without changing the boolean function that is computed by the neural net. In order to prove these results we introduce two new methods for reducing nonlinear problems about weights in multi-layer neural nets to linear problems for a transformed set of parameters. These transformed parameters can be interpreted as weights in a somewhat larger neural net. As another application of our new proof technique we s...
Human Face Recognition and the Face Image Set's Topology
"... If we consider an n x n image as an n 2 dimensional vector, then images of faces can be considered as points in this n 2 -dimensional image space. Our previous studies of physical transformations of the face, including translation, small rotations and illumination changes, showed that the set of ..."
Abstract
-
Cited by 51 (8 self)
- Add to MetaCart
If we consider an n x n image as an n 2 dimensional vector, then images of faces can be considered as points in this n 2 -dimensional image space. Our previous studies of physical transformations of the face, including translation, small rotations and illumination changes, showed that the set of face images consists of relatively simple connected sub-regions in image space [1]. Consequently linear matching techniques can be used to obtain reliable face recognition. However for more general transformations, such as large rotations or scale changes, the face subregions become highly non-convex. We have therefore developed a scale-space matching technique that allows us to take advantage of knowledge about important geometrical transformations and about the topology of the face subregion in image space. While recognition of faces is the focus of this paper, the algorithm is sufficiently general to be applicable to a large variety of object recognition tasks. List of Symbols ffi: Gr...

