Results 1–10 of 21
Transformation-invariant clustering using the EM algorithm
IEEE Transactions on Pattern Analysis and Machine Intelligence, 2003
Cited by 59 (11 self)
Abstract:
Clustering is a simple, effective way to derive useful representations of data, such as images and videos. Clustering explains the input as one of several prototypes, plus noise. In situations where each input has been randomly transformed (e.g., by translation, rotation, and shearing in images and videos), clustering techniques tend to extract cluster centers that account for variations in the input due to transformations, instead of more interesting and potentially useful structure. For example, if images from a video sequence of a person walking across a cluttered background are clustered, it would be more useful for the different clusters to represent different poses and expressions, instead of different positions of the person and different configurations of the background clutter. We describe a way to add transformation invariance to mixture models, by approximating the nonlinear transformation manifold by a discrete set of points. We show how the expectation maximization algorithm can be used to jointly learn clusters, while at the same time inferring the transformation associated with each input. We compare this technique with other methods for filtering noisy images obtained from a scanning electron microscope, clustering images from videos of faces into different categories of identification and pose and removing foreground obstructions from video. We also demonstrate that the new technique is quite insensitive to initial conditions and works better than standard techniques, even when the standard techniques are provided with extra data.
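The core idea, EM over a mixture whose latent variables are both a cluster index and a discrete transformation, can be sketched in a few lines. The following is a minimal illustration using 1-D signals with circular shifts as the transformation set, not the paper's implementation; the function name `tic_em`, the fixed `sigma`, and the naive first-k initialization are all assumptions of the sketch.

```python
import numpy as np

def tic_em(X, n_clusters, shifts, n_iter=20, sigma=0.5):
    """Transformation-invariant clustering sketch: a Gaussian mixture
    whose latent variables are (cluster, circular shift), fit by EM.
    X: (n, d) array of 1-D signals."""
    n, d = X.shape
    mu = X[:n_clusters].astype(float).copy()  # naive init: first k inputs
    shifts = list(shifts)
    for _ in range(n_iter):
        # E-step: joint responsibility over clusters and shifts
        ll = np.empty((n, n_clusters, len(shifts)))
        for c in range(n_clusters):
            for s, sh in enumerate(shifts):
                diff = X - np.roll(mu[c], sh)     # shifted prototype
                ll[:, c, s] = -0.5 * (diff ** 2).sum(axis=1) / sigma ** 2
        ll -= ll.max(axis=(1, 2), keepdims=True)  # numerical stability
        r = np.exp(ll)
        r /= r.sum(axis=(1, 2), keepdims=True)
        # M-step: average inputs after undoing each inferred shift
        for c in range(n_clusters):
            num, den = np.zeros(d), 1e-12
            for s, sh in enumerate(shifts):
                num += r[:, c, s] @ np.roll(X, -sh, axis=1)
                den += r[:, c, s].sum()
            mu[c] = num / den
    return mu, r
```

On data made of randomly shifted copies of a few prototypes, the learned means recover the prototypes up to an overall shift, which is exactly the invariance the abstract describes.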
A comparison of algorithms for inference and learning in probabilistic graphical models
IEEE Transactions on Pattern Analysis and Machine Intelligence, 2005
Cited by 52 (4 self)
Abstract:
Computer vision is currently one of the most exciting areas of artificial intelligence research, largely because it has recently become possible to record, store and process large amounts of visual data. While impressive achievements have been made in pattern classification problems such as handwritten character recognition and face detection, it is even more exciting that researchers may be on the verge of introducing computer vision systems that perform scene analysis, decomposing image input into its constituent objects, lighting conditions, motion patterns, and so on. Two of the main challenges in computer vision are finding efficient models of the physics of visual scenes and finding efficient algorithms for inference and learning in these models. In this paper, we advocate the use of graph-based probability models and their associated inference and learning algorithms for computer vision and scene analysis. We review exact techniques and various approximate, computationally efficient techniques, including iterative conditional modes, the expectation maximization (EM) algorithm, the mean field method, variational techniques, structured variational techniques, Gibbs sampling, the sum-product algorithm and “loopy” belief propagation. We describe how each technique can be applied in a model of multiple, occluding objects, and contrast the behaviors and performances of the techniques using a unifying cost function, free energy.
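Of the reviewed techniques, the sum-product algorithm is the easiest to make concrete: on a tree-structured graph (here, a chain) it computes exact marginals by message passing, and running the same updates on a graph with cycles gives "loopy" belief propagation. A minimal sketch for a chain MRF with unary and pairwise potentials; the function name and interface are invented for illustration.

```python
import numpy as np

def chain_marginals(unary, pairwise):
    """Sum-product on a chain MRF (sketch). unary: list of (k,) arrays,
    pairwise: list of (k, k) arrays coupling x_i and x_{i+1}.
    Returns the exact marginal distribution of each variable."""
    n = len(unary)
    fwd = [None] * n   # fwd[i]: message arriving at node i from the left
    bwd = [None] * n   # bwd[i]: message arriving at node i from the right
    fwd[0] = np.ones_like(unary[0])
    for i in range(1, n):
        m = (unary[i - 1] * fwd[i - 1]) @ pairwise[i - 1]
        fwd[i] = m / m.sum()              # normalize for stability
    bwd[n - 1] = np.ones_like(unary[-1])
    for i in range(n - 2, -1, -1):
        m = pairwise[i] @ (unary[i + 1] * bwd[i + 1])
        bwd[i] = m / m.sum()
    marg = [unary[i] * fwd[i] * bwd[i] for i in range(n)]
    return [p / p.sum() for p in marg]
```

Because the chain has no cycles, these marginals agree exactly with brute-force enumeration of the joint distribution, which makes the routine easy to verify on tiny examples.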
Recognizing handwritten digits using hierarchical products of experts
IEEE Transactions on Pattern Analysis and Machine Intelligence, 2001
Cited by 30 (5 self)
Abstract—The product of experts learning procedure [1] can discover a set of stochastic binary features that constitute a nonlinear generative model of handwritten images of digits. The quality of generative models learned in this way can be assessed by learning a separate model for each class of digit and then comparing the unnormalized probabilities of test images under the 10 different class-specific models. To improve discriminative performance, a hierarchy of separate models can be learned for each digit class. Each model in the hierarchy learns a layer of binary feature detectors that model the probability distribution of vectors of activity of feature detectors in the layer below. The models in the hierarchy are trained sequentially and each model uses a layer of binary feature detectors to learn a generative model of the patterns of feature activities in the preceding layer. After training, each layer of feature detectors produces a separate, unnormalized log probability score. With three layers of feature detectors for each of the 10 digit classes, a test image produces 30 scores which can be used as inputs to a supervised, logistic classification network that is trained on separate data. On the MNIST database, our system is comparable with current state-of-the-art discriminative methods, demonstrating that the product of experts learning procedure can produce effective hierarchies of generative models of high-dimensional data. Index Terms—Neural networks, products of experts, handwriting recognition, feature extraction, shape recognition, Boltzmann machines, model-based recognition, generative models.
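The per-layer "unnormalized log probability score" of a product of logistic experts has a simple closed form: summing out the binary feature detectors of an RBM-style model leaves a free-energy expression. A hedged sketch with invented parameter names (`W`, `b`, `c`); in the paper these parameters are learned, not hand-specified.

```python
import numpy as np

def poe_log_score(v, W, b, c):
    """Unnormalized log-probability of a binary vector v under a product
    of logistic experts (RBM free-energy form). Summing out the binary
    feature detectors h gives
        log p(v) + const = b.v + sum_j softplus(c_j + W[:, j].v),
    where softplus(x) = log(1 + exp(x))."""
    return b @ v + np.logaddexp(0.0, c + v @ W).sum()
```

Classification then amounts to computing such scores under each class-specific model (30 of them for three layers and 10 classes) and feeding them to a logistic classifier, as the abstract describes.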
machine learning, and genetic neural nets
Advances in Applied Mathematics, 1989
Cited by 15 (1 self)
Abstract:
We consider neural nets whose connections are defined by growth rules taking the form of recursion relations. These are called genetic neural nets. Learning in these nets is achieved by simulated annealing optimization of the net over the space of recursion relation parameters. The method is tested on a previously defined continuous coding problem. Results of control experiments are presented so that the success of the method can be judged. Genetic neural nets implement the ideas of scaling and parsimony, features which allow generalization in machine learning.
Uncertainty and the Communication of Time
Systems Research, 1994
Cited by 7 (6 self)
Abstract:
Prigogine and Stengers (1988) [47] have pointed to the centrality of the concepts of “time and eternity” for the cosmology contained in Newtonian physics, but they have not addressed this issue beyond the domain of physics. The construction of “time” in the cosmology dates back to debates among Huygens, Newton, and Leibniz. The deconstruction of this cosmology in terms of the philosophical questions of the 17th century suggests an uncertainty in the time dimension. While order has been conceived as an “harmonie préétablie,” it is considered as emergent from an evolutionary perspective. In a “chaology,” one should fully appreciate that different systems may use different clocks. Communication systems can be considered as contingent in space and time: substances contain force or action, and they communicate not only in (observable) extension, but also over time. While each communication system can be considered as a system of reference for a special theory of communication, the addition of an evolutionary perspective to the mathematical theory of communication opens up the possibility of a general theory of communication.
Recent Developments in Multilayer Perceptron Neural Networks
Proceedings of the 7th Annual Memphis Area Engineering and Science Conference (MAESC), 2005
Cited by 5 (0 self)
Abstract:
Several neural network architectures have been developed over the past several years. One of the most popular and most powerful architectures is the multilayer perceptron. This architecture will be described in detail and recent advances in training of the multilayer perceptron will be presented. Multilayer perceptrons are trained using various techniques. For years the most widely used training method was backpropagation and its various derivatives that incorporate gradient information. Recent developments have used output weight optimization-hidden weight optimization (OWO-HWO) and full conjugate gradient methods. OWO-HWO is a very powerful technique in terms of accuracy and rapid convergence. OWO-HWO has been used with a unique “network growing” technique to ensure that the mean square error is monotonically nonincreasing as the network size increases (i.e., as the number of hidden layer nodes increases). This “network growing” technique was demonstrated using OWO-HWO but is amenable to any training technique. It significantly improves training and testing performance of the MLP.
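The key observation behind the OWO half of OWO-HWO is that, for fixed hidden weights, the output weights of an MLP with linear outputs minimize the mean square error through an ordinary linear least-squares problem. A minimal sketch of that step only (the hidden-weight update, HWO, is omitted); the function names, random hidden weights, and tanh activations are assumptions of the sketch.

```python
import numpy as np

def owo_fit(X, Y, n_hidden=16, seed=0):
    """Sketch of output weight optimization (OWO): with hidden weights
    held fixed, solve for the output weights by linear least squares.
    Full OWO-HWO would alternate this with a hidden-weight update."""
    rng = np.random.default_rng(seed)
    W_h = rng.normal(size=(X.shape[1], n_hidden))
    H = np.tanh(X @ W_h)                        # fixed hidden activations
    H1 = np.hstack([H, np.ones((len(X), 1))])   # append a bias column
    W_o, *_ = np.linalg.lstsq(H1, Y, rcond=None)
    return W_h, W_o

def owo_predict(X, W_h, W_o):
    H = np.tanh(X @ W_h)
    return np.hstack([H, np.ones((len(X), 1))]) @ W_o
```

Because the least-squares solve is exact, each OWO step is guaranteed not to increase the training mean square error, which is the property the "network growing" technique in the abstract relies on.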
Integrating Intelligent Job-Scheduling Into a Real-World Production-Scheduling System
Journal of Intelligent Manufacturing, 1996
Cited by 4 (3 self)
Abstract:
The paper addresses the problem of scheduling production orders (jobs). First an approach based on simulated annealing and Hopfield nets is described. Since performance was unsatisfactory for real-world applications, we changed the problem representation and tuned the scheduling method, dropping features of the Hopfield net and retaining simulated annealing. Both computing time and solution quality were significantly improved. The scheduling method was then integrated into a software system for short-term production planning and control ("electronic leitstand"). The paper describes how real-world requirements are met and how the scheduling method interacts with the leitstand's database and graphical representation of schedules. Keywords: scheduling, simulated annealing, electronic leitstand.
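The simulated-annealing core that survived the authors' redesign can be illustrated on a toy version of the problem: assigning jobs to identical machines so as to minimize the makespan. This is not the paper's representation; the single-job move, linear cooling schedule, and names are assumptions of the sketch.

```python
import math
import random

def sa_schedule(durations, n_machines, steps=5000, t0=10.0, seed=0):
    """Simulated-annealing sketch for a toy scheduling problem: assign
    each job to one of n_machines to minimize the makespan (max load)."""
    rng = random.Random(seed)
    assign = [rng.randrange(n_machines) for _ in durations]

    def makespan(a):                    # recomputed in full for clarity
        load = [0.0] * n_machines
        for dur, m in zip(durations, a):
            load[m] += dur
        return max(load)

    cost = makespan(assign)
    best, best_cost = assign[:], cost
    for step in range(steps):
        t = t0 * (1.0 - step / steps) + 1e-9    # linear cooling schedule
        j = rng.randrange(len(durations))
        old = assign[j]
        assign[j] = rng.randrange(n_machines)   # move one job
        new_cost = makespan(assign)
        if new_cost <= cost or rng.random() < math.exp((cost - new_cost) / t):
            cost = new_cost                     # accept (possibly uphill)
            if cost < best_cost:
                best, best_cost = assign[:], cost
        else:
            assign[j] = old                     # undo rejected move
    return best, best_cost
```

Accepting occasional uphill moves at high temperature lets the search escape local minima; as the temperature falls it degenerates into greedy descent, which is the trade-off the abstract's tuning addressed.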
Artificial neurons with arbitrarily complex internal structures
Neurocomputing
Cited by 2 (2 self)
Abstract:
Artificial neurons with arbitrarily complex internal structure are introduced. The neurons can be described in terms of a set of internal variables, a set of activation functions which describe the time evolution of these variables, and a set of characteristic functions which control how the neurons interact with one another. The information capacity of attractor networks composed of these generalized neurons is shown to reach the maximum allowed bound. A simple example taken from the domain of pattern recognition demonstrates the increased computational power of these neurons. Furthermore, a specific class of generalized neurons gives rise to a simple transformation relating attractor networks of generalized neurons to standard three-layer feedforward networks. Given this correspondence, we conjecture that the maximum information capacity of a three-layer feedforward network is 2 bits per weight.
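For contrast, the baseline attractor network that these generalized neurons extend, a binary Hopfield net with Hebbian weights, can be sketched directly: stored patterns become fixed points that the dynamics recover from corrupted inputs. The function names and the synchronous update rule are choices of this sketch, not taken from the paper.

```python
import numpy as np

def hopfield_train(patterns):
    """Hebbian weights for a standard binary (+/-1) Hopfield net."""
    P = np.asarray(patterns, dtype=float)
    W = P.T @ P / len(P)
    np.fill_diagonal(W, 0.0)     # no self-connections
    return W

def hopfield_recall(W, x, n_iter=10):
    """Iterate synchronous sign updates until a fixed point (attractor)."""
    x = np.asarray(x, dtype=float)
    for _ in range(n_iter):
        nxt = np.sign(W @ x)
        nxt[nxt == 0] = 1.0      # break ties deterministically
        if np.array_equal(nxt, x):
            break
        x = nxt
    return x
```

The information capacity of this standard network is well below the bound the abstract attributes to generalized neurons, which is precisely the gap the paper's internal-structure construction is meant to close.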
Outlier management in intelligent data analysis, 2000
Cited by 2 (0 self)
Abstract:
In spite of many statistical methods for outlier detection and for robust analysis, there is little work on further analysis of outliers themselves to determine their origins. For example, there are “good” outliers that provide useful information that can lead to the discovery of new knowledge, or “bad” outliers that include noisy data points. Successfully distinguishing between different types of outliers is an important issue in many applications, including fraud detection, medical tests, process analysis and scientific discovery. It requires not only an understanding of the mathematical properties of data but also relevant knowledge in the domain context in which the outliers occur. This thesis presents a novel attempt at automating the use of domain knowledge to help distinguish between different types of outliers. Two complementary knowledge-based outlier analysis strategies are proposed: one using knowledge regarding how “normal data” should be distributed in a domain of interest in order to identify “good” outliers, and the other using the understanding of “bad” outliers. This kind of knowledge-based outlier analysis is a useful extension to existing work in both the statistical and computing communities on outlier detection.
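The two-stage idea, statistical detection followed by knowledge-based labeling of each outlier as "good" or "bad", can be sketched with a robust median/MAD detector plus a domain predicate. This illustrates the strategy, not the thesis's method; the threshold `k` and the `domain_check` interface are assumptions of the sketch.

```python
import statistics

def analyze_outliers(values, domain_check, k=3.5):
    """Sketch: detect outliers with a robust median/MAD rule, then use
    domain knowledge to separate 'good' outliers (surprising but valid
    observations) from 'bad' ones (noise or errors). domain_check(v)
    encodes what is plausible in the application domain."""
    med = statistics.median(values)
    mad = statistics.median([abs(v - med) for v in values]) or 1e-12
    report = []
    for v in values:
        if abs(v - med) / mad > k:          # statistically anomalous
            kind = "good" if domain_check(v) else "bad"
            report.append((v, kind))
    return report
```

The median/MAD rule is used here because, unlike a mean/standard-deviation rule, it is not itself distorted by the extreme values it is meant to flag.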