Results 1-10 of 379
Survey of clustering algorithms
IEEE Transactions on Neural Networks, 2005
Cited by 231 (3 self)
Data analysis plays an indispensable role for understanding various phenomena. Cluster analysis, primitive exploration with little or no prior knowledge, consists of research developed across a wide variety of communities. The diversity, on one hand, equips us with many tools. On the other hand, the profusion of options causes confusion. We survey clustering algorithms for data sets appearing in statistics, computer science, and machine learning, and illustrate their applications in some benchmark data sets, the traveling salesman problem, and bioinformatics, a new field attracting intensive efforts. Several tightly related topics, proximity measure, and cluster validation, are also discussed.
Improved heterogeneous distance functions
Journal of Artificial Intelligence Research, 1997
Cited by 199 (10 self)
Instance-based learning techniques typically handle continuous and linear input values well, but often do not handle nominal input attributes appropriately. The Value Difference Metric (VDM) was designed to find reasonable distance values between nominal attribute values, but it largely ignores continuous attributes, requiring discretization to map continuous values into nominal values. This paper proposes three new heterogeneous distance functions, called the Heterogeneous Value Difference Metric (HVDM), the Interpolated Value Difference Metric (IVDM), and the Windowed Value Difference Metric (WVDM). These new distance functions are designed to handle applications with nominal attributes, continuous attributes, or both. In experiments on 48 applications the new distance metrics achieve higher classification accuracy on average than three previous distance functions on those datasets that have both nominal and continuous attributes.
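The abstract names the distance functions but gives no formulas. The sketch below shows one way a heterogeneous distance of this kind can be built, assuming a commonly cited HVDM-style form: nominal attributes are compared through their per-class conditional frequencies, and continuous attributes through an absolute difference scaled by four standard deviations. The function names (`fit`, `hvdm`), the handling of unseen values, and the toy data are illustrative assumptions, not taken from the paper.

```python
import math
from collections import Counter, defaultdict


def fit(X, y, nominal):
    """Collect per-attribute statistics from training rows X and labels y.

    nominal is the set of column indices holding nominal values.
    """
    classes = sorted(set(y))
    stats = {}
    for a in range(len(X[0])):
        column = [row[a] for row in X]
        if a in nominal:
            counts = defaultdict(Counter)  # value -> class -> count
            for value, label in zip(column, y):
                counts[value][label] += 1
            stats[a] = dict(counts)
        else:
            mean = sum(column) / len(column)
            variance = sum((v - mean) ** 2 for v in column) / len(column)
            stats[a] = math.sqrt(variance)  # population standard deviation
    return classes, stats


def hvdm(u, v, classes, stats, nominal):
    """HVDM-style distance between rows u and v (assumed form, see lead-in)."""
    total = 0.0
    for a, (p, q) in enumerate(zip(u, v)):
        if p == q:
            continue  # identical values contribute nothing
        if a in nominal:
            cp = stats[a].get(p, Counter())
            cq = stats[a].get(q, Counter())
            n_p, n_q = sum(cp.values()), sum(cq.values())
            if n_p == 0 or n_q == 0:
                total += 1.0  # assumption: unseen value gets distance 1
            else:
                # squared difference of class-conditional frequencies
                total += sum((cp[c] / n_p - cq[c] / n_q) ** 2 for c in classes)
        else:
            sigma = stats[a]
            d = abs(p - q) / (4 * sigma) if sigma > 0 else 0.0
            total += d * d
    return math.sqrt(total)


# Toy data: column 0 is nominal, column 1 is continuous.
X = [["red", 1.0], ["red", 2.0], ["blue", 3.0], ["blue", 4.0]]
y = ["a", "a", "b", "b"]
NOMINAL = {0}
CLASSES, STATS = fit(X, y, NOMINAL)
```

Scaling both kinds of attribute into a comparable range is what lets them be summed in one distance; without it, a continuous attribute with a large numeric range would dominate the nominal ones.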
The Helmholtz Machine
1995
Cited by 194 (22 self)
Discovering the structure inherent in a set of patterns is a fundamental aim of statistical inference or learning. One fruitful approach is to build a parameterized stochastic generative model, independent draws from which are likely to produce the patterns. For all but the simplest generative models, each pattern can be generated in exponentially many ways. It is thus intractable to adjust the parameters to maximize the probability of the observed patterns. We describe a way of finessing this combinatorial explosion by maximizing an easily computed lower bound on the probability of the observations. Our method can be viewed as a form of hierarchical self-supervised learning that may relate to the function of bottom-up and top-down cortical processing pathways.
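The lower bound mentioned in the abstract can be illustrated with the standard variational inequality: for any distribution Q over the ways alpha of generating a pattern d, log p(d) >= sum over alpha of Q(alpha) * log(p(alpha, d) / Q(alpha)), with equality when Q is the true posterior. The sketch below checks this numerically on a tiny discrete model. The joint probabilities and the function names are made-up illustrations, not values or code from the paper.

```python
import math


def log_evidence(joint):
    """log p(d), obtained by summing p(alpha, d) over all explanations alpha."""
    return math.log(sum(joint))


def lower_bound(joint, q):
    """Variational bound: sum_alpha Q(alpha) * log(p(alpha, d) / Q(alpha))."""
    return sum(qa * math.log(pa / qa) for pa, qa in zip(joint, q) if qa > 0)


# Hypothetical joint probabilities p(alpha, d) for four explanations of one pattern d.
JOINT = [0.10, 0.05, 0.02, 0.03]

# An arbitrary recognition distribution gives a bound strictly below log p(d).
UNIFORM = [0.25, 0.25, 0.25, 0.25]

# The exact posterior p(alpha | d) makes the bound tight.
POSTERIOR = [p / sum(JOINT) for p in JOINT]

if __name__ == "__main__":
    print("log p(d)          :", log_evidence(JOINT))
    print("bound (uniform Q) :", lower_bound(JOINT, UNIFORM))
    print("bound (posterior) :", lower_bound(JOINT, POSTERIOR))
```

With four explanations the exact sum is trivial; the point of the bound is that it stays computable when the number of explanations is exponential and the sum is not.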
Nonlinear Neural Networks: Principles, Mechanisms, and Architectures
1988
Cited by 181 (20 self)
An historical discussion is provided of the intellectual trends that caused nineteenth century interdisciplinary studies of physics and psychobiology by leading scientists such as Helmholtz, Maxwell, and Mach to splinter into separate twentieth-century scientific movements. The nonlinear, nonstationary, and nonlocal nature of behavioral and brain data are emphasized. Three sources of contemporary neural network research (the binary, linear, and continuous-nonlinear models) are noted. The remainder of the article describes results about continuous-nonlinear models: Many models of content-addressable memory are shown to be special cases of the Cohen-Grossberg model and global Liapunov function, including the additive, brain-state-in-a-box, McCulloch-Pitts, Boltzmann machine, Hartline-Ratliff-Miller, shunting, masking field, bidirectional associative memory, Volterra-Lotka, Gilpin-Ayala, and Eigen-Schuster models. A Liapunov functional method is described for proving global limit or oscillation theorems for nonlinear competitive systems when their decision schemes are globally consistent or inconsistent, respectively. The former case is illustrated by a model of a globally stable economic market, and the latter case is illustrated by a model of the voting paradox. Key properties of shunting competitive feedback networks are summarized, including the role of sigmoid signalling, automatic gain control, competitive choice and quantization, tunable filtering, total activity normalization, and noise suppression in pattern transformation and memory storage applications. Connections to models of competitive learning, vector quantization, and categorical perception are noted. Adaptive resonance
Hierarchical Bayesian Inference in the Visual Cortex
2002
Cited by 173 (0 self)
In this paper, we propose a Bayesian theory of hierarchical cortical computation based both on (a) the mathematical and computational ideas of computer vision and pattern theory and on (b) recent neurophysiological experimental evidence. We [1, 2] have proposed that Grenander's pattern theory [3] could potentially model the brain as a generative model in such a way that feedback serves to disambiguate and 'explain away' the earlier representation. The Helmholtz machine [4, 5] was an excellent step towards approximating this proposal, with feedback implementing priors. Its development, however, was rather limited, dealing only with binary images. Moreover, its feedback mechanisms were engaged only during the learning of the feedforward connections but not during perceptual inference, though the Gibbs sampling process for inference can potentially be interpreted as top-down feedback disambiguating low level representations. Rao and Ballard's predictive coding/Kalman filter model [6] did integrate generative feedback in the perceptual inference process, but it was primarily a linear model and thus severely limited in practical utility. The data-driven Markov Chain Monte Carlo approach of Zhu and colleagues [7, 8] might be the most successful recent application of this proposal in solving real and difficult computer vision problems using generative models, though its connection to the visual cortex has not been explored. Here, we bring in a powerful and widely applicable paradigm from artificial intelligence and computer vision to propose some new ideas about the algorithms of visual cortical processing and the nature of representations in the visual cortex. We will review some of our and others' neurophysiological experimental data to lend support to these ideas
Constructive Incremental Learning from Only Local Information
1998
Cited by 160 (37 self)
... This article illustrates the potential learning capabilities of purely local learning and offers an interesting and powerful approach to learning with receptive fields.
Bidirectional Associative Memories
IEEE Transactions on Systems, Man, and Cybernetics, 1988
Cited by 155 (3 self)
Stability and encoding properties of two-layer nonlinear feedback neural networks are examined. Bidirectionality, forward and backward information flow, is introduced in neural nets to produce two-way associative search for stored associations (A_i, B_i). Passing information through M gives one direction; passing it through its transpose M^T gives the other. A bidirectional associative memory (BAM) behaves as a heteroassociative content addressable memory (CAM), storing and recalling the vector pairs (A_1, B_1), ..., (A_m, B_m), where A ∈ {0,1}^n and B ∈ {0,1}^p. We prove that every n-by-p matrix M is a bidirectionally stable heteroassociative CAM for both binary/bipolar and continuous neurons a_i and b_j. When the BAM neurons are activated, the network quickly evolves to a stable state of two-pattern reverberation, or resonance. The stable reverberation corresponds to a system energy local minimum. Heteroassociative information is encoded in a BAM by summing correlation matrices. The BAM storage capacity for reliable recall is roughly m < min(n, p). No more heteroassociative pairs can be reliably stored and recalled than the lesser of the dimensions of the pattern spaces {0,1}^n and {0,1}^p. The Appendix shows that it is better on average to use bipolar {-1,1} coding than binary {0,1} coding of heteroassociative pairs (A_i, B_i). BAM encoding and decoding are combined in the adaptive BAM, which extends global bidirectional stability to real-time unsupervised learning. Temporal patterns (A_1, ..., A_m) are represented as ordered lists of binary/bipolar vectors and stored in a temporal associative memory (TAM) n-by-n matrix M as a limit cycle of the dynamical system. Forward recall proceeds through M, backward recall through M^T. Temporal patterns are stored by summing contiguous bipolar...
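The encoding and recall scheme described in this abstract can be sketched in a few lines: each binary pair is converted to bipolar form, the outer products are summed into M, and recall alternates between passing a pattern through M and through its transpose until the pair stops changing. This is a minimal illustration under stated assumptions. The function names, the threshold rule (a neuron keeps its previous state when its input is exactly zero), and the two toy pairs are choices made for the sketch, not details given in the text above.

```python
def to_bipolar(bits):
    """Map a binary {0,1} vector to a bipolar {-1,1} vector."""
    return [2 * b - 1 for b in bits]


def encode(pairs):
    """Build the n-by-p matrix M by summing bipolar correlation (outer-product) matrices."""
    n, p = len(pairs[0][0]), len(pairs[0][1])
    M = [[0] * p for _ in range(n)]
    for a, b in pairs:
        x, y = to_bipolar(a), to_bipolar(b)
        for i in range(n):
            for j in range(p):
                M[i][j] += x[i] * y[j]
    return M


def threshold(net, previous):
    """Binary threshold at zero; assumed rule: keep the old state on zero input."""
    return [1 if s > 0 else 0 if s < 0 else old for s, old in zip(net, previous)]


def forward(a, M, b_prev):
    """Pass pattern a through M to update the B layer."""
    net = [sum(a[i] * M[i][j] for i in range(len(a))) for j in range(len(M[0]))]
    return threshold(net, b_prev)


def backward(b, M, a_prev):
    """Pass pattern b through the transpose of M to update the A layer."""
    net = [sum(M[i][j] * b[j] for j in range(len(b))) for i in range(len(M))]
    return threshold(net, a_prev)


def recall(a, M, max_iters=100):
    """Alternate forward and backward passes until the pair (a, b) is stable."""
    b = forward(a, M, [0] * len(M[0]))
    for _ in range(max_iters):
        a_next = backward(b, M, a)
        b_next = forward(a_next, M, b)
        if a_next == a and b_next == b:
            break
        a, b = a_next, b_next
    return a, b


# Two toy associations with n = 6 and p = 4 (illustrative values).
PAIRS = [
    ([1, 0, 1, 0, 1, 0], [1, 1, 0, 0]),
    ([1, 1, 1, 0, 0, 0], [1, 0, 1, 0]),
]
M = encode(PAIRS)

if __name__ == "__main__":
    # A noisy version of the first A pattern (first bit flipped).
    print(recall([0, 0, 1, 0, 1, 0], M))
```

Two pairs are stored here, which stays under the rough capacity limit m < min(n, p) = 4 quoted in the abstract.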
A survey of outlier detection methodologies
Artificial Intelligence Review, 2004
Cited by 153 (3 self)
Outlier detection has been used for centuries to detect and, where appropriate, remove anomalous observations from data. Outliers arise due to mechanical faults, changes in system behaviour, fraudulent behaviour, human error, instrument error or simply through natural deviations in populations. Their detection can identify system faults and fraud before they escalate with potentially catastrophic consequences. It can identify errors and remove their contaminating effect on the data set and as such to purify the data for processing. The original outlier detection methods were arbitrary but now, principled and systematic techniques are used, drawn from the full gamut of Computer Science and Statistics. In this paper, we introduce a survey of contemporary techniques for outlier detection. We identify their respective motivations and distinguish their advantages and disadvantages in a comparative review.
Reduction Techniques for Instance-Based Learning Algorithms
Machine Learning
Cited by 130 (2 self)
Instance-based learning algorithms are often faced with the problem of deciding which instances to store for use during generalization. Storing too many instances can result in large memory requirements and slow execution speed, and can cause an oversensitivity to noise. This paper has two main purposes. First, it provides a survey of existing algorithms used to reduce storage requirements in instance-based learning algorithms and other exemplar-based algorithms. Second, it proposes six additional reduction algorithms called DROP1–DROP5 and DEL (three of which were first described in Wilson & Martinez, 1997c, as RT1–RT3) that can be used to remove instances from the concept description. These algorithms and 10 algorithms from the survey are compared on 31 classification tasks. Of those algorithms that provide substantial storage reduction, the DROP algorithms have the highest average generalization accuracy in these experiments, especially in the presence of uniform class noise.
Keywords: instance-based learning, nearest neighbor, instance reduction, pruning, classification
Connectionist models of recognition memory: Constraints imposed by learning and forgetting functions
Psychological Review, 1990
Cited by 102 (4 self)
Multilayer connectionist models of memory based on the encoder model using the backpropagation learning rule are evaluated. The models are applied to standard recognition memory procedures in which items are studied sequentially and then tested for retention. Sequential learning in these models leads to 2 major problems. First, well-learned information is forgotten rapidly as new information is learned. Second, discrimination between studied items and new items either decreases or is nonmonotonic as a function of learning. To address these problems, manipulations of the network within the multilayer model and several variants of the multilayer model were examined, including a model with pre-learned memory and a context model, but none solved the problems. The problems discussed provide limitations on connectionist models applied to human memory and in tasks where information to be learned is not all available during learning. The first stage of the connectionist revolution in psychology is reaching maturity and perhaps drawing to an end. This stage has been concerned with the exploration of classes of models, and the criteria that have been used to evaluate the success of an application have been necessarily loose. In the early stages