Results 1–10 of 37
A neuropsychological theory of multiple systems in category learning
 Psychological Review
, 1998
Abstract

Cited by 229 (24 self)
A neuropsychological theory is proposed that assumes category learning is a competition between separate verbal and implicit (i.e., procedural-learning-based) categorization systems. The theory assumes that the caudate nucleus is an important component of the implicit system and that the anterior cingulate and prefrontal cortices are critical to the verbal system. In addition to making predictions for normal human adults, the theory makes specific predictions for children, elderly people, and patients suffering from Parkinson's disease, Huntington's disease, major depression, amnesia, or lesions of the prefrontal cortex. Two separate formal descriptions of the theory are also provided. One describes trial-by-trial learning, and the other describes global dynamics. The theory is tested on published neuropsychological data and on category learning data with normal adults.
Rules and Exemplars in Category Learning
 Journal of Experimental Psychology: General
, 1998
Abstract

Cited by 142 (10 self)
...characterized by descriptions of each module and how each serves in those tasks for which it is best suited. However, these theories often do not emphasize how modules interact in producing responses and in learning. In this article we will develop a modular theory of categorization that follows from two distinct accounts of this behavior. The first account is that of rule-based theories of categorization. These theories emerge from a philosophical tradition in which concepts and categorization are described in terms of definitional rules. For example, if a living thing has a wide, flat tail and constructs dams by cutting down trees with its ...

This work was supported by Indiana University Cognitive Science Program Fellowships and by NIMH Research Training Grant PHS T32 MH19879-03 to Erickson, and in part by NIMH FIRST Award 1 R29 MH51572-01 to Kruschke. This research was reported as a poster at the 1996 Cognitive Science Society Conference in San Diego, CA. We thank ...
Vector Quantization with Complexity Costs
, 1993
Abstract

Cited by 54 (18 self)
Vector quantization is a data compression method where a set of data points is encoded by a reduced set of reference vectors, the codebook. We discuss a vector quantization strategy which jointly optimizes distortion errors and the codebook complexity, thereby determining the size of the codebook. A maximum entropy estimation of the cost function yields an optimal number of reference vectors, their positions, and their assignment probabilities. The dependence of the codebook density on the data density for different complexity functions is investigated in the limit of asymptotic quantization levels. How different complexity measures influence the efficiency of vector quantizers is studied for the task of image compression, i.e., we quantize the wavelet coefficients of gray-level images and measure the reconstruction error. Our approach establishes a unifying framework for different quantization methods like K-means clustering and its fuzzy version, entropy-constrained vector quantizati...
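As a concrete reference point, the simplest special case this framework unifies, plain K-means codebook construction, can be sketched in a few lines. The data and the name `kmeans_codebook` are hypothetical illustrations, not from the paper:

```python
import random

def kmeans_codebook(points, k, iters=20, seed=0):
    """Lloyd's algorithm: encode data points by a k-entry codebook of
    reference vectors chosen to reduce squared distortion."""
    rng = random.Random(seed)
    codebook = rng.sample(points, k)
    for _ in range(iters):
        # Assignment step: map each point to its nearest reference vector.
        clusters = [[] for _ in range(k)]
        for p in points:
            j = min(range(k),
                    key=lambda i: sum((a - b) ** 2 for a, b in zip(p, codebook[i])))
            clusters[j].append(p)
        # Update step: move each reference vector to its cluster's centroid.
        for i, c in enumerate(clusters):
            if c:
                codebook[i] = tuple(sum(xs) / len(c) for xs in zip(*c))
    return codebook

# Toy data: two well-separated groups, compressed to a 2-entry codebook.
data = [(0.0, 0.0), (0.1, 0.2), (5.0, 5.0), (5.2, 4.9)]
cb = kmeans_codebook(data, k=2)
```

The entropy-constrained methods in the abstract add a complexity penalty to the distortion objective above, which is what lets the codebook size itself be optimized rather than fixed in advance.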
Learning Rate Schedules For Faster Stochastic Gradient Search
, 1992
Abstract

Cited by 42 (0 self)
Stochastic gradient descent is a general algorithm that includes LMS, online backpropagation, and adaptive k-means clustering as special cases. The standard choices of the learning rate η (both adaptive and fixed functions of time) often perform quite poorly. In contrast, our recently proposed class of "search then converge" (STC) learning rate schedules (Darken and Moody, 1990b, 1991) displays the theoretically optimal asymptotic convergence rate and a superior ability to escape from poor local minima. However, the user is responsible for setting a key parameter. We propose here a new methodology for creating the first automatically adapting learning rates that achieve the optimal rate of convergence.

INTRODUCTION

The stochastic gradient descent algorithm is

    ΔW(t) = −η ∇_W f[W(t), X(t)],    (1)

where η is the learning rate, t is the "time", and X(t) is the independent random exemplar chosen at time t. The purpose of the algorithm is to find a parameter vector W that minim...
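The schedule family the abstract describes can be illustrated with a minimal sketch, assuming the simple "search then converge" form η(t) = η₀ / (1 + t/τ): roughly constant during the early search phase, decaying like η₀τ/t asymptotically. The objective, constants, and function names are invented for illustration:

```python
import random

def stc_rate(t, eta0=0.5, tau=10.0):
    """'Search then converge' schedule: approximately eta0 for t << tau
    (search phase), decaying like eta0 * tau / t for t >> tau (converge
    phase), which gives the optimal O(1/t) asymptotic rate."""
    return eta0 / (1.0 + t / tau)

def sgd(grad, w, steps=2000):
    """Stochastic gradient descent per Eq. (1): W <- W - eta(t) * grad(W, X(t))."""
    rng = random.Random(1)
    for t in range(steps):
        x = rng.gauss(0.0, 1.0)          # random exemplar X(t)
        w -= stc_rate(t) * grad(w, x)
    return w

# Hypothetical objective f(W) = (W - 3)^2 / 2 seen through noisy exemplars.
w_star = sgd(lambda w, x: (w - 3.0) + 0.1 * x, w=0.0)
```

The "key parameter" the user must set in this family is the switch point τ; the paper's contribution is adapting the schedule automatically rather than hand-tuning it.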
Scaling EM (Expectation-Maximization) Clustering to Large Databases
, 1999
Abstract

Cited by 40 (0 self)
Practical statistical clustering algorithms typically center upon an iterative refinement optimization procedure to compute a locally optimal clustering solution that maximizes the fit to data. These algorithms typically require many database scans to converge, and within each scan they require access to every record in the data table. For large databases, the scans become prohibitively expensive. We present a scalable implementation of the Expectation-Maximization (EM) algorithm. The database community has focused on distance-based clustering schemes, and methods have been developed to cluster either numerical or categorical data. Unlike distance-based algorithms (such as K-Means), EM constructs proper statistical models of the underlying data source and naturally generalizes to cluster databases containing both discrete-valued and continuous-valued data. The scalable method is based on a decomposition of the basic statistics the algorithm needs: identifying regions of the data that...
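For orientation, a minimal (non-scalable) EM iteration for a two-component, one-dimensional Gaussian mixture shows the per-scan statistics the scalable version must summarize. The toy data, initialization, and function name are assumptions for illustration, not the paper's implementation:

```python
import math

def em_gmm_1d(data, iters=50):
    """Two-component 1-D Gaussian-mixture EM: alternate an E-step (soft
    responsibilities) with an M-step (weighted re-estimates of means,
    variances, and mixing weights), increasing the data likelihood."""
    mu = [min(data), max(data)]          # crude initialization
    var = [1.0, 1.0]
    pi = [0.5, 0.5]
    for _ in range(iters):
        # E-step: responsibility of each component for each point.
        resp = []
        for x in data:
            p = []
            for k in range(2):
                norm = math.sqrt(2.0 * math.pi * var[k])
                p.append(pi[k] * math.exp(-(x - mu[k]) ** 2 / (2.0 * var[k])) / norm)
            s = p[0] + p[1]
            resp.append([p[0] / s, p[1] / s])
        # M-step: re-estimate parameters from the soft assignments.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            mu[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
            var[k] = sum(r[k] * (x - mu[k]) ** 2 for r, x in zip(resp, data)) / nk + 1e-6
            pi[k] = nk / len(data)
    return mu, var, pi

data = [0.1, -0.2, 0.0, 5.1, 4.9, 5.0]
mu, var, pi = em_gmm_1d(data)
```

The weighted sums in the M-step (counts, weighted sums, weighted squared sums) are exactly the sufficient statistics that a scalable variant can accumulate per region of data instead of per record.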
Rule Inference for Financial Prediction using Recurrent Neural Networks
, 1997
Abstract

Cited by 22 (0 self)
This paper considers the prediction of noisy time series data, specifically, the prediction of foreign exchange rate data. A novel hybrid neural network algorithm for noisy time series prediction is presented which exhibits excellent performance on the problem. The method is motivated by consideration of how neural networks work, and by fundamental difficulties with random correlations when dealing with small sample sizes and high-noise data. The method permits the inference and extraction of rules. One of the greatest complaints against neural networks is that it is hard to figure out exactly what they are doing; this work provides one answer for the internal workings of the network. Furthermore, these rules can be used to gain insight into both the real-world system and the predictor. This paper focuses on noisy time series prediction and rule inference; use of the system in trading would typically involve the utilization of other financial indicators and domain knowledge. 1 Intr...
Using curvature information for fast stochastic search
 In Advances in Neural Information Processing Systems 9
, 1996
Abstract

Cited by 13 (1 self)
We present an algorithm for fast stochastic gradient descent that uses a nonlinear adaptive momentum scheme to optimize the late-time convergence rate. The algorithm makes effective use of curvature information, requires only O(n) storage and computation, and delivers convergence rates close to the theoretical optimum. We demonstrate the technique on linear and large nonlinear backprop networks.

Improving Stochastic Search

Learning algorithms that perform gradient descent on a cost function can be formulated in either stochastic (online) or batch form. The stochastic version takes the form

    ω_{t+1} = ω_t + μ_t G(ω_t, x_t),    (1)

where ω_t is the current weight estimate, μ_t is the learning rate, G is minus the instantaneous gradient estimate, and x_t is the input at time t. One obtains the corresponding batch-mode learning rule by taking μ constant and averaging G over
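A plain constant-coefficient momentum sketch of Eq. (1), not the paper's adaptive nonlinear scheme, illustrates the velocity mechanism that accelerates late-time convergence in shallow, low-curvature directions. All names and constants are illustrative:

```python
def sgd_momentum(grad, w, eta=0.1, beta=0.9, steps=300):
    """Gradient descent with a constant momentum (velocity) term: the
    accumulated velocity lets steps grow along directions of persistently
    small gradient, i.e., low curvature."""
    v = 0.0
    for _ in range(steps):
        g = -grad(w)           # G in Eq. (1) is minus the gradient estimate
        v = beta * v + eta * g
        w += v
    return w

# Hypothetical low-curvature quadratic: f(w) = 0.025 * (w - 2)^2.
w_end = sgd_momentum(lambda w: 0.05 * (w - 2.0), w=10.0)
```

The paper's contribution replaces the fixed coefficient `beta` with a nonlinear, curvature-sensitive rule while keeping the same O(n) per-step cost shown here.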
Clustering in Massive Data Sets
 Handbook of massive data sets
, 1999
Abstract

Cited by 11 (0 self)
We review the time and storage costs of search and clustering algorithms. We exemplify these, based on case studies in astronomy, information retrieval, visual user interfaces, chemical databases, and other areas. Sections 2 to 6 relate to nearest neighbor searching, an elemental form of clustering, and a basis for the clustering algorithms to follow. Sections 7 to 11 review a number of families of clustering algorithms. Sections 12 to 14 relate to visual or image representations of data sets, from which a number of interesting algorithmic developments arise.
Adaptive Computational Chemotaxis in Bacterial Foraging Optimization: An Analysis
 IEEE Computer Society Press, ISBN 0-7695-3109-1
, 2008
Abstract

Cited by 10 (5 self)
Some researchers have illustrated how individuals and groups of bacteria forage for nutrients and have modeled this behavior as a distributed optimization process, called Bacterial Foraging Optimization (BFOA). One of the major driving forces of BFOA is the chemotactic movement of a virtual bacterium, which models a trial solution of the optimization problem. In this article, we analyze the chemotactic step of a one-dimensional BFOA in the light of the classical Gradient Descent Algorithm (GDA). Our analysis points out that the chemotaxis employed in BFOA may result in sustained oscillation, especially on a flat fitness landscape, when a bacterium cell is very near the optimum. To accelerate convergence near the optimum, we make the chemotactic step size C adaptive. Computer simulations over several numerical benchmarks indicate that BFOA with the new chemotactic operation shows better convergence behavior than the classical BFOA.
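One way to make the chemotactic step size C adaptive, assumed here purely for illustration, is to let it shrink with the fitness value J so that steps vanish near the optimum instead of oscillating. A one-dimensional tumble-and-swim sketch (all names and constants are invented):

```python
import random

def chemotaxis_1d(J, x, steps=200, lam=10.0, seed=3):
    """One-dimensional chemotactic search on fitness J (to be minimized):
    tumble to a random direction and move only if the step improves J.
    The adaptive step C = |J(x)| / (|J(x)| + lam) goes to zero as J -> 0,
    damping the sustained oscillation of a fixed-step bacterium near
    the optimum.  (This adaptive rule is one illustrative variant.)"""
    rng = random.Random(seed)
    for _ in range(steps):
        c = abs(J(x)) / (abs(J(x)) + lam)   # adaptive chemotactic step size
        d = rng.choice([-1.0, 1.0])         # tumble: pick a random direction
        if J(x + c * d) < J(x):             # swim only while fitness improves
            x += c * d
    return x

# Toy fitness landscape J(x) = x^2, bacterium starting at x = 2.
x_best = chemotaxis_1d(lambda x: x * x, x=2.0)
```

With a fixed step size, the same bacterium would keep overshooting the minimum; tying C to |J| is what removes the oscillation the abstract analyzes.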
Transparent Fuzzy Systems: Modeling and Control
, 2002
Abstract

Cited by 9 (4 self)
During the last twenty years, fuzzy logic has been successfully applied to many modeling and control problems. One of the reasons for this success is that fuzzy logic provides human-friendly and understandable knowledge representation that can be utilized in expert knowledge extraction and implementation. It is observed, however, that transparency, which is vital for undistorted information transfer, is not a default property of fuzzy systems; moreover, applying algorithms that identify fuzzy systems from data will most likely destroy any semantics a fuzzy system had after initialization. This thesis thoroughly investigates the issues related to transparency. Fuzzy systems are generally divided into two classes, and it is shown here that different definitions of transparency apply to each class. For standard fuzzy systems that use fuzzy propositions in IF-THEN rules, explicit transparency constraints have been derived. Based on these constraints, exploitation and modification schemes for existing identification algorithms are suggested; moreover, a new algorithm for training standard fuzzy systems has been proposed, with considerable potential to reduce the gap between accuracy and transparency in fuzzy modeling. For 1st-order Takagi-Sugeno (TS) systems that are interpreted in terms of local linear models, such conditions cannot be derived, owing to the system architecture and the undesirable interpolation properties of 1st-order TS systems. It is, however, possible to solve the transparency preservation problem in the context of modeling with another proposed method that benefits from rule activation degree exponents. 1st-order TS systems that admit valid interpretation of local models as linearizations of the modeled system are useful, for example, in gain-scheduled control.
Transparent standard fuzzy systems, on the other hand, are vital to the branch of intelligent control that seeks solutions by emulating the reasoning and decision processes of human beings, not limited to knowledge-based fuzzy control. By performing a local inversion of the modeled system, it is possible to extract relevant control information, which is demonstrated with a fed-batch fermentation application. The more a fuzzy controller resembles the expert's role in a control task, the higher the implementation benefit of the fuzzy engine will be. For example, a hierarchy of fuzzy (and non-fuzzy) controllers simulates an existing hierarchy in the human decision process and leads to improved control performance. Another benefit of hierarchy is that it entails problem decomposition. This is especially important with fuzzy logic, where a large number of system variables leads to an exponential explosion of rules (the curse of dimensionality) that makes controller design extremely difficult or even impossible. The advantages of hierarchical control are illustrated with truck backer-upper applications.
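A transparent standard fuzzy system of the kind discussed above can be illustrated with two readable IF-THEN rules over triangular membership functions, defuzzified by a weighted average of rule activations. The rule base and every constant here are invented for illustration and are not taken from the thesis:

```python
def tri(x, a, b, c):
    """Triangular membership function peaking at b on support [a, c]."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x < b else (c - x) / (c - b)

def fuzzy_heater(temp):
    """Two transparent IF-THEN rules (hypothetical heater controller):
         IF temp is Cold THEN power is high (1.0)
         IF temp is Warm THEN power is low  (0.2)
       Output is the activation-weighted average of the rule consequents."""
    w_cold = tri(temp, -10.0, 0.0, 20.0)   # membership in 'Cold'
    w_warm = tri(temp, 10.0, 25.0, 40.0)   # membership in 'Warm'
    num = w_cold * 1.0 + w_warm * 0.2
    den = w_cold + w_warm
    return num / den if den else 0.0
```

Because each rule reads as a plain linguistic statement with a clearly localized membership function, an expert can audit the controller directly; data-driven identification that shifts or overlaps these memberships is exactly what erodes the transparency the thesis studies.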