Results 1  10
of
2,791
Data Clustering: A Review
 ACM COMPUTING SURVEYS
, 1999
"... Clustering is the unsupervised classification of patterns (observations, data items, or feature vectors) into groups (clusters). The clustering problem has been addressed in many contexts and by researchers in many disciplines; this reflects its broad appeal and usefulness as one of the steps in exp ..."
Abstract

Cited by 1308 (13 self)
 Add to MetaCart
Clustering is the unsupervised classification of patterns (observations, data items, or feature vectors) into groups (clusters). The clustering problem has been addressed in many contexts and by researchers in many disciplines; this reflects its broad appeal and usefulness as one of the steps in exploratory data analysis. However, clustering is a difficult problem combinatorially, and differences in assumptions and contexts in different communities has made the transfer of useful generic concepts and methodologies slow to occur. This paper presents an overview of pattern clustering methods from a statistical pattern recognition perspective, with a goal of providing useful advice and references to fundamental concepts accessible to the broad community of clustering practitioners. We present a taxonomy of clustering techniques, and identify crosscutting themes and recent advances. We also describe some important applications of clustering algorithms such as image segmentation, object recognition, and information retrieval.
Markov chains for exploring posterior distributions
 Annals of Statistics
, 1994
"... Your use of the JSTOR archive indicates your acceptance of JSTOR's Terms and Conditions of Use, available at ..."
Abstract

Cited by 753 (6 self)
 Add to MetaCart
Your use of the JSTOR archive indicates your acceptance of JSTOR's Terms and Conditions of Use, available at
No Free Lunch Theorems for Optimization
, 1997
"... A framework is developed to explore the connection between effective optimization algorithms and the problems they are solving. A number of “no free lunch ” (NFL) theorems are presented which establish that for any algorithm, any elevated performance over one class of problems is offset by performan ..."
Abstract

Cited by 652 (10 self)
 Add to MetaCart
A framework is developed to explore the connection between effective optimization algorithms and the problems they are solving. A number of “no free lunch ” (NFL) theorems are presented which establish that for any algorithm, any elevated performance over one class of problems is offset by performance over another class. These theorems result in a geometric interpretation of what it means for an algorithm to be well suited to an optimization problem. Applications of the NFL theorems to informationtheoretic aspects of optimization and benchmark measures of performance are also presented. Other issues addressed include timevarying optimization problems and a priori “headtohead” minimax distinctions between optimization algorithms, distinctions that result despite the NFL theorems’ enforcing of a type of uniformity over all algorithms.
Dynamic Bayesian Networks: Representation, Inference and Learning
, 2002
"... Modelling sequential data is important in many areas of science and engineering. Hidden Markov models (HMMs) and Kalman filter models (KFMs) are popular for this because they are simple and flexible. For example, HMMs have been used for speech recognition and biosequence analysis, and KFMs have bee ..."
Abstract

Cited by 565 (3 self)
 Add to MetaCart
Modelling sequential data is important in many areas of science and engineering. Hidden Markov models (HMMs) and Kalman filter models (KFMs) are popular for this because they are simple and flexible. For example, HMMs have been used for speech recognition and biosequence analysis, and KFMs have been used for problems ranging from tracking planes and missiles to predicting the economy. However, HMMs
and KFMs are limited in their “expressive power”. Dynamic Bayesian Networks (DBNs) generalize HMMs by allowing the state space to be represented in factored form, instead of as a single discrete random variable. DBNs generalize KFMs by allowing arbitrary probability distributions, not just (unimodal) linearGaussian. In this thesis, I will discuss how to represent many different kinds of models as DBNs, how to perform exact and approximate inference in DBNs, and how to learn DBN models from sequential data.
In particular, the main novel technical contributions of this thesis are as follows: a way of representing
Hierarchical HMMs as DBNs, which enables inference to be done in O(T) time instead of O(T 3), where T is the length of the sequence; an exact smoothing algorithm that takes O(log T) space instead of O(T); a simple way of using the junction tree algorithm for online inference in DBNs; new complexity bounds on exact online inference in DBNs; a new deterministic approximate inference algorithm called factored frontier; an analysis of the relationship between the BK algorithm and loopy belief propagation; a way of
applying RaoBlackwellised particle filtering to DBNs in general, and the SLAM (simultaneous localization
and mapping) problem in particular; a way of extending the structural EM algorithm to DBNs; and a variety of different applications of DBNs. However, perhaps the main value of the thesis is its catholic presentation of the field of sequential data modelling.
A learning algorithm for Boltzmann machines
 Cognitive Science
, 1985
"... The computotionol power of massively parallel networks of simple processing elements resides in the communication bandwidth provided by the hardware connections between elements. These connections con allow a significant fraction of the knowledge of the system to be applied to an instance of a probl ..."
Abstract

Cited by 433 (14 self)
 Add to MetaCart
The computotionol power of massively parallel networks of simple processing elements resides in the communication bandwidth provided by the hardware connections between elements. These connections con allow a significant fraction of the knowledge of the system to be applied to an instance of a problem in o very short time. One kind of computation for which massively porollel networks appear to be well suited is large constraint satisfaction searches, but to use the connections efficiently two conditions must be met: First, a search technique that is suitable for parallel networks must be found. Second, there must be some way of choosing internal representations which allow the preexisting hardware connections to be used efficiently for encoding the constraints in the domain being searched. We describe a generol parallel search method, based on statistical mechanics, and we show how it leads to a general learning rule for modifying the connection strengths so as to incorporate knowledge obout o task domain in on efficient way. We describe some simple examples in which the learning algorithm creates internal representations thot ore demonstrobly the most efficient way of using the preexisting connectivity structure. 1.
Graph Drawing by Forcedirected Placement
, 1991
"... this paper, we introduce an algorithm that attempts to produce aestheticallypleasing, twodimensional pictures of graphs by doing simplified simulations of physical systems. We are concerned with drawing undirected graphs according to some generally accepted aesthetic criteria: 1. Distribute the v ..."
Abstract

Cited by 429 (0 self)
 Add to MetaCart
this paper, we introduce an algorithm that attempts to produce aestheticallypleasing, twodimensional pictures of graphs by doing simplified simulations of physical systems. We are concerned with drawing undirected graphs according to some generally accepted aesthetic criteria: 1. Distribute the vertices evenly in the frame. 2. Minimize edge crossings. 3. Make edge lengths uniform. 4. Reflect inherent symmetry. 5. Conform to the frame. Our algorithm does not explicitly strive for these goals, but does well at distributing vertices evenly, making edge lengths uniform, and reflecting symmetry. Our goals for the implementation are speed and simplicity. PREVIOUS WORK Our algorithm for drawing undirected graphs is based on the work of Eades which, in turn, evolved from a VLSI technique called forcedirected placement
Incorporating nonlocal information into information extraction systems by gibbs sampling
 In ACL
, 2005
"... Most current statistical natural language processing models use only local features so as to permit dynamic programming in inference, but this makes them unable to fully account for the long distance structure that is prevalent in language use. We show how to solve this dilemma with Gibbs sampling, ..."
Abstract

Cited by 392 (20 self)
 Add to MetaCart
Most current statistical natural language processing models use only local features so as to permit dynamic programming in inference, but this makes them unable to fully account for the long distance structure that is prevalent in language use. We show how to solve this dilemma with Gibbs sampling, a simple Monte Carlo method used to perform approximate inference in factored probabilistic models. By using simulated annealing in place of Viterbi decoding in sequence models such as HMMs, CMMs, and CRFs, it is possible to incorporate nonlocal structure while preserving tractable inference. We use this technique to augment an existing CRFbased information extraction system with longdistance dependency models, enforcing label consistency and extraction template consistency constraints. This technique results in an error reduction of up to 9 % over stateoftheart systems on two established information extraction tasks. 1
GSAT and Dynamic Backtracking
 Journal of Artificial Intelligence Research
, 1994
"... There has been substantial recent interest in two new families of search techniques. One family consists of nonsystematic methods such as gsat; the other contains systematic approaches that use a polynomial amount of justification information to prune the search space. This paper introduces a new te ..."
Abstract

Cited by 360 (14 self)
 Add to MetaCart
There has been substantial recent interest in two new families of search techniques. One family consists of nonsystematic methods such as gsat; the other contains systematic approaches that use a polynomial amount of justification information to prune the search space. This paper introduces a new technique that combines these two approaches. The algorithm allows substantial freedom of movement in the search space but enough information is retained to ensure the systematicity of the resulting analysis. Bounds are given for the size of the justification database and conditions are presented that guarantee that this database will be polynomial in the size of the problem in question. 1 INTRODUCTION The past few years have seen rapid progress in the development of algorithms for solving constraintsatisfaction problems, or csps. Csps arise naturally in subfields of AI from planning to vision, and examples include propositional theorem proving, map coloring and scheduling problems. The probl...