A tutorial on support vector machines for pattern recognition
Data Mining and Knowledge Discovery, 1998
Cited by 2295 (11 self)
The tutorial starts with an overview of the concepts of VC dimension and structural risk minimization. We then describe linear Support Vector Machines (SVMs) for separable and nonseparable data, working through a nontrivial example in detail. We describe a mechanical analogy, and discuss when SVM solutions are unique and when they are global. We describe how support vector training can be practically implemented, and discuss in detail the kernel mapping technique which is used to construct SVM solutions which are nonlinear in the data. We show how Support Vector machines can have very large (even infinite) VC dimension by computing the VC dimension for homogeneous polynomial and Gaussian radial basis function kernels. While very high VC dimension would normally bode ill for generalization performance, and while at present there exists no theory which shows that good generalization performance is guaranteed for SVMs, there are several arguments which support the observed high accuracy of SVMs, which we review. Results of some experiments which were inspired by these arguments are also presented. We give numerous examples and proofs of most of the key theorems. There is new material, and I hope that the reader will find that even old material is cast in a fresh light.
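As a small illustration of the kernel mapping mentioned in this abstract, the sketch below checks numerically that a degree-2 homogeneous polynomial kernel on two-dimensional inputs equals an ordinary dot product in an explicit feature space, and also evaluates a Gaussian radial basis function kernel. The function names, the feature map `phi_degree2`, and the width parameter `sigma` are illustrative assumptions, not notation taken from the tutorial.

```python
import math

def dot(u, v):
    """Euclidean dot product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))

def poly_kernel(x, y, p):
    """Homogeneous polynomial kernel (x . y)^p."""
    return dot(x, y) ** p

def rbf_kernel(x, y, sigma):
    """Gaussian RBF kernel exp(-||x - y||^2 / (2 sigma^2)); sigma is an assumed width."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

def phi_degree2(x):
    """Explicit feature map for p = 2 on 2-D inputs (hypothetical helper)."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2.0) * x1 * x2, x2 * x2)

x, y = (1.0, 2.0), (3.0, -1.0)
k_implicit = poly_kernel(x, y, 2)                    # kernel evaluated in input space
k_explicit = dot(phi_degree2(x), phi_degree2(y))     # same quantity in feature space
print(k_implicit, k_explicit)
```

The two printed values agree up to rounding, which is the point of the kernel mapping: the dot product in the mapped space is obtained without ever constructing the mapped vectors.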
On the mathematical foundations of learning
Bulletin of the American Mathematical Society, 2002
Cited by 224 (12 self)
The problem of learning is arguably at the very core of the problem of intelligence, both biological and artificial. T. Poggio and C.R. Shelton
Stable Function Approximation in Dynamic Programming
In Machine Learning: Proceedings of the Twelfth International Conference, 1995
Cited by 210 (5 self)
The success of reinforcement learning in practical problems depends on the ability to combine function approximation with temporal difference methods such as value iteration. Experiments in this area have produced mixed results; there have been both notable successes and notable disappointments. Theory has been scarce, mostly due to the difficulty of reasoning about function approximators that generalize beyond the observed data. We provide a proof of convergence for a wide class of temporal difference methods involving function approximators such as k-nearest-neighbor, and show experimentally that these methods can be useful. The proof is based on a view of function approximators as expansion or contraction mappings. In addition, we present a novel view of approximate value iteration: an approximate algorithm for one environment turns out to be an exact algorithm for a different environment.
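The idea of combining value iteration with an averaging approximator can be sketched as follows. This is a minimal toy example, not the paper's algorithm or experiments: the chain environment, the sample states, the reward, and every name in the code are assumptions made for illustration. Values are stored only at a few sample states, and the value of any other state is the mean over its `K` nearest samples. That averaging step does not expand max-norm differences, so the discounted backup still contracts.

```python
GAMMA = 0.9                        # discount factor (assumed)
N_STATES = 21                      # hypothetical chain of states 0..20
SAMPLES = [0, 4, 8, 12, 16, 20]    # states at which values are stored
K = 2                              # number of neighbors used by the averager

def step(state, action):
    """Deterministic move left (-1) or right (+1); reward 1 when the move lands in the last state."""
    nxt = min(max(state + action, 0), N_STATES - 1)
    reward = 1.0 if nxt == N_STATES - 1 else 0.0
    return nxt, reward

def knn_indices(state):
    """Indices of the K sample states closest to `state` (ties broken by index)."""
    order = sorted(range(len(SAMPLES)), key=lambda j: (abs(SAMPLES[j] - state), j))
    return order[:K]

def approx_value(values, state):
    """k-nearest-neighbor averager: mean of the stored values at the nearest samples."""
    idx = knn_indices(state)
    return sum(values[j] for j in idx) / len(idx)

def fitted_value_iteration(tol=1e-9, max_iters=1000):
    """Repeat the backup at the sample states until the largest change falls below tol."""
    values = [0.0] * len(SAMPLES)
    for _ in range(max_iters):
        new_values = []
        for s in SAMPLES:
            backups = []
            for a in (-1, +1):
                nxt, r = step(s, a)
                backups.append(r + GAMMA * approx_value(values, nxt))
            new_values.append(max(backups))
        delta = max(abs(n - o) for n, o in zip(new_values, values))
        values = new_values
        if delta < tol:
            return values, True
    return values, False

values, converged = fitted_value_iteration()
print(converged, [round(v, 3) for v in values])
```

With an approximator that can expand differences (for example, unconstrained linear regression), the same loop carries no such guarantee, which is the contrast the abstract draws.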
Structured Semidefinite Programs and Semialgebraic Geometry Methods in Robustness and Optimization
2000
Single Crossing Properties And The Existence Of Pure Strategy Equilibria In Games Of Incomplete Information
Econometrica, 1997
Cited by 122 (6 self)
This paper analyzes a class of games of incomplete information where each agent has
Learning and Design of Principal Curves
2000
Cited by 75 (5 self)
Principal curves have been defined as ``self-consistent'' smooth curves which pass through the ``middle'' of a $d$-dimensional probability distribution or data cloud. They give a summary of the data and also serve as an efficient feature extraction tool. We take a new approach by defining principal curves as continuous curves of a given length which minimize the expected squared distance between the curve and points of the space randomly chosen according to a given distribution. The new definition makes it possible to theoretically analyze principal curve learning from training data and it also leads to a new practical construction. Our theoretical learning scheme chooses a curve from a class of polygonal lines with $k$ segments and with a given total length, to minimize the average squared distance over $n$ training points drawn independently. Convergence properties of this learning scheme are analyzed and a practical version of this theoretical algorithm is implemented. In each iteration of the algorithm a new vertex is added to the polygonal line and the positions of the vertices are updated so that they minimize a penalized squared distance criterion. Simulation results demonstrate that the new algorithm compares favorably with previous methods both in terms of performance and computational complexity, and is more robust to varying data models.
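The criterion named in this abstract, the average squared distance from $n$ points to a polygonal line with $k$ segments, can be evaluated with a short sketch like the one below. It only computes the objective and the total length for a fixed set of vertices; it does not implement the vertex-insertion and optimization steps of the paper. All function names and the sample data are assumptions for illustration.

```python
import math

def sq_dist_to_segment(p, a, b):
    """Squared Euclidean distance from point p to the segment with endpoints a and b."""
    ab = [bi - ai for ai, bi in zip(a, b)]
    ap = [pi - ai for ai, pi in zip(a, p)]
    denom = sum(c * c for c in ab)
    if denom == 0.0:
        t = 0.0                                   # degenerate segment: a == b
    else:
        t = sum(u * v for u, v in zip(ap, ab)) / denom
        t = max(0.0, min(1.0, t))                 # clamp the projection onto the segment
    closest = [ai + t * c for ai, c in zip(a, ab)]
    return sum((pi - ci) ** 2 for pi, ci in zip(p, closest))

def avg_sq_dist_to_polyline(points, vertices):
    """Average over points of the squared distance to the nearest segment of the polyline."""
    total = 0.0
    for p in points:
        total += min(
            sq_dist_to_segment(p, vertices[i], vertices[i + 1])
            for i in range(len(vertices) - 1)
        )
    return total / len(points)

def polyline_length(vertices):
    """Total length of the polygonal line (the quantity constrained in the abstract)."""
    return sum(
        math.sqrt(sum((bi - ai) ** 2 for ai, bi in zip(a, b)))
        for a, b in zip(vertices, vertices[1:])
    )

vertices = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0)]   # k = 2 segments (made-up data)
points = [(0.5, 1.0), (0.5, 0.0)]                 # n = 2 training points (made-up data)
objective = avg_sq_dist_to_polyline(points, vertices)
length = polyline_length(vertices)
print(objective, length)
```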
Constructive Algorithms for Structure Learning in Feedforward Neural Networks for Regression Problems
IEEE Transactions on Neural Networks, 1997
Cited by 66 (2 self)
In this survey paper, we review the constructive algorithms for structure learning in feedforward neural networks for regression problems. The basic idea is to start with a small network, then add hidden units and weights incrementally until a satisfactory solution is found. By formulating the whole problem as a state space search, we first describe the general issues in constructive algorithms, with special emphasis on the search strategy. A taxonomy, based on the differences in the state transition mapping, the training algorithm and the network architecture, is then presented.
Keywords: constructive algorithm, structure learning, state space search, dynamic node creation, projection pursuit regression, cascade-correlation, resource-allocating network, group method of data handling.
I. Introduction. A. Problems with Fixed Size Networks. In recent years, many neural network models have been proposed for pattern classification, function approximation and regression problems. Among...
On a Reduced Load Equivalence for Fluid Queues Under Subexponentiality
1998
Cited by 29 (0 self)
We propose a general framework for obtaining asymptotic distributional bounds on the stationary backlog $W^{A_1+A_2;c}$ in a buffer fed by a combined fluid process $A_1 + A_2$ and drained at a constant rate $c$. The fluid process $A_1$ is an (independent) on-off source with average and peak rates $\rho_1$ and $r_1$, respectively, and with distribution $G$ for the activity periods. The fluid process $A_2$ of average rate $\rho_2$ is arbitrary but independent of $A_1$. These bounds are used to identify subexponential distributions $G$ and fairly general fluid processes $A_2$ such that the asymptotic equivalence $P[W^{A_1+A_2;c} > x] \sim P[W^{A_1;c-\rho_2} > x]$ $(x \to \infty)$ holds under the stability condition $\rho_1 + \rho_2 < c$ and under the nontriviality condition $c - \rho_2 < r_1$. The stationary backlog $W^{A_1;c-\rho_2}$ in these asymptotics results from feeding source $A_1$ into a buffer drained at reduced rate $c - \rho_2$. This reduced load asymptotic equivalence extends to a larger class o...
Constructive Feedforward Neural Networks for Regression Problems: A Survey
1995
Cited by 21 (0 self)
In this paper, we review the procedures for constructing feedforward neural networks in regression problems. While standard backpropagation performs gradient descent only in the weight space of a network with fixed topology, constructive procedures start with a small network and then grow additional hidden units and weights until a satisfactory solution is found. The constructive procedures are categorized according to the resultant network architecture and the learning algorithm for the network weights.
The Hong Kong University of Science & Technology, Technical Report Series, Department of Computer Science.
1 Introduction. In recent years, many neural network models have been proposed for pattern classification, function approximation and regression problems. Among them, the class of multilayer feedforward networks is perhaps the most popular. Standard backpropagation performs gradient descent only in the weight space of a network with fixed topology; this approach is analogous to ...
Consensus Computation in Unreliable Networks: A System Theoretic Approach
2011
Cited by 21 (7 self)
This work considers the problem of reaching consensus in an unreliable linear consensus network. A solution to this problem is relevant for several tasks in multiagent systems including motion coordination, clock synchronization, and cooperative estimation. By modeling the unreliable nodes as unknown and unmeasurable inputs affecting the network, we recast the problem into an unknown-input system theoretic framework. Only relying on their direct measurements, the agents detect and identify the misbehaving agents using fault detection and isolation techniques. We consider both the case that misbehaviors are simply caused by faults, and the case that they are the product of a definite, malignant “Byzantine” strategy. We express the solvability conditions of the two cases in a system theoretic framework, and from a graph theoretic perspective. We show that generically any node can correctly detect and identify the misbehaving agents, provided that the connectivity of the network is sufficiently high. Precisely, for a linear consensus network to be generically resilient to k concurrent faults, the connectivity of the communication graph needs to be 2k + 1, if Byzantine agents are allowed, and k + 1, if noncolluding agents are considered. We finally provide algorithms for detecting and isolating misbehaving agents. The first procedure applies standard fault detection techniques, and affords complete intrusion detection if global knowledge of the graph is available to each agent, at a high computational cost. The second method is designed to exploit the presence in a network of weakly interconnected subparts, and provides computationally efficient detection of misbehaving agents whose behavior deviates more than a threshold, which is quantified in terms of the interconnection structure.
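The connectivity condition stated in this abstract can be turned around to ask how many concurrent faults a given graph connectivity supports. The sketch below does only that arithmetic. It reads "needs to be 2k + 1" and "k + 1" as lower bounds on the connectivity, which is an interpretation of the abstract rather than a statement from the paper, and the function name is hypothetical.

```python
def max_tolerable_faults(connectivity, byzantine):
    """Largest k satisfying connectivity >= 2k + 1 (Byzantine) or connectivity >= k + 1 (noncolluding).

    Assumes the abstract's conditions are lower bounds on the graph connectivity.
    """
    if connectivity < 1:
        raise ValueError("connectivity must be a positive integer")
    if byzantine:
        return (connectivity - 1) // 2
    return connectivity - 1

for kappa in (1, 3, 5):
    print(kappa, max_tolerable_faults(kappa, True), max_tolerable_faults(kappa, False))
```

For example, under this reading a communication graph with connectivity 5 supports up to 2 Byzantine agents or up to 4 noncolluding faulty agents.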