Choosing multiple parameters for support vector machines
Machine Learning, 2002
Cited by 303 (15 self)
The problem of automatically tuning multiple parameters for pattern recognition Support Vector Machines (SVMs) is considered. This is done by minimizing some estimates of the generalization error of SVMs using a gradient descent algorithm over the set of parameters. Usual methods for choosing parameters, based on exhaustive search, become intractable as soon as the number of parameters exceeds two. Some experimental results assess the feasibility of our approach for a large number of parameters (more than 100) and demonstrate an improvement of generalization performance.
Consistency of spectral clustering
2004
Cited by 302 (15 self)
Consistency is a key property of statistical algorithms when the data is drawn from some underlying probability distribution. Surprisingly, despite decades of work, little is known about consistency of most clustering algorithms. In this paper we investigate consistency of a popular family of spectral clustering algorithms, which cluster the data with the help of eigenvectors of graph Laplacian matrices. We show that one of the two major classes of spectral clustering (normalized clustering) converges under some very general conditions, while the other (unnormalized) is only consistent under strong additional assumptions, which, as we demonstrate, are not always satisfied in real data. We conclude that our analysis provides strong evidence for the superiority of normalized spectral clustering in practical applications. We believe that methods used in our analysis will provide a basis for future exploration of Laplacian-based methods in a statistical setting.
MULTIVARIATE GARCH MODELS: A SURVEY
Cited by 116 (7 self)
This paper surveys the most important developments in multivariate ARCH-type modelling. It reviews the model specifications and inference methods, and identifies likely directions of future research.
Antenna Selection for Spatial Multiplexing Systems Based on Minimum Error Rate
Cited by 113 (11 self)
Future cellular systems will employ spatial multiplexing with multiple antennas at both transmitter and receiver to take advantage of large capacity gains. In such systems it will be desirable to select a subset of available transmit antennas for link initialization, link maintenance, or handoff. In this paper we present a criterion for selecting the optimal antenna subset in terms of minimum error rate, when coherent receivers, either linear or maximum likelihood (ML), are used over a slowly varying channel. For the ML receiver we propose to pick the subset whose output constellation has the largest minimum Euclidean distance. For the linear receiver we propose use of the post-processing SNRs (signal-to-noise ratios) of the multiplexed streams, whereby the antenna subset that induces the largest minimum SNR is chosen. Simulations demonstrate that our selection algorithms also provide diversity advantage, thus making subset selection useful over fading channels.
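The ML-receiver rule in this abstract (keep the transmit-antenna subset whose noise-free receive constellation has the largest minimum Euclidean distance) can be illustrated with a brute-force search. This is only a minimal sketch under stated assumptions, not the paper's algorithm: the function names `min_distance` and `select_antennas`, the toy channel `H_example`, and the BPSK alphabet are all hypothetical, and the exhaustive enumeration is practical only for very small systems.

```python
from itertools import combinations, product


def min_distance(H, cols, symbols):
    """Smallest Euclidean distance between distinct noise-free receive points."""
    # All symbol vectors that can be sent over the chosen transmit antennas.
    vectors = list(product(symbols, repeat=len(cols)))
    best = float("inf")
    for a, b in combinations(vectors, 2):
        diff = [x - y for x, y in zip(a, b)]
        # Squared norm of H_S (a - b), accumulated over receive antennas (rows).
        d2 = sum(abs(sum(row[c] * d for c, d in zip(cols, diff))) ** 2 for row in H)
        best = min(best, d2 ** 0.5)
    return best


def select_antennas(H, k, symbols):
    """Return the k-column subset of H with the largest minimum distance."""
    n_tx = len(H[0])
    return max(combinations(range(n_tx), k),
               key=lambda cols: min_distance(H, cols, symbols))


# Toy 2x3 real channel with BPSK symbols; the values are illustrative only.
H_example = [[1.0, 0.0, 0.1],
             [0.0, 1.0, 0.1]]
best_subset = select_antennas(H_example, 2, (1, -1))
```

Here `H` is a list of rows (one per receive antenna) and the returned tuple holds the indices of the selected transmit antennas. Complex channel gains and larger alphabets work unchanged, since `abs` handles complex values.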
Generalized linear precoder and decoder design for MIMO channels using the weighted MMSE criterion
IEEE Transactions on Communications, 2001
Recovery of exact sparse representations in the presence of bounded noise
IEEE Trans. on I.T., 2005
Cited by 81 (5 self)
The purpose of this contribution is to extend some recent results on sparse representations of signals in redundant bases developed in the noise-free case to the case of noisy observations. The type of questions addressed so far is: given a (n,m)-matrix with and a vector, find a sufficient condition for to have a unique sparsest representation as a linear combination of the columns of. The answer is a bound on the number of nonzero entries of say, that guarantees that is the unique and sparsest solution of with. We consider the case where satisfies the sparsity conditions requested in the noise-free case and seek conditions on, a vector of additive noise or modeling errors, under which can be recovered from in a sense to be defined.
Automatic Person Verification Using Speech and Face Information
2003
Cited by 30 (7 self)
Identity verification systems are an important part of our everyday life. A typical example is the Automatic Teller Machine (ATM), which employs a simple identity verification scheme: the user is asked to enter their secret password after inserting their ATM card; if the password matches the one prescribed to the card, the user is allowed access to their bank account. This scheme suffers from a major drawback: only the validity of the combination of a certain possession (the ATM card) and certain knowledge (the password) is verified. The ATM card can be lost or stolen, and the password can be compromised. Thus new verification methods have emerged, where the password has either been replaced by, or used in addition to, biometrics such as the person's speech, face image or fingerprints. Apart from the ATM example described above, biometrics can be applied to other areas, such as telephone & internet-based banking, airline reservations & check-in, as well as forensic work and law enforcement applications. Biometric systems
Complexity of Semi-Algebraic Proofs
2001
Cited by 24 (2 self)
A known approach is to translate propositional formulas into systems of polynomial inequalities and to consider proof systems for the latter. The well-studied proof systems of this kind are the Cutting Planes proof system (CP), utilizing linear inequalities, and the Lovász-Schrijver calculi (LS), utilizing quadratic inequalities. We introduce generalizations LS^d of LS that operate with polynomial inequalities of degree at most d. It turns out
A Fast Algorithm for Joint Diagonalization with Nonorthogonal Transformations and its Application to Blind Source Separation
Journal of Machine Learning Research, 2004
Cited by 22 (3 self)
A new efficient algorithm is presented for joint diagonalization of several matrices. The algorithm is based on the Frobenius-norm formulation of the joint diagonalization problem, and addresses diagonalization with a general, non-orthogonal transformation. The iterative scheme of the algorithm is based on a multiplicative update which ensures the invertibility of the diagonalizer. The algorithm's efficiency stems from the special approximation of the cost function resulting in a sparse, block-diagonal Hessian to be used in the computation of the quasi-Newton update step. Extensive numerical simulations illustrate the performance of the algorithm and provide a comparison to other leading diagonalization methods. The results of such comparison demonstrate that the proposed algorithm is a viable alternative to existing state-of-the-art joint diagonalization algorithms.
Simplifying mixture models through function approximation
IEEE Transactions on Neural Networks, 2010
Cited by 14 (2 self)
The finite mixture model is widely used in various statistical learning problems. However, the model obtained may contain a large number of components, making it inefficient in practical applications. In this paper, we propose to simplify the mixture model by first grouping similar components together and then performing local fitting through function approximation. By using the squared loss to measure the distance between mixture models, our algorithm naturally combines the two different tasks of component clustering and model simplification. The proposed method can be used to speed up various algorithms that use mixture models during training (e.g., Bayesian filtering, belief propagation) or testing (e.g., kernel density estimation, SVM testing). Encouraging results are observed in the experiments on density estimation, clustering-based image segmentation and simplification of SVM decision functions.