Results 1-10 of 28
Biological sequence analysis: probabilistic models of proteins and nucleic acids. Cambridge Univ
, 1998
Comparison of support vector machine and artificial neural network systems for drug/nondrug classification
 J. Chem. Inf. Comput. Sci. 2003
Cited by 32 (1 self)
Support vector machine (SVM) and artificial neural network (ANN) systems were applied to a drug/nondrug classification problem as an example of binary decision problems in early-phase virtual compound filtering and screening. The results indicate that solutions obtained by SVM training seem to be more robust, with a smaller standard error, than those obtained by ANN training. Generally, the SVM classifier yielded slightly higher prediction accuracy than ANN, irrespective of the type of descriptors used for molecule encoding, the size of the training data sets, and the algorithm employed for neural network training. The performance was compared using various descriptor sets and descriptor combinations based on the 120 standard Ghose-Crippen fragment descriptors, a wide range of 180 different properties and physicochemical descriptors from the Molecular Operating Environment (MOE) package, and 225 topological pharmacophore (CATS) descriptors. For the complete set of 525 descriptors, cross-validated classification by SVM yielded 82% correct predictions (Matthews cc = 0.63), whereas ANN reached 80% correct predictions (Matthews cc = 0.58). Although SVM outperformed the ANN classifiers with regard to overall prediction accuracy, both methods were shown to complement each other, as the sets of true positives, false positives (overprediction), true negatives, and false negatives (underprediction) produced by the two classifiers were not identical. The theory of SVM and ANN training is briefly reviewed.
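The two figures quoted in this abstract, percent correct predictions and the Matthews correlation coefficient, are both computed from the four confusion-matrix counts the abstract names. The following is a minimal sketch of the standard definitions, not code from the paper; the function names and the example counts are hypothetical and chosen only for illustration.

```python
import math


def accuracy(tp, fp, tn, fn):
    """Fraction of correct predictions among all predictions."""
    return (tp + tn) / (tp + fp + tn + fn)


def matthews_cc(tp, fp, tn, fn):
    """Matthews correlation coefficient from confusion-matrix counts.

    Returns 0.0 when any marginal sum is zero (a common convention,
    assumed here), since the coefficient is undefined in that case.
    """
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return numerator / denominator if denominator else 0.0


if __name__ == "__main__":
    # Hypothetical counts for illustration; these are not the paper's data.
    tp, fp, tn, fn = 40, 10, 40, 10
    print("accuracy:", accuracy(tp, fp, tn, fn))
    print("Matthews cc:", matthews_cc(tp, fp, tn, fn))
```

Unlike accuracy, the Matthews coefficient ranges from -1 to 1 and stays near 0 for a classifier that ignores its input, which is presumably why the abstract reports it alongside the percentage.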
Local Regularization Assisted Orthogonal Least Squares Regression
 IEEE Transactions on Neural Networks, submitted
, 2001
Cited by 23 (5 self)
A locally regularized orthogonal least squares (LROLS) algorithm is proposed for constructing parsimonious or sparse regression models that generalize well. By associating each orthogonal weight in the regression model with an individual regularization parameter, the ability for the orthogonal least squares (OLS) model selection to produce a very sparse model with good generalization performance is greatly enhanced. Furthermore, with the assistance of local regularization, when to terminate the model selection procedure becomes much clearer. This LROLS algorithm has computational advantages over the recently introduced relevance vector machine (RVM) method. Keywords: orthogonal least squares algorithm, regularization, regression, support vector machines, relevance vector machines.
Dynamic Models for Nonstationary Signal Segmentation
, 1998
Cited by 17 (4 self)
This paper investigates Hidden Markov Models (HMMs) in which the observations are generated from an autoregressive (AR) model. The overall model performs nonstationary spectral analysis and automatically segments a time series into discrete dynamic regimes. Because learning in HMMs is sensitive to initial conditions, we initialise the HMM with parameters derived from a cluster analysis of Kalman filter coefficients. An important aspect of the Kalman filter implementation is that the state noise is estimated online. This allows for an initial estimation of AR parameters for each of the different dynamic regimes. These estimates are then fine-tuned with the HMM. The method is demonstrated on a number of synthetic problems and on electroencephalogram (EEG) data. 1 Introduction. Autoregressive (AR) models are a well-known parametric technique for the spectral estimation of stationary signals [1]. The standard AR model can also be used for the spectral estimation of nonstationa...
On Deformable Models for Visual Pattern Recognition
, 2002
Cited by 8 (2 self)
This paper reviews model-based methods for nonrigid shape recognition. These methods model, match and classify nonrigid shapes, which are generally problematic for conventional algorithms using rigid models.
Gossip-based computation of a Gaussian mixture model for distributed multimedia indexing
 IEEE Transactions on Multimedia
, 2008
Cited by 6 (3 self)
The present paper deals with pattern recognition in a distributed computing context of the peer-to-peer type, which should be of growing interest for multimedia data indexing and retrieval. Our goal is to estimate class-conditional probability densities that take the form of Gaussian mixture models (GMMs). The original contribution is that we propagate GMMs in a decentralized fashion (gossip) through a network and aggregate GMMs from various sources, using a technique that involves little computation and makes parsimonious use of network resources, since model parameters rather than data are transmitted. The aggregation is based on iterative optimization of an approximation of a KL divergence that allows closed-form computation between mixture models. Experimental results demonstrate the scheme on the case of speaker recognition.
Neural Network Predictions With Error Bars
 Department of Electrical and Electronic Engineering
, 1997
Cited by 4 (0 self)
When a neural network makes a prediction it will have an error that can be decomposed into the following six sources: (1a) model bias from data, (1b) model bias from training, (2a) model variance from data, (2b) model variance from training, (3) target noise, (4) input noise. We discuss methods for estimating each of these components of error in linear and nonlinear neural networks. In the nonlinear case, the prediction errors are estimated by using a combination of methods: the 'delta method' to capture model variance from data, a committee of networks to capture the model variance from training, and an auxiliary network to capture the target noise. This is motivated by considering each neural net prediction to be a 'mixture' of predictions from different networks where the output of each network is modelled as a Gaussian. 1 Introduction. In this paper we investigate methods for estimating the errors associated with predictions from neural networks. Error estimates are useful as they a...
Classifiers Fusion With Data Dependent Aggregation Schemes
, 2001
Cited by 4 (2 self)
In this paper we studied two different classifier fusion algorithms, one exploiting combination weights expressed over the entire data space and the other using data-dependent weights. The following aggregation schemes are employed in the study: the majority vote, the averaging, the combination via Choquet integral with the Lambda-fuzzy measure, the combination via space partitioning and classifier selection approach, and the combination via Choquet integral with the data-dependent Lambda-fuzzy measure.
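The two simplest aggregation schemes listed in this abstract, the majority vote and the averaging, can be sketched as follows. This is an illustrative sketch of the generic techniques only, not the paper's implementation: the function names and example inputs are hypothetical, and the Choquet-integral and space-partitioning schemes are not covered.

```python
from collections import Counter


def majority_vote(labels):
    """Return the class label predicted by the most classifiers.

    `labels` holds one predicted label per classifier. Ties are resolved
    in favour of the label that appears first in `labels`.
    """
    return Counter(labels).most_common(1)[0][0]


def average_fusion(prob_rows):
    """Average per-class scores across classifiers.

    `prob_rows` holds one list of class scores per classifier, all of the
    same length. Returns (winning class index, averaged scores).
    """
    n_classifiers = len(prob_rows)
    n_classes = len(prob_rows[0])
    mean = [
        sum(row[c] for row in prob_rows) / n_classifiers
        for c in range(n_classes)
    ]
    best = max(range(n_classes), key=mean.__getitem__)
    return best, mean


if __name__ == "__main__":
    # Hypothetical outputs of three classifiers on a single sample.
    print(majority_vote([1, 0, 1]))
    print(average_fusion([[0.6, 0.4], [0.2, 0.8], [0.3, 0.7]]))
```

Both schemes apply the same rule everywhere in the data space; the data-dependent variants described in the abstract would instead let the combination weights vary with the input.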
Improved Long-Term Temperature Prediction for a Furnace Installation Using Neural Networks
Cited by 4 (0 self)
When artificial neural networks (ANNs) are trained to predict signals p steps ahead, the quality of the prediction typically decreases for large values of p. In this paper, we compare two methods for prediction with ANNs: the classical recursion of one-step-ahead predictors and a new kind of chain structure. When applying this technique to the prediction of an industrial dataset, we conclude that the latter approach leads to an improved prediction of the temperature, since the chained networks gradually take the prediction of their predecessor in the chain as an extra input. Keywords: long-term prediction, neural networks, industrial prediction, Bayesian regularization. I. Introduction. In this paper, we want to compare two methods for long-term prediction of a temperature signal using neural networks [1, 4]. For an industrial problem under study, one sometimes notices that when a p-step-ahead predicting neural network is trained to predict signals p steps ahead in one stage, the qual...
Collaborative filtering with interlaced generalized linear models
, 2008
Cited by 4 (0 self)
Collaborative filtering (CF) is a data analysis task appearing in many challenging applications, in particular data mining in Internet and e-commerce. CF can often be formulated as identifying patterns in a large and mostly empty rating matrix. In this paper, we focus on predicting unobserved ratings. This task is often a part of a recommendation procedure. We propose a new CF approach called interlaced generalized linear models (GLM); it is based on a factorization of the rating matrix and uses probabilistic modeling to represent uncertainty in the ratings. The advantage of this approach is that different configurations, encoding different intuitions about the rating process, can easily be tested while keeping the same learning procedure. The GLM formulation is the keystone to derive an efficient learning procedure, applicable to large datasets. We illustrate the technique on three public domain datasets. © 2008 Elsevier B.V. All rights reserved.