Results 1 - 10 of 15
Online Bayes Point Machines
Cited by 55 (2 self)
We present a new and simple algorithm for learning large margin classifiers that works in a truly online manner. The algorithm generates a linear classifier by averaging the weights associated with several perceptron-like algorithms run in parallel in order to approximate the Bayes point. A random subsample of the incoming data stream is used to ensure diversity in the perceptron solutions. We experimentally study the algorithm's performance in online and batch learning settings.
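The averaging idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the Bernoulli subsampling with probability `keep_prob`, and the plain perceptron update are all assumptions made for illustration.

```python
import random


def online_bpm(stream, dim, n_perceptrons=10, keep_prob=0.5, seed=0):
    """Average several perceptrons, each fed a random subsample of the stream.

    Hypothetical sketch: `keep_prob` and the update rule are illustrative
    assumptions, not details taken from the paper.
    """
    rng = random.Random(seed)
    weights = [[0.0] * dim for _ in range(n_perceptrons)]
    for x, y in stream:  # y is expected to be -1 or +1
        for w in weights:
            # Each perceptron sees the example only with probability keep_prob.
            if rng.random() >= keep_prob:
                continue
            margin = y * sum(wi * xi for wi, xi in zip(w, x))
            if margin <= 0:  # mistake (or zero margin): standard perceptron step
                for i in range(dim):
                    w[i] += y * x[i]
    # The averaged weight vector serves as the approximate Bayes point.
    return [sum(w[i] for w in weights) / n_perceptrons for i in range(dim)]


def predict(w, x):
    """Classify x with the sign of the linear score."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) >= 0 else -1
```

Because each example is processed once as it arrives, the sketch stays online; the subsampling is what makes the individual perceptrons differ before they are averaged.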
Supervised Neural Gas with General Similarity Measure
Neural Processing Letters, 2003
Cited by 24 (20 self)
Prototype based classification offers intuitive and sparse models with excellent generalization ability. However, these models usually crucially depend on the underlying Euclidean metric; moreover, online variants likely suffer from the problem of local optima. We here propose a generalization of learning vector quantization with three additional features: (I) it directly integrates neighborhood cooperation, hence is less affected by local optima; (II) the method can be combined with any differentiable similarity measure whereby metric parameters such as relevance factors of the input dimensions can automatically be adapted according to the given data; (III) it obeys a gradient dynamics hence shows very robust behavior, and the chosen objective is related to margin optimization.
Prototype Based Recognition of Splice Sites
Bioinformatics using Computational Intelligence Paradigms
Cited by 7 (5 self)
Introduction Rapid advances in biotechnology have made massive amounts of biological data available so that automated analyzing tools constitute a prerequisite to cope with huge and complex biological sequence data. Machine learning tools are used for widespread applications ranging from the identification of characteristic functional sites in genomic DNA [39], the prediction of protein secondary structure and higher structures [53], to the classification of the functionality of chemical compounds [5]. Here we will deal with a subproblem in de novo gene finding in DNA sequences of a given species, the problem of splice site recognition. For higher eukaryotic mechanisms gene finding requires the identification of the start and stop codons and the recognition of all introns, i.e. non-coding regions which are spliced out before transcription, that means all donor and acceptor sites of the sequence. The biological splicing process is only partially understood [64]. Fig. 1 depicts a sc
Neural Methods for Non-Standard Data
Proceedings of the 12th European Symposium on Artificial Neural Networks (ESANN 2004), d-side pub, 2004
Cited by 6 (3 self)
Standard pattern recognition provides effective and noise-tolerant tools for machine learning tasks; however, most approaches only deal with real vectors of a finite and fixed dimensionality. In this tutorial paper, we give an overview of extensions of pattern recognition towards non-standard data which are not contained in a finite dimensional space, such as strings, sequences, trees, graphs, or functions. Two major directions can be distinguished in the neural networks literature: models can be based on a similarity measure adapted to non-standard data, including kernel methods for structures as a very prominent approach, but also alternative metric based algorithms and functional networks; alternatively, non-standard data can be processed recursively within supervised and unsupervised recurrent and recursive networks and fully recurrent systems.
Clustering with the Fisher Score
Advances in Neural Information Processing Systems 15, 2003
Cited by 4 (1 self)
The Fisher score (or the Fisher kernel) is increasingly used as a feature extractor for classification problems. The Fisher score is a vector of parameter derivatives of the log-likelihood of a probabilistic model. This paper gives a theoretical analysis of how class information is preserved in the space of the Fisher score; it turns out that the Fisher score consists of a few important dimensions with class information and many nuisance dimensions. When we perform clustering with the Fisher score, K-Means type methods are obviously inappropriate because they make use of all dimensions. So we will develop a novel but simple clustering algorithm specialized for the Fisher score, which can exploit important dimensions. This algorithm is successfully tested in experiments with artificial data and real data (amino acid sequences).
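As a small illustration of the definition above (the Fisher score as the vector of parameter derivatives of the log-likelihood), the following sketch computes it for a univariate Gaussian. The choice of model and the function name are assumptions for illustration only; the paper itself is not restricted to this model.

```python
def fisher_score_gaussian(x, mu, sigma):
    """Gradient of log N(x | mu, sigma^2) with respect to (mu, sigma).

    Hypothetical example model: the univariate Gaussian is assumed here
    only to make the definition concrete.
    """
    # log p = -log(sigma) - (x - mu)^2 / (2 sigma^2) + const
    d_mu = (x - mu) / sigma ** 2
    d_sigma = ((x - mu) ** 2 - sigma ** 2) / sigma ** 3
    return (d_mu, d_sigma)


# Each data point is mapped to a feature vector with one entry per parameter.
features = [fisher_score_gaussian(x, mu=0.0, sigma=1.0) for x in (0.0, 2.0)]
```

With richer models the score has one dimension per parameter, which is where the distinction between informative and nuisance dimensions discussed in the abstract arises.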
Prototype based Machine Learning for Clinical Proteomics
2006
Cited by 3 (2 self)
Clinical proteomics opens the way towards new insights into many diseases on a level of detail not available before. One of the most promising measurement techniques supporting this approach is mass spectrometry based clinical proteomics. The analysis of the high dimensional data obtained from mass spectrometry asks for sophisticated, problem adequate preprocessing and data analysis approaches. Ideally, automatic analysis tools provide insight into their behavior and the ability to extract further information, relevant for an understanding of the clinical data or applications such as biomarker discovery. Prototype based algorithms constitute efficient, intuitive and powerful machine learning methods which are very well suited to deal with high dimensional data and which allow good insight into their behavior by means of prototypical data locations. They have already successfully been applied to various problems in bioinformatics. The goal of this thesis is to extend prototype based methods, in such a way that they become suitable machine learning tools for typical problems in clinical proteomics. To achieve better adapted classification borders, tailored to the specific data distributions
Asymptotic Properties of the Fisher Kernel
Neural Computation, 2003
Cited by 3 (0 self)
This paper analyses the Fisher kernel (FK) from a statistical point of view. The FK is a particularly interesting method for constructing a model of the posterior probability that makes intelligent use of unlabeled data, i.e. of the underlying data density. It is important to analyse and ultimately understand the statistical properties of the FK. To this end, we first establish sufficient conditions that the constructed posterior model is realizable, i.e. that it contains the true distribution.
Self-Organizing Maps for Time Series
2005
Cited by 2 (0 self)
We review a recent extension of the self-organizing map (SOM) for temporal structures with a simple recurrent dynamics leading to sparse representations, which allows an efficient training and a combination with arbitrary lattice structures. We discuss its practical applicability and its theoretical properties. Afterwards, we put the approach into a general framework of recurrent unsupervised models. This generic formulation also covers a variety of well-known alternative approaches including the temporal Kohonen map, the recursive SOM, and SOM for structured data. Based on this formulation, mathematical properties of the models are investigated. Interestingly, the dynamic can be generalized from sequences to more general tree structures thus opening the way to unsupervised processing of general data structures.
Support Vector Machine Approach for Cancer Detection Using Amplified Fragment Length Polymorphism (AFLP) Screen Method
2004
Cited by 2 (0 self)
A Support Vector Machine (SVM) is used to classify data obtained from Amplified Fragment Length Polymorphism (AFLP) screening of gastric cancer and normal tissue samples. Using the electrophoresis peak intensity measurements of the amplified fragments of the cancer and normal tissues, the SVM was able to distinguish gastric cancer from normal tissue samples with a sensitivity of 0.98 and specificity of 0.75. As AFLP is a low cost procedure which requires minimal prior sequence knowledge and biological material, SVM prediction of AFLP screening data is a potential tool for gastric cancer screening and diagnosis.
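For readers unfamiliar with the two metrics quoted above, the sketch below shows how sensitivity and specificity are computed from confusion-matrix counts. The counts used in the example are hypothetical and are not the study's data.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts.

    sensitivity = TP / (TP + FN): fraction of cancer samples detected.
    specificity = TN / (TN + FP): fraction of normal samples cleared.
    """
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical counts for illustration only (not taken from the paper).
sens, spec = sensitivity_specificity(tp=8, fn=2, tn=9, fp=1)
```

A classifier with high sensitivity but lower specificity, as reported in the abstract, misses few cancer samples at the cost of flagging more normal samples.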

