Results 11–20 of 58
Evolutionary multiobjective optimization for generating an ensemble of fuzzy rule-based classifiers
 in Genetic and Evolutionary Computation (GECCO)
Abstract

Cited by 7 (3 self)
Abstract. One advantage of evolutionary multiobjective optimization (EMO) algorithms over classical approaches is that many nondominated solutions can be obtained simultaneously in a single run. In this paper, we propose an idea of using EMO algorithms for constructing an ensemble of fuzzy rule-based classifiers with high diversity. The classification of new patterns is performed based on the vote of multiple classifiers generated by a single run of an EMO algorithm. Even when the classification performance of individual classifiers is not high, their ensemble often works well. The point is to generate multiple classifiers with high diversity. We demonstrate the ability of EMO algorithms to generate various nondominated fuzzy rule-based classifiers with high diversity in a single run. Through computational experiments on some well-known benchmark data sets, it is shown that the vote of the generated fuzzy rule-based classifiers leads to high classification performance on test patterns.
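The voting scheme this abstract relies on can be sketched generically. The threshold rules below are hypothetical stand-ins for the paper's EMO-generated fuzzy rule-based classifiers; only the majority-vote combination is the point:

```python
from collections import Counter

# Hypothetical weak classifiers: each thresholds one feature of a 2-D point.
# These stand in for the diverse fuzzy rule-based members; the ensemble
# decision is the plain majority vote the abstract describes.
def make_classifier(feature, threshold):
    return lambda x: 1 if x[feature] > threshold else 0

ensemble = [make_classifier(0, 0.4), make_classifier(0, 0.6),
            make_classifier(1, 0.5)]

def vote(ensemble, x):
    """Classify x by the majority vote of all ensemble members."""
    ballots = Counter(clf(x) for clf in ensemble)
    return ballots.most_common(1)[0][0]

print(vote(ensemble, (0.7, 0.2)))  # two of the three rules predict class 1
```

Even if each single rule is weak, diverse members tend to make uncorrelated errors, which is what the vote exploits.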
Decision-tree instance-space decomposition with grouped gain-ratio
 Information Sciences 177
, 2007
Abstract

Cited by 4 (1 self)
This paper examines a decision-tree framework for instance-space decomposition. According to the framework, the original instance space is hierarchically partitioned into multiple subspaces and a distinct classifier is assigned to each subspace. Subsequently, an unlabeled, previously unseen instance is classified by employing the classifier that was assigned to the subspace to which the instance belongs. After describing the framework, the paper suggests a novel splitting rule for the framework and presents an experimental study that was conducted to compare various implementations of the framework. The study indicates that, using the novel splitting rule, previously presented implementations of the framework can be improved in terms of accuracy and computation time.
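The partition-then-route idea can be shown in a minimal sketch. The single split value and the per-subspace classifiers are illustrative assumptions, not the paper's grouped gain-ratio rule:

```python
# Minimal sketch of instance-space decomposition: the space is partitioned
# (here by one hypothetical split on feature 0), a distinct classifier is
# assigned to each subspace, and an unseen instance is routed to the
# classifier owning its subspace.
SPLIT = 0.5  # illustrative split value, not from the paper

def classifier_low(x):   # stands in for a model trained on x[0] <= SPLIT
    return 0

def classifier_high(x):  # stands in for a model trained on x[0] > SPLIT
    return 1

def classify(x):
    """Route the instance to its subspace's classifier, then predict."""
    return classifier_low(x) if x[0] <= SPLIT else classifier_high(x)

print(classify((0.2, 0.9)))  # falls in the low subspace
```

In the full framework the partition is hierarchical (a tree of such splits), but each leaf still owns exactly one classifier.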
Generating Classifier Committees by Stochastically Selecting both Attributes and Training Examples
 Proceedings of the 5th Pacific Rim International Conference on Artificial Intelligence (PRICAI’98)
, 1998
Abstract

Cited by 3 (0 self)
Boosting and Bagging, as two representative approaches to learning classifier committees, have demonstrated great success, especially for decision tree learning. They repeatedly build different classifiers using a base learning algorithm by changing the distribution of the training set. Sasc, a different type of committee learning method, can also significantly reduce the error rate of decision trees. It generates classifier committees by stochastically modifying the set of attributes while keeping the distribution of the training set unchanged. It has been shown that Bagging and Sasc are, on average, less accurate than Boosting, but the performance of the former is more stable than that of the latter, in the sense that they less frequently obtain significantly higher error rates than the base learning algorithm. In this paper, we propose a novel committee learning algorithm, called SascBag, that combines Sasc and Bagging. It creates different classifiers by stochastically varying both the a...
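The combination the abstract names, varying both the training sample and the attribute set per committee member, can be sketched as follows; the per-attribute inclusion probability is an assumed illustrative value, not one from the paper:

```python
import random

random.seed(1)

def sascbag_training_sets(data, n_attrs, committee_size, p_keep=0.7):
    """Sketch of SascBag-style committee construction: each member gets a
    bootstrap sample of the examples (the Bagging part) AND a random
    subset of the attributes (the Sasc part). p_keep is an assumed
    per-attribute inclusion probability."""
    members = []
    for _ in range(committee_size):
        sample = [random.choice(data) for _ in data]        # bootstrap
        attrs = [a for a in range(n_attrs)
                 if random.random() < p_keep] or [0]        # attribute subset
        members.append((sample, attrs))
    return members

data = [((0.1, 0.9, 0.3), 0), ((0.8, 0.2, 0.7), 1)]
members = sascbag_training_sets(data, n_attrs=3, committee_size=5)
print(len(members))  # one (sample, attribute-subset) pair per member
```

Each pair would then be handed to the base learner; the two independent sources of randomness are what drive the committee's diversity.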
Detection and Prognostics on Low Dimensional Systems
Abstract

Cited by 2 (0 self)
This paper describes new algorithms for prognostics and anomaly detection on systems that can be described by low-dimensional, potentially nonlinear dynamics. The methodology relies on estimating the conditional probability distribution of the output of the system at a future time given knowledge of the current state of the system. These conditional probabilities are estimated using a variety of techniques, including bagged neural networks and kernel methods such as Gaussian Process Regression (GPR), and the results are compared against standard methods such as linear autoregressive models and the nearest neighbor algorithm. We demonstrate the algorithms on a real-world data set and a simulated data set. The real-world data set consists of the intensity of an NH3 laser. The laser data set has been shown by other authors to exhibit low-dimensional chaos with significant drops in intensity. The simulated data set is generated from the Lorenz attractor and has completely known statistical characteristics. On these data sets, we show the evolution of the estimated conditional probability distribution, the way it can act as a prognostic signal, and its use as an early warning system. We also review a novel approach to performing Gaussian Process Regression with large numbers of data points.
H.: Fusion of Self Organizing Maps
 IWANN 2007. LNCS
, 2007
Abstract

Cited by 2 (0 self)
Abstract. An important issue in data mining is to find effective and optimal ways to learn and preserve the topological relations of high-dimensional input spaces and project the data to lower dimensions for visualization purposes. In this paper we propose a novel ensemble method to combine a finite number of Self Organizing Maps; we call this model FusionSOM. In the fusion process, nodes with similar Voronoi polygons are merged into one fused node, and the neighborhood relation is given by links that measure the similarity between these fused nodes. The aim of combining the SOMs is to improve the quality and robustness of the topological representation of the single model. Computational experiments show that the FusionSOM model effectively preserves the topology of the input space and improves the representation of the single SOM. We report performance results using synthetic and real data sets, the latter obtained from a benchmark site.
Maximum Feasibility Approach for Consensus Classifiers: Applications to Protein Structure Prediction
Abstract

Cited by 2 (0 self)
A novel strategy to optimize consensus classifiers for large classification problems is proposed, based on Linear Programming (LP) techniques and the recently introduced Maximum Feasibility (MaxF) heuristic for solving infeasible LP problems. For a set of classifiers and their normalized class-dependent scores, one postulates that the consensus score is a linear combination of individual scores. We require this consensus score to satisfy a set of linear constraints, imposing that the consensus score for the true class be higher than for any other class. Additional constraints may be added in order to impose that the margin of separation (the difference between the true class score and the false class scores) for the consensus classifier be larger than that of the best individual classifier. Since LP problems defined this way are typically infeasible, approximate solutions with good generalization properties are found using interior point methods for LP in conjunction with the MaxF heuristic. The new technique has been applied to a number of classification problems relevant to protein structure prediction.
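The linear constraints the abstract postulates can be written out concretely. The sketch below only checks which constraints a candidate weight vector violates; the MaxF heuristic and the interior-point solver for the typically infeasible LP are not reproduced:

```python
# Sketch of the consensus-score constraints: the consensus score is a
# linear combination w of the individual classifiers' scores, and for
# each example the true class's consensus score must exceed every false
# class's consensus score by a margin.
def violated_constraints(w, examples, margin=0.0):
    """examples: list of (scores_per_class, true_class), where
    scores_per_class[c] is the vector of per-classifier scores for
    class c. Returns the (example, class) pairs whose constraint
    w.s_true >= w.s_false + margin fails."""
    dot = lambda u, v: sum(a * b for a, b in zip(u, v))
    bad = []
    for i, (scores, true_c) in enumerate(examples):
        s_true = dot(w, scores[true_c])
        for c, s in enumerate(scores):
            if c != true_c and s_true < dot(w, s) + margin:
                bad.append((i, c))
    return bad

# Two classifiers, two classes, one example where class 0 is true.
examples = [([(0.9, 0.4), (0.2, 0.8)], 0)]
print(violated_constraints([0.5, 0.5], examples))  # empty: all hold
```

In the paper's formulation, every training example contributes one such constraint per false class, which is why the resulting LP is usually infeasible and an approximate feasibility heuristic is needed.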
Multiple Boosting: A Combination of Boosting and Bagging
 Proceedings of the 1998 International Conference on Parallel and Distributed Processing Techniques and Applications
, 1998
Abstract

Cited by 2 (1 self)
Classifier committee learning approaches have demonstrated great success in increasing the prediction accuracy of classifier learning, which is a key technique for data mining. It has been shown that Boosting and Bagging, as two representative methods of this type, can significantly decrease the error rate of decision tree learning. Boosting is generally more accurate than Bagging, but the former is more variable than the latter. In addition, Bagging is amenable to parallel or distributed processing, while Boosting is not. In this paper, we study a new committee learning algorithm, namely MB (Multiple Boosting). It creates multiple subcommittees by combining Boosting and Bagging. Experimental results in a representative collection of natural domains show that MB is, on average, more accurate than either Bagging or Boosting alone. It is more stable than Boosting, and is amenable to parallel or distributed processing. These characteristics make MB a good choice for parallel data mining. K...
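The structure of MB, Bagging across subcommittees with Boosting inside each, can be sketched as below. The boosted members are trivial stand-ins (real MB would run AdaBoost on each bootstrap sample); only the two-level committee layout is the point:

```python
import random
from collections import Counter

random.seed(2)

def boost_subcommittee(sample, size):
    """Stand-in for a boosted subcommittee: in real Multiple Boosting
    each member here would come from AdaBoost on the bootstrap sample.
    We return constant classifiers just to show the structure."""
    majority = Counter(label for _, label in sample).most_common(1)[0][0]
    return [lambda x, m=majority: m for _ in range(size)]

def multiboost(data, n_subcommittees, subcommittee_size):
    """MB sketch: Bagging across subcommittees, Boosting within each.
    The outer loop is trivially parallelizable, like Bagging."""
    committee = []
    for _ in range(n_subcommittees):
        sample = [random.choice(data) for _ in data]   # bootstrap sample
        committee.extend(boost_subcommittee(sample, subcommittee_size))
    return committee

def predict(committee, x):
    return Counter(clf(x) for clf in committee).most_common(1)[0][0]

data = [((0.1,), 0), ((0.2,), 0), ((0.9,), 1)]
committee = multiboost(data, n_subcommittees=3, subcommittee_size=2)
print(len(committee))  # 3 subcommittees x 2 members
```

Because each subcommittee depends only on its own bootstrap sample, subcommittees can be built on separate processors, which is the parallelism claim in the abstract.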
Improving on Bagging with Input Smearing
Abstract

Cited by 1 (0 self)
Abstract. Bagging is an ensemble learning method that has proved to be a useful tool in the arsenal of machine learning practitioners. Commonly applied in conjunction with decision tree learners to build an ensemble of decision trees, it often leads to reduced errors in the predictions when compared to using a single tree. A single tree is built from a training set of size N. Bagging is based on the idea that, ideally, we would like to eliminate the variance due to a particular training set by combining trees built from all training sets of size N. However, in practice, only one training set is available, and bagging simulates this platonic method by sampling with replacement from the original training data to form new training sets. In this paper we pursue the idea of sampling from a kernel density estimator of the underlying distribution to form new training sets, in addition to sampling from the data itself. This can be viewed as “smearing out” the resampled training data to generate new datasets, and the amount of “smear” is controlled by a parameter. We show that the resulting method, called “input smearing”, can lead to improved results when compared to bagging. We present results for both classification and regression problems.
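Input smearing is simple enough to sketch directly: resample with replacement as in bagging, then perturb each resampled input. Using a fixed Gaussian standard deviation per attribute is a simplifying assumption; the paper controls the amount of smear with a parameter:

```python
import random

random.seed(3)

def smeared_training_set(data, smear=0.1):
    """Input-smearing sketch: draw a bootstrap sample as in bagging,
    then 'smear' each numeric attribute with Gaussian noise of scale
    `smear` (an assumed fixed value here). Labels are unchanged."""
    smeared = []
    for _ in data:
        x, y = random.choice(data)                     # bootstrap draw
        x = tuple(v + random.gauss(0.0, smear) for v in x)
        smeared.append((x, y))
    return smeared

data = [((1.0, 2.0), 0), ((3.0, 4.0), 1)]
print(len(smeared_training_set(data)))  # same size as the original set
```

One such smeared set is generated per ensemble member; the added noise increases diversity beyond what resampling alone provides, approximating draws from a kernel density estimate of the data.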
Dynamic Coefficients in Neural Network Regression Ensembles
, 1998
Abstract

Cited by 1 (1 self)
Keywords: Machine Learning, Ensemble, Neural Network, Decomposition, Dynamic Coefficients. A lot of work has been conducted in the field of ensemble methods during the past five years. It has been shown that combining a group of predictors can dramatically reduce generalization error by reducing the variance without increasing the bias. Most ensemble methods, e.g. Bagging and AdaBoost, use some kind of resampling of the training set to achieve the desired effect. I propose a new kind of ensemble method for regressors, where no resampling is done explicitly; instead, it is done implicitly by letting the coefficients of the individual members of an ensemble be functions of the input, trained together with the members of the ensemble. This method is compared to Bagging and single neural networks. The possibility of the coefficients decomposing the input in a reasonable way is investigated.
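The core idea, combining coefficients that are functions of the input rather than fixed weights, can be sketched with a softmax gate. The linear gate scores and toy regressors below are assumptions for illustration; in the paper the gates are trained jointly with the ensemble members:

```python
import math

def dynamic_ensemble(x, members, gates):
    """Input-dependent combining sketch: each member regressor gets a
    coefficient computed from the input (a softmax over hypothetical
    linear gate scores), so different regions of the input space can
    be dominated by different members."""
    scores = [sum(w * v for w, v in zip(g, x)) for g in gates]
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    coeffs = [e / sum(exps) for e in exps]      # nonnegative, sum to 1
    return sum(c * f(x) for c, f in zip(coeffs, members))

members = [lambda x: x[0], lambda x: x[1]]      # toy member regressors
gates = [(1.0, 0.0), (0.0, 1.0)]                # toy gate weights
print(round(dynamic_ensemble((2.0, 0.0), members, gates), 3))
```

Because the coefficients vary with the input, the ensemble can implicitly decompose the input space among its members, which is the decomposition behaviour the abstract investigates.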
Identifying keystone species in the human gut microbiome from metagenomic time-series using sparse linear regression. Available online: http://arxiv.org/pdf/1402.0511v1.pdf (accessed on 11
, 2014
Abstract

Cited by 1 (0 self)
Human-associated microbial communities exert tremendous influence over human health and disease. With modern metagenomic sequencing methods it is now possible to follow the relative abundance of microbes in a community over time. These microbial communities exhibit rich ecological dynamics, and an important goal of microbial ecology is to infer the ecological interactions between species directly from sequence data. Any algorithm for inferring ecological interactions must overcome three major obstacles: 1) a correlation between the abundances of two species does not imply that those species are interacting, 2) the sum constraint on the relative abundances obtained from metagenomic studies makes it difficult to infer the parameters in time-series models, and 3) errors due to experimental uncertainty, or misassignment of sequencing reads into operational taxonomic units, bias inferences of species interactions due to a statistical problem called “errors-in-variables”. Here we introduce an approach, Learning Interactions from MIcrobial Time Series (LIMITS), that overcomes these obstacles. LIMITS uses sparse linear regression with bootstrap aggregation to infer a discrete-time Lotka-Volterra model for microbial dynamics. We tested LIMITS on synthetic data and showed that it could reliably infer the topology of the interspecies ecological interactions. We then used LIMITS to characterize the species interactions in the gut microbiomes of two individuals and found that the interaction networks varied significantly between individuals. Furthermore, we found that the interaction networks of the two individuals are dominated by distinct “keystone species”,
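The central mechanic, regressing the discrete-time Lotka-Volterra response on species abundances and bootstrap-aggregating the coefficient, can be sketched for a single interaction coefficient. The median aggregation and the hard threshold below are crude stand-ins for LIMITS's bagging and sparsity steps, and the synthetic series is invented for illustration:

```python
import math, random

random.seed(4)

def bagged_interaction(ts_x, ts_y, n_boot=50, threshold=0.05):
    """LIMITS-flavoured sketch for one coefficient: regress the
    discrete Lotka-Volterra response log(x_{t+1}/x_t) on the other
    species' abundance y_t, bootstrap-aggregate the slope (median),
    and zero it out if small (a stand-in for the sparsity step)."""
    pairs = [(ts_y[t], math.log(ts_x[t + 1] / ts_x[t]))
             for t in range(len(ts_x) - 1)]
    slopes = []
    for _ in range(n_boot):
        boot = [random.choice(pairs) for _ in pairs]
        my = sum(y for y, _ in boot) / len(boot)
        mr = sum(r for _, r in boot) / len(boot)
        var = sum((y - my) ** 2 for y, _ in boot)
        if var > 0:  # skip degenerate resamples
            slopes.append(sum((y - my) * (r - mr) for y, r in boot) / var)
    slopes.sort()
    est = slopes[len(slopes) // 2]
    return est if abs(est) > threshold else 0.0

# Synthetic series in which species y suppresses x's growth (slope -0.5).
ts_y = [0.2, 0.6, 1.0, 0.4, 0.8, 0.3, 0.9, 0.5]
ts_x = [1.0]
for y in ts_y[:-1]:
    ts_x.append(ts_x[-1] * math.exp(0.1 - 0.5 * y))
print(bagged_interaction(ts_x, ts_y) < 0)  # recovers a negative interaction
```

The full method fits all pairwise coefficients jointly with stepwise sparse regression, and the bootstrap aggregation is what makes the inferred network robust to the errors-in-variables problem the abstract describes.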