Generating Classifier Committees by Stochastically Selecting both Attributes and Training Examples
- Proceedings of the 5th Pacific Rim International Conference on Artificial Intelligence (PRICAI’98)
, 1998
Cited by 3 (0 self)
Boosting and Bagging, as two representative approaches to learning classifier committees, have demonstrated great success, especially for decision tree learning. They repeatedly build different classifiers using a base learning algorithm by changing the distribution of the training set. Sasc, as a different type of committee learning method, can also significantly reduce the error rate of decision trees. It generates classifier committees by stochastically modifying the set of attributes but keeping the distribution of the training set unchanged. It has been shown that Bagging and Sasc are, on average, less accurate than Boosting, but the performance of the former is more stable than that of the latter in terms of less frequently obtaining significantly higher error rates than the base learning algorithm. In this paper, we propose a novel committee learning algorithm, called SascBag, that combines Sasc and Bagging. It creates different classifiers by stochastically varying both the a...
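The abstract describes a committee whose members differ in both their training examples and their attributes. A minimal Python sketch of that general idea follows. It is not the SascBag algorithm from the paper, whose details are not given in the abstract: it assumes one-level decision stumps as the base learner, one bootstrap sample and one random attribute subset per member, and simple majority voting. All names (`train_stump`, `train_committee`, `attr_fraction`) are hypothetical.

```python
import random
from collections import Counter


def train_stump(X, y, attrs):
    """Fit a one-level decision stump using only the attributes in `attrs`.

    Tries every observed value as a threshold and keeps the split with the
    fewest training errors.
    """
    best = None
    for a in attrs:
        for t in sorted({row[a] for row in X}):
            left = [lab for row, lab in zip(X, y) if row[a] <= t]
            right = [lab for row, lab in zip(X, y) if row[a] > t]
            l_lab = Counter(left).most_common(1)[0][0]
            r_lab = Counter(right).most_common(1)[0][0] if right else l_lab
            errors = (sum(lab != l_lab for lab in left)
                      + sum(lab != r_lab for lab in right))
            if best is None or errors < best[0]:
                best = (errors, a, t, l_lab, r_lab)
    _, a, t, l_lab, r_lab = best
    return lambda row: l_lab if row[a] <= t else r_lab


def train_committee(X, y, n_members=15, attr_fraction=0.5, seed=0):
    """Vary both the training examples and the attributes per member."""
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    k = max(1, round(attr_fraction * d))
    members = []
    for _ in range(n_members):
        idx = [rng.randrange(n) for _ in range(n)]   # bootstrap sample
        attrs = rng.sample(range(d), k)              # random attribute subset
        members.append(
            train_stump([X[i] for i in idx], [y[i] for i in idx], attrs))
    return members


def predict(members, row):
    """Unweighted majority vote over the committee."""
    return Counter(m(row) for m in members).most_common(1)[0][0]
```

Setting `attr_fraction=1.0` reduces this sketch to plain Bagging of stumps, which makes the contribution of the attribute sampling easy to isolate in an experiment.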
Multiple Boosting: A Combination of Boosting and Bagging
- Proceedings of the 1998 International Conference on Parallel and Distributed Processing Techniques and Applications
, 1998
Cited by 2 (1 self)
Classifier committee learning approaches have demonstrated great success in increasing the prediction accuracy of classifier learning, which is a key technique for datamining. It has been shown that Boosting and Bagging, as two representative methods of this type, can significantly decrease the error rate of decision tree learning. Boosting is generally more accurate than Bagging, but the former is more variable than the latter. In addition, Bagging is amenable to parallel or distributed processing, while Boosting is not. In this paper, we study a new committee learning algorithm, namely MB (Multiple Boosting). It creates multiple subcommittees by combining Boosting and Bagging. Experimental results in a representative collection of natural domains show that MB is, on average, more accurate than either Bagging or Boosting alone. It is more stable than Boosting, and is amenable to parallel or distributed processing. These characteristics make MB a good choice for parallel datamining. K...
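The abstract says MB builds multiple subcommittees by combining Boosting and Bagging. One way to picture this is sketched below, under loudly stated assumptions: each subcommittee is an AdaBoost committee of weighted decision stumps trained on its own bootstrap sample, labels are in {-1, +1}, and the subcommittees are combined by an unweighted majority vote. The abstract does not specify the base learner or the combination rule, so this is an illustration rather than the authors' MB; the function names are hypothetical.

```python
import math
import random


def weighted_stump(X, y, w):
    """Return (error, attr, threshold, sign) minimising the weighted error.

    The stump predicts `sign` when x[attr] <= threshold and `-sign` otherwise.
    """
    best = None
    for a in range(len(X[0])):
        for t in sorted({row[a] for row in X}):
            for s in (1, -1):
                err = sum(wi for row, yi, wi in zip(X, y, w)
                          if (s if row[a] <= t else -s) != yi)
                if best is None or err < best[0]:
                    best = (err, a, t, s)
    return best


def adaboost(X, y, rounds):
    """Plain AdaBoost over decision stumps; labels must be -1 or +1."""
    n = len(X)
    w = [1.0 / n] * n
    committee = []
    for _ in range(rounds):
        err, a, t, s = weighted_stump(X, y, w)
        err = min(max(err, 1e-10), 1 - 1e-10)        # avoid log(0)
        alpha = 0.5 * math.log((1 - err) / err)
        committee.append((alpha, a, t, s))
        # Increase the weight of misclassified examples, then renormalise.
        w = [wi * math.exp(-alpha * yi * (s if row[a] <= t else -s))
             for row, yi, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return committee


def boost_score(committee, row):
    return sum(alpha * (s if row[a] <= t else -s)
               for alpha, a, t, s in committee)


def multiple_boosting(X, y, n_sub=5, rounds=5, seed=0):
    """Train one boosted subcommittee per bootstrap sample.

    The subcommittees are independent, so this loop could run in parallel.
    """
    rng = random.Random(seed)
    n = len(X)
    subs = []
    for _ in range(n_sub):
        idx = [rng.randrange(n) for _ in range(n)]
        subs.append(adaboost([X[i] for i in idx], [y[i] for i in idx], rounds))
    return subs


def predict(subs, row):
    """Assumed combination rule: majority vote of the subcommittees."""
    votes = sum(1 if boost_score(c, row) >= 0 else -1 for c in subs)
    return 1 if votes >= 0 else -1
```

Because each subcommittee depends only on its own bootstrap sample, the outer loop has no sequential dependency, which is consistent with the abstract's remark that MB is amenable to parallel or distributed processing.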
Decision-tree instance-space decomposition with grouped gain-ratio
- Information Sciences 177
, 2007
Cited by 2 (1 self)
This paper examines a decision-tree framework for instance-space decomposition. According to the framework, the original instance-space is hierarchically partitioned into multiple subspaces and a distinct classifier is assigned to each subspace. Subsequently, an unlabeled, previously-unseen instance is classified by employing the classifier that was assigned to the subspace to which the instance belongs. After describing the framework, the paper suggests a novel splitting-rule for it and presents an experimental study comparing various implementations of the framework. The study indicates that the novel splitting-rule can improve previously presented implementations of the framework in terms of both accuracy and computation time.
Neural Networks from Similarity Based Perspective
- In: New Frontiers in Computational Intelligence and its Applications. Ed. M. Mohammadian, IOS
, 2000
Cited by 1 (1 self)
A framework for Similarity-Based Methods (SBMs) includes many neural network models as special cases. Multilayer Perceptrons (MLPs) use scalar products to compute weighted activation of neurons, combining soft hyperplanes to provide decision borders. Scalar product is replaced by a distance function between the inputs and the weights, offering a natural generalization of the standard MLP model to the distance-based multilayer perceptron (D-MLP) model. D-MLPs evaluate similarity of inputs to weights making the interpretation of their mappings easier. Cluster-based initialization procedure determining architecture and values of all adaptive parameters is described. D-MLP networks are useful not only for classification and approximation, but also as associative memories, in problems requiring pattern completion, offering an efficient way to deal with missing values. Non-Euclidean distance functions may also be introduced by normalization of the input vectors in an extended fe...
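To make the contrast between the two kinds of unit concrete, the following sketch compares a standard MLP neuron with a distance-based one. The functional form `sigmoid(d0 - D(x, w))` and the names `dmlp_unit` and `d0` are assumptions chosen for illustration; the abstract states only that the scalar product is replaced by a distance function between inputs and weights.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def euclidean(x, w):
    return math.sqrt(sum((xi - wi) ** 2 for xi, wi in zip(x, w)))


def manhattan(x, w):
    return sum(abs(xi - wi) for xi, wi in zip(x, w))


def mlp_unit(x, w, bias):
    """Standard neuron: activation grows with the scalar product w . x."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + bias)


def dmlp_unit(x, w, d0, distance=euclidean):
    """Distance-based neuron (assumed form): activation is high when the
    input x lies within roughly distance d0 of the weight vector w."""
    return sigmoid(d0 - distance(x, w))
```

Under this form the weight vector can be read as a prototype in input space and `d0` as the radius of its region of influence, which is one way to see why such mappings may be easier to interpret. Passing `distance=manhattan` swaps in a non-Euclidean metric without changing the rest of the unit.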
Dynamic Coefficients in Neural Network Regression Ensembles
, 1998
Cited by 1 (1 self)
Keywords: Machine Learning, Ensemble, Neural Network, Decomposition, Dynamic Coefficients. A lot of work has been conducted in the field of ensemble methods during the past five years. It has been shown that combining a group of predictors can dramatically reduce generalization error by reducing the variance without increasing the bias. Most ensemble methods, e.g. Bagging and AdaBoost, use some kind of resampling of the training set to achieve the desired effect. I propose a new kind of ensemble method for regressors, where no resampling is done explicitly; instead, it is done implicitly by letting the coefficients of the individual ensemble members be functions of the input and training them together with the members. This method is compared to Bagging and single neural networks. Whether the coefficients can decompose the input in a reasonable way is investigated. 1 Introduction A number of methods using a group of predictors to achieve improved learning h...
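The central idea, combination coefficients that depend on the input, can be sketched as follows. This shows only the forward computation, not the joint training the abstract mentions. It assumes the coefficients are obtained by a softmax over per-member gating scores so that they are positive and sum to one; the abstract does not say how the coefficients are parameterised or normalised, and the names `gates` and `dynamic_ensemble` are hypothetical.

```python
import math


def softmax(scores):
    m = max(scores)                       # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dynamic_ensemble(members, gates, x):
    """Ensemble output sum_i alpha_i(x) * f_i(x).

    members: list of regressors f_i(x)
    gates:   list of scoring functions g_i(x); alpha(x) = softmax(g(x))
    """
    coeffs = softmax([g(x) for g in gates])
    return sum(c * f(x) for c, f in zip(coeffs, members))
```

With constant gating scores every member receives the same coefficient and the sketch falls back to plain averaging; input-dependent scores let different members dominate in different regions of the input space, which is the decomposition effect the abstract refers to.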
Large-Scale Investigation of Weed Seed Identification by Machine Vision
Cited by 1 (0 self)
We explore the feasibility of implementing fast and reliable computer-based systems for the automatic identification of weed seeds from color and black and white images. Seed size, shape, color and texture characteristics are obtained by standard image-processing techniques, and their discriminating power as classification features is assessed. These investigations are performed on a database much larger than those used in previous studies, containing 10310 images of 236 different weed species. We consider the implementation of a simple Bayesian approach (naive Bayes classifier) and (single and bagged) artificial neural network systems for seed identification. Our results indicate that the naive Bayes classifier based on an adequately selected set of classification features has an excellent performance, competitive with that of the comparatively more sophisticated neural network approach. In addition, we discuss the possibility of using only morphological and textural characteristics as classification features, which would reduce the operational complexity and hardware cost of a commercial system since they can be obtained from black and white images. We find that, under particular operational conditions, this would result in a relatively small loss in performance when compared to the implementation based on color images.
Maximum Feasibility Approach for Consensus Classifiers: Applications to Protein Structure Prediction
Cited by 1 (0 self)
A novel strategy to optimize consensus classifiers for large classification problems is proposed, based on Linear Programming (LP) techniques and the recently introduced Maximum Feasibility (MaxF) heuristic for solving infeasible LP problems. For a set of classifiers and their normalized class-dependent scores, one postulates that the consensus score is a linear combination of the individual scores. We require this consensus score to satisfy a set of linear constraints, imposing that the consensus score for the true class be higher than for any other class. Additional constraints may be added in order to impose that the margin of separation (the difference between the true-class score and the false-class scores) for the consensus classifier be larger than that of the best individual classifier. Since LP problems defined this way are typically infeasible, approximate solutions with good generalization properties are found using interior point methods for LP in conjunction with the MaxF heuristic. The new technique has been applied to a number of classification problems relevant for protein structure prediction.
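The constraints described in this abstract can be written out as follows. The notation is an assumption introduced here, not taken from the paper: $s_k(c \mid x)$ is the normalized score that classifier $k$ assigns to class $c$ for example $x$, $w_k$ is its weight in the consensus, $c_i^{*}$ is the true class of training example $x_i$, and $m_i$ is the margin achieved on $x_i$ by the best individual classifier. The margin constraint is one possible reading of the abstract's wording.

```latex
% Consensus score as a linear combination of individual scores
S(c \mid x) = \sum_{k=1}^{K} w_k \, s_k(c \mid x)

% Basic constraints: the true class must score highest
S(c_i^{*} \mid x_i) - S(c \mid x_i) > 0
  \qquad \forall i, \; \forall c \neq c_i^{*}

% Optional margin constraints (assumed form)
S(c_i^{*} \mid x_i) - S(c \mid x_i) \geq m_i
  \qquad \forall i, \; \forall c \neq c_i^{*}
```

Each constraint is linear in the unknown weights $w_k$, which is what makes the problem a linear feasibility problem; with many training examples and classes the system is typically infeasible, as the abstract notes.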
Progress Report: Ensemble Methods in Connection with Neural Networks
This report presents the work done by me during part A of the Ph.D. programme at the Department of Computer Science, University of Aarhus. The main result is a new ensemble method, called the DynCo ensemble method, which compares favorably with the best known ensemble methods. The usual ensemble method uses some sort of resampling scheme to generate diverse training sets. The DynCo ensemble method uses dynamical coefficients instead. It is possible for the coefficients to decompose the input space for the ensemble in a reasonable way, thereby lowering the complexity of the problem at hand for each ensemble member. The result is a possibly meaningful clustering of the input and better learning. It is shown in this report that dynamical coefficients are analogous to a kind of resampling method. The DynCo ensemble method is compared empirically to other ensemble methods.
Combining Regression Estimators: GA-Based Selective Neural Network Ensemble
, 2001
A neural network ensemble is a learning paradigm where a collection of neural networks is trained for the same task. In this paper, the relationship between the generalization ability of the neural network ensemble and the correlation of the individual neural networks constituting the ensemble is analyzed in the context of combining neural regression estimators, which reveals that ensembling a selective subset of trained networks is superior to ensembling all the trained networks in some cases. Based on this observation, an approach named GASEN is proposed. GASEN first trains a number of individual neural networks. Then it assigns random weights to the individual networks and employs a genetic algorithm to evolve those weights so that they can characterize to some extent the importance of the individual networks in constituting an ensemble. Finally it selects an optimum subset of individual networks based on the evolved weights to make up the ensemble. Experimental results show that, compared with a popular ensemble approach, i.e. averaging all, and a theoretically optimum selective ensemble approach, i.e. enumerating, GASEN has preferable performance in generating ensembles with strong generalization ability at relatively small computational cost. This paper also analyzes the working mechanism of GASEN from the view of error-ambiguity decomposition, which reveals that GASEN improves generalization ability mainly through reducing the average generalization error of the individual neural networks constituting the ensemble.
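The three steps in this abstract (train networks, evolve weights, select a subset) can be sketched as below. This is an illustration under assumptions rather than GASEN itself: the trained networks are represented only by their predictions on a validation set, the genetic algorithm is a deliberately simple one (truncation selection, uniform crossover, Gaussian mutation), fitness is the validation mean squared error of the weighted ensemble, and a network is selected when its normalized weight exceeds `1 / N`. None of these specifics are stated in the abstract, and all function names are hypothetical.

```python
import random


def mse(pred, target):
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target)


def weighted_output(weights, member_preds):
    """Weighted average of the members' predictions, one value per example."""
    total = sum(weights)
    n = len(member_preds[0])
    return [sum(w * preds[j] for w, preds in zip(weights, member_preds)) / total
            for j in range(n)]


def evolve_weights(member_preds, target, pop_size=30, generations=40, seed=0):
    """Evolve one weight per network; lower validation MSE is fitter."""
    rng = random.Random(seed)
    k = len(member_preds)

    def cost(w):
        return mse(weighted_output(w, member_preds), target)

    # Start from random weights, as the abstract describes.
    pop = [[rng.random() + 1e-6 for _ in range(k)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=cost)
        parents = pop[: pop_size // 2]               # truncation selection
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            child = [rng.choice(pair) for pair in zip(a, b)]   # crossover
            child = [min(1.0, max(1e-6, g + rng.gauss(0, 0.1)))
                     for g in child]                           # mutation
            children.append(child)
        pop = parents + children
    best = min(pop, key=cost)
    total = sum(best)
    return [w / total for w in best]                 # normalise to sum to 1


def select_members(weights, threshold=None):
    """Keep networks whose weight exceeds the threshold (default 1/N)."""
    if threshold is None:
        threshold = 1.0 / len(weights)
    chosen = [i for i, w in enumerate(weights) if w > threshold]
    # Fall back to the single best-weighted network if none passes.
    return chosen or [max(range(len(weights)), key=weights.__getitem__)]
```

In this sketch the evolved weights are used only to decide which networks to keep; the final ensemble would then average the selected networks, which matches the abstract's description of the weights as indicators of importance rather than as final combination coefficients.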
International Journal of Pattern Recognition and Artificial Intelligence
performance when compared to the implementation based on color images. Keywords: machine vision; seed identification; classification; artificial neural networks. 1. Introduction The process of manual identification of seeds by specialized technicians is slow, has low reproducibility, and possesses a degree of subjectivity that is hard to quantify, both in its commercial and in its technological implications. It is therefore of major technical and economic importance to implement computer-based methods for reliable and fast identification and classification of seeds. Automatic systems can be based on seed images, from which classification features associated with seed size, shape, color and texture (i.e., greytone variations on the surface) are readily obtained. Thus, the field of machine vision, i.e., image-processing algorithms complemented with classification methods, seems a suitable framework for automatic seed identification. Most previous attempts to identify seeds by machine vi

