Wrappers for feature subset selection
- Artificial Intelligence, 1997
Cited by 775 (3 self)
In the feature subset selection problem, a learning algorithm is faced with the problem of selecting a relevant subset of features upon which to focus its attention, while ignoring the rest. To achieve the best possible performance with a particular learning algorithm on a particular training set, a feature subset selection method should consider how the algorithm and the training set interact. We explore the relation between optimal feature subset selection and relevance. Our wrapper method searches for an optimal feature subset tailored to a particular algorithm and a domain. We study the strengths and weaknesses of the wrapper approach and show a series of improved designs. We compare the wrapper approach to induction without feature subset selection and to Relief, a filter approach to feature subset selection. Significant improvement in accuracy is achieved for some datasets for the two families of induction algorithms used: decision trees and
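The wrapper idea described in this abstract can be sketched as follows: candidate feature subsets are scored by running the learning algorithm itself and estimating its accuracy. This is only a minimal illustration under stated assumptions. The greedy forward search, the toy 1-nearest-neighbour learner, the 3-fold split, and all function names are hypothetical choices made for the example, not the search strategy or the induction algorithms used in the paper.

```python
from typing import Callable, List, Sequence, Tuple

Example = Tuple[Sequence[float], int]  # (feature vector, class label)


def one_nn_predict(train: List[Example], x: Sequence[float], feats: Sequence[int]) -> int:
    """Toy induction algorithm (assumption): 1-nearest neighbour using only `feats`."""
    def sq_dist(a: Sequence[float]) -> float:
        return sum((a[i] - x[i]) ** 2 for i in feats)
    # With an empty feature list every distance is 0, so the first example wins.
    return min(train, key=lambda ex: sq_dist(ex[0]))[1]


def cv_accuracy(data: List[Example], feats: Sequence[int], folds: int = 3) -> float:
    """Estimate the learner's accuracy on `feats` with interleaved k-fold cross-validation."""
    correct = 0
    for k in range(folds):
        test = data[k::folds]
        train = [ex for i, ex in enumerate(data) if i % folds != k]
        correct += sum(one_nn_predict(train, x, feats) == y for x, y in test)
    return correct / len(data)


def wrapper_forward_selection(
    data: List[Example],
    n_features: int,
    evaluate: Callable[[List[Example], Sequence[int]], float] = cv_accuracy,
) -> Tuple[List[int], float]:
    """Greedy forward search: add the feature that most improves estimated accuracy."""
    selected: List[int] = []
    best = evaluate(data, selected)
    while True:
        candidates = [
            (evaluate(data, selected + [f]), f)
            for f in range(n_features)
            if f not in selected
        ]
        if not candidates:
            break
        # Ties are broken in favour of the lower feature index.
        score, feat = max(candidates, key=lambda c: (c[0], -c[1]))
        if score <= best:  # stop when no single addition helps
            break
        selected.append(feat)
        best = score
    return selected, best


if __name__ == "__main__":
    # Feature 0 separates the classes; feature 1 is noise on a larger scale.
    demo = [
        ([0.0, 5.0], 0), ([1.0, 9.0], 1), ([0.1, 1.0], 0),
        ([0.9, 4.0], 1), ([0.2, 8.0], 0), ([1.1, 2.0], 1),
    ]
    print(wrapper_forward_selection(demo, n_features=2))  # prints ([0], 1.0)
```

Because the evaluation function is passed in, the same search could wrap a different learner or a different accuracy estimate, which is the point of tailoring the subset to a particular algorithm and domain.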
Evaluating the Quality of Approximations to the Non-Dominated Set
- 1998
Cited by 45 (5 self)
The growing interest in hard multiple objective combinatorial and non-linear problems has resulted in a significant number of heuristic methods that aim to generate sets of feasible solutions as approximations to the set of non-dominated solutions. The issue of evaluating these approximations is addressed. Such evaluations are useful when performing experimental comparisons of different multiple objective heuristic algorithms, when defining stopping rules for such algorithms, and when adjusting the parameters of heuristic algorithms to a given problem. A family of outperformance relations that can be used to compare approximations under very weak assumptions about a decision-maker's preferences is introduced. These outperformance relations define incomplete orders in the set of all approximations. It is shown that in order to compare approximations that are incomparable according to the outperformance relations, much stronger assumptions about the decision-maker's p...
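To make the notions of a non-dominated set and an incomplete order between approximations concrete, here is a small sketch. It assumes every objective is minimised, and the set-level relation `covers` is a hypothetical stand-in chosen for illustration; it is not claimed to match any of the outperformance relations defined in the paper.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]  # one objective vector; all objectives minimised (assumption)


def dominates(a: Point, b: Point) -> bool:
    """a dominates b: no worse on every objective and strictly better on at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def non_dominated(points: Sequence[Point]) -> List[Point]:
    """Return the points not dominated by any other point (duplicates removed)."""
    unique = list(dict.fromkeys(points))
    return [p for p in unique if not any(dominates(q, p) for q in unique)]


def covers(a_set: Sequence[Point], b_set: Sequence[Point]) -> bool:
    """Hypothetical set relation: each point of b_set is matched or dominated by a_set."""
    return all(any(q == p or dominates(q, p) for q in a_set) for p in b_set)


if __name__ == "__main__":
    a = [(1.0, 4.0), (2.0, 2.0), (4.0, 1.0)]
    b = [(2.0, 4.0), (3.0, 3.0), (4.0, 1.0)]
    print(non_dominated(a + b))
    print(covers(a, b), covers(b, a))
    # Two approximations that neither cover each other are incomparable here.
    print(covers([(1.0, 3.0)], [(3.0, 1.0)]), covers([(3.0, 1.0)], [(1.0, 3.0)]))
```

The last line shows why such relations only give an incomplete order: each of the two single-point approximations is better on one objective, so neither covers the other without further assumptions about preferences.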
Data Perturbation for Escaping Local Maxima in Learning
- In AAAI, 2002
Cited by 29 (3 self)
Almost all machine learning algorithms---be they for regression, classification or density estimation---seek hypotheses that optimize a score on training data. In most interesting cases, however, full global optimization is not feasible and local search techniques are used to discover reasonable solutions. Unfortunately,
Empirical Performance Evaluation Methodology and Its Application to Page Segmentation Algorithms
- IEEE Transactions on Pattern Analysis and Machine Intelligence, 2001
Cited by 26 (5 self)
In this paper, we use the following five-step methodology to quantitatively compare the performance of page segmentation algorithms: 1) first, we create mutually exclusive training and test data sets with groundtruth, 2) we then select a meaningful and computable performance metric, 3) an optimization procedure is then used to search automatically for the optimal parameter values of the segmentation algorithms on the training data set, 4) the segmentation algorithms are then evaluated on the test data set, and, finally, 5) a statistical and error analysis is performed to give the statistical significance of the experimental results. In particular, instead of the ad hoc and manual approach typically used in the literature for training algorithms, we pose the automatic training of algorithms as an optimization problem and use the Simplex algorithm to search for the optimal parameter values. A paired-model statistical analysis and an error analysis are then conducted to provide confidence intervals for the experimental results of the algorithms. This methodology is applied to the evaluation of five page segmentation algorithms, of which three are representative research algorithms and the other two are well-known commercial products, on 978 images from the University of Washington III data set. It is found that the performance indices (average textline accuracy) of the Voronoi, Docstrum, and Caere segmentation algorithms are not significantly different from each other, but they are significantly better than that of ScanSoft's segmentation algorithm, which, in turn, is significantly better than that of X-Y cut
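Steps 3 and 5 of the methodology above can be sketched roughly as follows. Several things here are assumptions made for brevity: an exhaustive grid search stands in for the Simplex search named in the abstract, the confidence interval uses a normal approximation on per-image score differences rather than the paper's paired-model analysis, and the names `tune_on_training` and `paired_confidence_interval` are hypothetical.

```python
import math
import statistics
from typing import Callable, List, Sequence, Tuple, TypeVar

Image = TypeVar("Image")


def tune_on_training(
    score: Callable[[Image, float], float],
    train: Sequence[Image],
    grid: Sequence[float],
) -> float:
    """Step 3 (simplified): pick the parameter with the best mean training score.

    A grid search is used here as a stand-in for the Simplex search.
    """
    return max(grid, key=lambda p: statistics.mean([score(img, p) for img in train]))


def paired_confidence_interval(
    scores_a: Sequence[float],
    scores_b: Sequence[float],
    z: float = 1.96,
) -> Tuple[float, float]:
    """Step 5 (simplified): approximate 95% interval for the mean paired difference.

    Both score lists must refer to the same test images, in the same order.
    """
    diffs: List[float] = [a - b for a, b in zip(scores_a, scores_b)]
    mean = statistics.mean(diffs)
    half_width = z * statistics.stdev(diffs) / math.sqrt(len(diffs))
    return mean - half_width, mean + half_width


if __name__ == "__main__":
    # Toy "algorithm": the score peaks when the parameter matches the image value.
    toy_score = lambda img, p: 1.0 - abs(img - p)
    best_p = tune_on_training(toy_score, train=[0.4, 0.5, 0.6], grid=[0.0, 0.25, 0.5, 0.75, 1.0])
    print(best_p)
    print(paired_confidence_interval([0.90, 0.80, 0.85, 0.95], [0.70, 0.65, 0.60, 0.80]))
```

If the resulting interval excludes zero, the two algorithms differ on the test set under this approximation; an interval that straddles zero corresponds to the "not significantly different" outcome reported for some of the algorithms.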
Image classification using Markov Random Fields with two new relaxation methods: Deterministic pseudo . . .
- 1991
Cited by 18 (4 self)
In this paper, we present two relaxation techniques, Deterministic Pseudo-Annealing (DPA) and Modified Metropolis Dynamics (MMD), for image classification using a Markov Random Field model. For the first algorithm (DPA), the a posteriori probability of a tentative labeling is generalized to continuous labeling. The merit function thus defined has the same maxima under constraints yielding probability vectors. Changing these constraints convexifies the merit function. The algorithm solves this unambiguous maximization problem and then tracks the solution while the original constraints are restored, yielding a good, even if suboptimal, solution to the original labeling assignment problem. The second method (MMD) is a modified version of the Metropolis algorithm: at each iteration the new state is chosen randomly, but the decision to accept it is purely deterministic. This is also a suboptimal technique, but it gives faster results than stochastic relaxation. These two methods have been implemented on a Connection Machine CM2, and simulation results are shown with a synthetic noisy image and a SPOT image. These results are compared to those obtained with the Metropolis algorithm, the Gibbs sampler and ICM (Iterated Conditional Mode).
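The MMD idea of a random proposal combined with a deterministic acceptance test can be illustrated with a short sketch. The abstract does not state the acceptance rule, so the fixed threshold `alpha` compared against exp(-ΔU/T) is an assumption, as are the 1D chain, the quadratic data term, the Potts-style smoothness penalty, and every parameter value below.

```python
import math
import random
from typing import List, Sequence


def mmd_accept(delta_u: float, temperature: float, alpha: float = 0.3) -> bool:
    """Deterministic acceptance (assumed rule): no random draw is involved.

    Energy decreases are always accepted; increases are accepted only while
    exp(-delta_u / temperature) stays above the fixed threshold alpha.
    """
    if delta_u <= 0:
        return True
    return math.exp(-delta_u / temperature) >= alpha


def local_energy(labels: Sequence[int], obs: Sequence[float], i: int, label: int,
                 means: Sequence[float], beta: float) -> float:
    """Data term plus a penalty for disagreeing with the chain neighbours."""
    data_term = (obs[i] - means[label]) ** 2
    neighbours = [labels[j] for j in (i - 1, i + 1) if 0 <= j < len(labels)]
    return data_term + beta * sum(n != label for n in neighbours)


def mmd_relax(obs: Sequence[float], means: Sequence[float], beta: float = 0.5,
              t0: float = 2.0, cooling: float = 0.9, sweeps: int = 30,
              alpha: float = 0.3, seed: int = 0) -> List[int]:
    """Label a 1D signal: proposals are random, acceptance is deterministic."""
    rng = random.Random(seed)
    labels = [rng.randrange(len(means)) for _ in obs]
    temperature = t0
    for _ in range(sweeps):
        for i in range(len(obs)):
            proposal = rng.randrange(len(means))  # random choice of the new state
            delta = (local_energy(labels, obs, i, proposal, means, beta)
                     - local_energy(labels, obs, i, labels[i], means, beta))
            if mmd_accept(delta, temperature, alpha):
                labels[i] = proposal
        temperature *= cooling  # geometric cooling schedule (assumption)
    return labels


if __name__ == "__main__":
    signal = [0.1, -0.2, 0.0, 0.3, 2.9, 3.2, 3.0, 2.8]
    print(mmd_relax(signal, means=[0.0, 3.0]))
```

Compared with the standard Metropolis test, which accepts an energy increase with probability exp(-ΔU/T), the only change in this sketch is that the uniform random draw is replaced by the constant `alpha`.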
Guided Local Search - An Illustrative Example in Function Optimisation
- In BT Technology Journal, Vol. 16, No. 3, 1998
Cited by 13 (4 self)
The Guided Local Search (GLS) method has been successfully applied to a number of hard combinatorial optimisation problems, from the well-known TSP and QAP to real-world problems such as Frequency Assignment and Workforce Scheduling. In this paper, we demonstrate that the potential applications of GLS are not limited to optimisation problems of a discrete nature but extend to difficult continuous optimisation problems. Continuous optimisation problems arise in many engineering disciplines (such as electrical and mechanical engineering) in the context of analysis, design or simulation tasks. The problem examined gives an illustrative example of the behaviour of GLS, providing insights into the mechanisms of the algorithm.
Transforming Structural Model to Runtime Model of Embedded Software with Real-Time Constraints
- In Proceedings of Design, Automation and Test in Europe Conference and Exhibition, IEEE, 1995
Cited by 11 (2 self)
The model-based methodology has proven to be effective for fast and low-cost development of embedded software. In the model-based development process, transforming a software structural model that describes the underlying application into an implementable runtime model is a critical issue. Since the designed software will finally run on the target platform, non-functional issues like schedulability, timing constraints and resource requirements have to be considered during the transformation. In this paper, we propose a generic runtime model architecture that can best satisfy the non-functional requirements of the system, and a generic transformation method to convert a structural model to a runtime model in such an architecture. The transformation approach is based on the notion of end-to-end computations performed by the system in response to external stimuli. We demonstrate the advantages and effectiveness of the proposed method by constructing a software runtime model for a combined electronic throttle and air-fuel ratio control system.
Parallel Strategies for Meta-heuristics
Cited by 10 (4 self)
We present a state-of-the-art survey of parallel meta-heuristic developments and results, discuss general design and implementation principles that apply to most meta-heuristic classes, instantiate these principles for the three meta-heuristic classes currently most extensively used (genetic methods, simulated annealing, and tabu search), and identify a number of trends and promising research directions.
A Methodology for Empirical Performance Evaluation of Page Segmentation Algorithms
- In Proceedings of SPIE Conference on Document Recognition and Retrieval, 1999
Cited by 9 (6 self)
Document page segmentation is a crucial preprocessing step in Optical Character Recognition (OCR) systems. While numerous page segmentation algorithms have been proposed, there is relatively little literature on comparative evaluation --- empirical or theoretical --- of these algorithms. In the existing performance evaluation methods, two crucial components are usually missing: 1) automatic training of algorithms with free parameters and 2) statistical and error analysis of experimental results. In this thesis, we use the following five-step methodology to quantitatively compare the performance of page segmentation algorithms: 1) first, we create mutually exclusive training and test datasets with groundtruth, 2) we then select a meaningful and computable performance metric, 3) an optimization procedure is then used to search automatically for the optimal parameter values of the segmentation algorithms, 4) the segmentation algorithms are then evaluated on the test dataset, and finally 5) ...

