Results 1–10 of 48
Cached Sufficient Statistics for Efficient Machine Learning with Large Datasets
 Journal of Artificial Intelligence Research
, 1997
Abstract

Cited by 122 (19 self)
This paper introduces new algorithms and data structures for quick counting for machine learning datasets. We focus on the counting task of constructing contingency tables, but our approach is also applicable to counting the number of records in a dataset that match conjunctive queries. Subject to certain assumptions, the costs of these operations can be shown to be independent of the number of records in the dataset and log-linear in the number of nonzero entries in the contingency table. We provide a very sparse data structure, the AD-tree, to minimize memory use. We provide analytical worst-case bounds for this structure for several models of data distribution. We empirically demonstrate that tractably-sized data structures can be produced for large real-world datasets by (a) using a sparse tree structure that never allocates memory for counts of zero, (b) never allocating memory for counts that can be deduced from other counts, and (c) not bothering to expand the tree fully near its...
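The counting task this abstract describes can be sketched with a sparse dictionary-based contingency table — an illustrative baseline only, not the paper's AD-tree, which additionally deduces counts from other counts and prunes the tree:

```python
from collections import Counter

def contingency_table(records, attrs):
    """Count co-occurring value combinations over the given attribute
    indices, storing only nonzero counts (the sparse representation)."""
    return Counter(tuple(r[a] for a in attrs) for r in records)

# Toy dataset: (outlook, temperature, play)
records = [
    ("sunny", "hot", "no"),
    ("sunny", "mild", "yes"),
    ("rainy", "mild", "yes"),
    ("sunny", "hot", "no"),
]
# Contingency table over attributes 0 and 2; a lookup answers the
# conjunctive query "outlook = sunny AND play = no".
table = contingency_table(records, attrs=(0, 2))
print(table[("sunny", "no")])  # → 2
```

Note that zero counts are never materialized, matching point (a) of the abstract: absent keys simply return 0.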
An incremental node embedding technique for error-correcting output codes
, 2008
Abstract

Cited by 9 (6 self)
The error correcting output codes (ECOC) technique is a useful way to extend any binary classifier to the multiclass case. The design of an ECOC matrix usually considers an a priori fixed number of dichotomizers. We argue that the selection and number of dichotomizers must depend on the performance of the ensemble code in relation to the problem domain. In this paper, we present a novel approach that improves the performance of any initial output coding by extending it in a suboptimal way. The proposed strategy creates the new dichotomizers by minimizing the confusion matrix among classes guided by a validation subset. A weighted methodology is proposed to take into account the different relevance of each dichotomizer. As a result, overfitting is avoided and small codes with good generalization performance are obtained. In the decoding step, we introduce a new strategy that follows the principle that positions coded with the symbol zero should have small influence in the results. We compare our strategy to other well-known ECOC strategies on the UCI database, and the results show it represents a significant improvement.
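The decoding principle mentioned at the end of the abstract — zero-coded positions should have little influence — can be sketched with a simple masked Hamming decoder. This is a hypothetical illustration of the idea, not the paper's exact weighted strategy:

```python
import numpy as np

def decode(code_matrix, outputs):
    """Ternary ECOC decoding where zero entries are ignored.

    code_matrix: (n_classes, n_dichotomizers) with entries in {-1, 0, +1}.
    outputs: dichotomizer predictions in {-1, +1}.
    Returns the index of the class with the fewest non-zero mismatches."""
    mask = code_matrix != 0                         # zero positions contribute nothing
    dists = (mask * (code_matrix != outputs)).sum(axis=1)
    return int(np.argmin(dists))

# Three classes, three dichotomizers; 0 means "class not used in this split".
M = np.array([[ 1,  1,  0],
              [-1,  1,  1],
              [ 0, -1,  1]])
print(decode(M, np.array([1, 1, 1])))  # → 0 (matches both of its non-zero bits)
```

The paper's weighted methodology would additionally scale each dichotomizer's contribution by its estimated relevance; the mask above is the unweighted special case.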
The Review of Problems Solvable by Algorithms of the Group Method of Data Handling (GMDH)
, 1995
Abstract

Cited by 7 (0 self)
This paper describes the use of algorithms of the Group Method of Data Handling (GMDH) in solving various problems of experimental data processing. A spectrum of parametric (polynomial) algorithms and of nonparametric algorithms using clusterings or analogues was developed. The choice of an algorithm for practical use depends on the type of the problem, the level of noise variance, sufficiency of sampling, and on whether the sample contains only continuous variables. Basic problems solvable by the GMDH are listed, including: identification of physical laws, approximation of multidimensional processes, short-term stepwise forecasting of processes and events, long-term stepwise forecasting, extrapolation of physical fields, clustering of data samples and search for a physical clustering that corresponds to the physical model of an object, pattern recognition in the case of continuous or discrete variables, diagnostics and recognition using probabilistic sorting algorithms, self-organization of multilayered neural nets with active neurons, normative vector forecasting of processes, and process forecasting without models using analogues complexing.
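The parametric (polynomial) branch of GMDH referenced here can be sketched as one self-organizing layer: fit a quadratic "partial description" for each attribute pair and keep the pairs that generalize best on a validation split. This is a minimal illustrative sketch assuming the common Ivakhnenko quadratic form, not any specific algorithm from the paper's survey:

```python
import itertools
import numpy as np

def _feats(X, i, j):
    """Quadratic basis for one attribute pair (a common GMDH choice)."""
    xi, xj = X[:, i], X[:, j]
    return np.column_stack([np.ones_like(xi), xi, xj, xi * xj, xi**2, xj**2])

def gmdh_layer(X_tr, y_tr, X_va, y_va, keep=2):
    """Fit one partial description per attribute pair by least squares;
    rank candidates by validation error and keep the best `keep`."""
    candidates = []
    for i, j in itertools.combinations(range(X_tr.shape[1]), 2):
        coef, *_ = np.linalg.lstsq(_feats(X_tr, i, j), y_tr, rcond=None)
        err = np.mean((_feats(X_va, i, j) @ coef - y_va) ** 2)
        candidates.append((err, (i, j), coef))
    candidates.sort(key=lambda c: c[0])
    return candidates[:keep]

# The layer recovers which attributes carry the "physical law" y = x0 * x1.
rng = np.random.default_rng(0)
X = rng.normal(size=(120, 3))
y = X[:, 0] * X[:, 1]
best = gmdh_layer(X[:80], y[:80], X[80:], y[80:])[0]
print(best[1])  # → (0, 1)
```

A full GMDH network would stack such layers, feeding the surviving outputs forward until the validation error stops improving.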
Active Learning in Discrete Input Spaces
 In Proceedings of the 34th Interface Symposium
, 2002
Abstract

Cited by 7 (0 self)
Traditional design of experiments (DOE) from the statistics literature focuses on optimizing an output parameter over a space of continuous input parameters. Here we consider DOE, or active learning, for discrete input spaces. A trivial example of this is the k-armed bandit problem, which is the case of having a single input attribute of arity k. We address the full problem of many attributes where it is impossible to test every combination of attribute-value pairs even once within the given number of experiments, but we expect to be able to generalize on the results of experiments.
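The k-armed bandit base case the abstract mentions can be sketched with a simple epsilon-greedy policy — an illustrative textbook baseline, not the paper's method: one discrete input of arity k, where each "experiment" pulls one arm and updates a running estimate of its payoff.

```python
import random

def epsilon_greedy(true_means, n_pulls=5000, eps=0.1, seed=0):
    """Pull k arms with noisy Gaussian rewards; return pull counts per arm."""
    rng = random.Random(seed)
    k = len(true_means)
    counts, values = [0] * k, [0.0] * k
    for _ in range(n_pulls):
        if rng.random() < eps:
            arm = rng.randrange(k)                          # explore
        else:
            arm = max(range(k), key=lambda a: values[a])    # exploit
        reward = rng.gauss(true_means[arm], 1.0)
        counts[arm] += 1
        values[arm] += (reward - values[arm]) / counts[arm]  # running mean
    return counts

counts = epsilon_greedy([0.1, 0.5, 0.9])
print(counts.index(max(counts)))  # the best arm (index 2) dominates the pulls
```

The paper's setting replaces the single arity-k attribute with many attributes, so the "arms" become combinations of attribute-value pairs and generalization across them becomes essential.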
Real-valued all-dimensions search: Low-overhead rapid searching over subsets of attributes
 Proceedings of the 18th Conference on Uncertainty in Artificial Intelligence
, 2002
Abstract

Cited by 5 (1 self)
This paper is about searching the combinatorial space of contingency tables during the inner loop of a nonlinear statistical optimization. Examples of this operation in various data analytic communities include searching for nonlinear combinations of attributes that contribute significantly to a regression (Statistics), searching for items to include in a decision list (machine learning) and association rule hunting (Data Mining). This paper investigates a new, efficient approach to this class of problems, called RADSEARCH (Real-valued All-Dimensions-tree Search). RADSEARCH finds the global optimum, and this gives us the opportunity to empirically evaluate the question: apart from algorithmic elegance what does this attention to optimality buy us? We compare RADSEARCH with other recent successful search algorithms such as CN2, PRIM, APriori, OPUS and DenseMiner. Finally, we introduce RADREG, a new regression algorithm for learning real-valued outputs based on RADSEARCHing for high-order interactions.
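The search space in question — conjunctions of attribute-value pairs scored over contingency tables — can be illustrated with a brute-force enumerator. This is a naive baseline for exposition only; RADSEARCH's contribution is pruning this space efficiently while still guaranteeing the global optimum:

```python
import itertools
from collections import Counter

def best_conjunction(records, length=2):
    """Exhaustively score every conjunction of `length` attribute-value
    pairs by support (number of matching records); return the best one."""
    best_support, best_query = 0, {}
    for attrs in itertools.combinations(range(len(records[0])), length):
        # One contingency table per attribute subset; its max cell is the
        # support of the best conjunction over those attributes.
        counts = Counter(tuple(r[a] for a in attrs) for r in records)
        values, support = counts.most_common(1)[0]
        if support > best_support:
            best_support, best_query = support, dict(zip(attrs, values))
    return best_support, best_query

records = [("a", "x", 1), ("a", "x", 2), ("a", "y", 1), ("b", "x", 1)]
support, query = best_conjunction(records)
print(support, query)
```

With m attributes of arity v, the brute-force space for depth-d conjunctions grows as C(m, d) · v^d, which is why an optimality-preserving pruning strategy matters.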
A New Methodology for Emergent System Identification Using Particle Swarm Optimization (PSO) and Group Method of Data Handling (GMDH)
 in GECCO 2002: Proceedings of the Genetic and Evolutionary Computation Conference
, 2002
Abstract

Cited by 4 (2 self)
A new methodology for Emergent System Identification is proposed in this paper. The new method combines self-organizing Group Method of Data Handling (GMDH) functional networks, Particle Swarm Optimization (PSO), and Genetic Programming (GP), and is effective in identifying complex dynamic systems. The focus of the paper is on how PSO is applied within the GMDH modeling framework.
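As a rough illustration of the kind of search PSO can perform inside a GMDH framework, the sketch below uses a minimal particle swarm to fit the coefficients of a polynomial partial description. The update rules, constants, and encoding are standard textbook choices and assumptions here; the paper's actual integration may differ:

```python
import numpy as np

def pso(objective, dim, n_particles=20, iters=200, seed=0):
    """Minimize `objective` over R^dim with a basic particle swarm."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(-1, 1, (n_particles, dim))        # positions
    v = np.zeros_like(x)                              # velocities
    pbest = x.copy()                                  # personal bests
    pbest_f = np.array([objective(p) for p in x])
    gbest = pbest[np.argmin(pbest_f)].copy()          # global best
    for _ in range(iters):
        r1, r2 = rng.random(x.shape), rng.random(x.shape)
        v = 0.7 * v + 1.5 * r1 * (pbest - x) + 1.5 * r2 * (gbest - x)
        x = x + v
        f = np.array([objective(p) for p in x])
        better = f < pbest_f
        pbest[better], pbest_f[better] = x[better], f[better]
        gbest = pbest[np.argmin(pbest_f)].copy()
    return gbest

# Recover the coefficients of y = 0.5*x + 0.25*x^2 by minimizing squared error.
xs = np.linspace(-1, 1, 50)
ys = 0.5 * xs + 0.25 * xs**2
sol = pso(lambda c: float(np.mean((c[0] * xs + c[1] * xs**2 - ys) ** 2)), dim=2)
```

In a GMDH setting, each particle would encode the coefficients of one candidate partial description, with the validation error as the objective.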
A Learning Algorithm for Evolving Cascade Neural Networks
 Neural Processing Letters, 17
, 2003
Abstract

Cited by 3 (1 self)
A new learning algorithm for Evolving Cascade Neural Networks (ECNNs) is described. An ECNN starts learning with one input node and then evolves by adding new inputs as well as new hidden neurons. The trained ECNN has a nearly minimal number of input and hidden neurons as well as connections. The algorithm was successfully applied to classify artifacts and normal segments in clinical electroencephalograms (EEGs). The EEG segments were visually labeled by an EEG-viewer. The trained ECNN correctly classified 96.69% of the testing segments, slightly better than a standard fully connected neural network.
Polynomial Neural Networks Learnt to Classify EEG Signals
 Advanced Study Institute on Neural Networks for Instrumentation, Measurement, and Related Industrial Applications
, 2001
Abstract

Cited by 3 (0 self)
A neural network based technique is presented, which is able to successfully extract polynomial classification rules from labeled electroencephalogram (EEG) signals. To represent the classification rules in an analytical form, we use polynomial neural networks trained by a modified Group Method of Data Handling (GMDH). The classification rules were extracted from clinical EEG data recorded from an Alzheimer patient and from sudden-death-risk patients. The third dataset consists of EEG recordings that include normal and artifact segments. These EEG data were visually identified by medical experts. The extracted polynomial rules, verified on the testing EEG data, correctly classify 72% of the risk-group patients and 96.5% of the segments. These rules perform slightly better than standard feedforward neural networks.
Financial Modelling using Social Programming
, 2003
Abstract

Cited by 2 (1 self)
This paper introduces Social Programming for use in predicting closing stock prices. Social Programming is a new methodology for creating Complex Adaptive Functional Networks that is based on a social-psychological metaphor. Social Programming is demonstrated to be a logical extension of the Particle Swarm methodology, the Group Method of Data Handling and Cartesian Programming. The Social Programming algorithm was able to predict closing stock prices more effectively than the traditional Group Method of Data Handling. The results in this paper illustrate the potential of the Social Programming methodology for use in financial modelling.
Feature ranking derived from data mining process
, 2008
Abstract

Cited by 2 (2 self)
Abstract. Most common feature ranking methods are based on a statistical approach. This paper compares several statistical methods with a new method for feature ranking derived from the data mining process. This method ranks features depending on the percentage of child units that survived the selection process. A child unit is a processing element transforming the parent input features to the output. After training, units are interconnected in a feedforward hybrid neural network called GAME. The selection process is realized by means of a niching genetic algorithm, in which units connected to the least significant features starve and fade from the population. Parameters of the new feature ranking algorithm are investigated, and a comparison among different methods is presented on well-known real-world and artificial data sets.
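The ranking idea described here — score each input feature by how many surviving units remain wired to it after selection — can be sketched as follows. The data structures and names are hypothetical illustrations, not the GAME network's internals:

```python
from collections import Counter

def rank_features(surviving_units, n_features):
    """Rank feature indices by the fraction of surviving units that use
    each feature; more-used features are considered more significant."""
    usage = Counter(f for unit in surviving_units for f in unit)
    total = len(surviving_units)
    return sorted(range(n_features), key=lambda f: -usage[f] / total)

# Each surviving unit is represented by the set of input features it reads.
units = [{0, 2}, {0, 1}, {0, 2}, {2}]
print(rank_features(units, 3))  # → [0, 2, 1]  (usage: 0→3, 2→3, 1→1)
```

Features that the niching genetic algorithm starved out appear in few or no surviving units and therefore fall to the bottom of the ranking.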