The WEKA Data Mining Software: An Update
Cited by 175 (6 self)
More than twelve years have elapsed since the first public release of WEKA. In that time, the software has been rewritten entirely from scratch, evolved substantially and now accompanies a text on data mining [35]. These days, WEKA enjoys widespread acceptance in both academia and business, has an active community, and has been downloaded more than 1.4 million times since being placed on SourceForge in April 2000. This paper provides an introduction to the WEKA workbench, reviews the history of the project, and, in light of the recent 3.6 stable release, briefly discusses what has been added since the last stable version (Weka 3.4) released in 2003.
Logistic Model Trees, 2006
Cited by 62 (2 self)
Tree induction methods and linear models are popular techniques for supervised learning tasks, both for the prediction of nominal classes and numeric values. For predicting numeric quantities, there has been work on combining these two schemes into ‘model trees’, i.e. trees that contain linear regression functions at the leaves. In this paper, we present an algorithm that adapts this idea for classification problems, using logistic regression instead of linear regression. We use a stagewise fitting process to construct the logistic regression models that can select relevant attributes in the data in a natural way, and show how this approach can be used to build the logistic regression models at the leaves by incrementally refining those constructed at higher levels in the tree. We compare the performance of our algorithm to several other state-of-the-art learning schemes on 36 benchmark UCI datasets, and show that it produces accurate and compact classifiers.
Accurate decision trees for mining high-speed data streams - In Proc. SIGKDD, 2003
Cited by 22 (6 self)
In this paper we study the problem of constructing accurate decision tree models from data streams. Data streams are incremental tasks that require incremental, online, and any-time learning algorithms. One of the most successful algorithms for mining data streams is VFDT. In this paper we extend the VFDT system in two directions: the ability to deal with continuous data and the use of more powerful classification techniques at tree leaves. The proposed system, VFDTc, can incorporate and classify new information online, with a single scan of the data, in constant time per example. The most relevant property of our system is the ability to obtain a performance similar to a standard decision tree algorithm even for medium-sized datasets. This is relevant due to the any-time property. We study the behaviour of VFDTc in different problems and demonstrate its utility in large and medium data sets. Under a bias-variance analysis we observe that VFDTc, in comparison to C4.5, is able to reduce the variance component.
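The abstract above does not say how VFDT decides when to split a leaf. As commonly described in the original VFDT work, the decision rests on the Hoeffding bound; the sketch below illustrates that general idea only. The function names and the example numbers are assumptions made for illustration, and nothing here reproduces the VFDTc extensions for continuous attributes or leaf classifiers.

```python
import math


def hoeffding_bound(value_range, delta, n):
    """Return epsilon such that, with probability 1 - delta, the true mean of a
    variable with the given range lies within epsilon of the mean observed
    over n independent samples."""
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


def should_split(best_gain, second_gain, value_range, delta, n):
    """Split a leaf once the observed gap between the two best attributes
    exceeds the Hoeffding bound for the n examples seen at that leaf.
    (Hypothetical helper; tie-breaking thresholds are omitted.)"""
    return (best_gain - second_gain) > hoeffding_bound(value_range, delta, n)
```

Because the bound shrinks as the number of examples at a leaf grows, a leaf that cannot be split confidently now may become splittable after more of the stream has been seen, which is what lets such a tree be built in a single scan.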
Robot Reinforcement Learning using EEG-based reward signals
Cited by 2 (2 self)
Reinforcement learning algorithms have been successfully applied in robotics to learn how to solve tasks based on reward signals obtained during task execution. These reward signals are usually modeled by the programmer or provided by supervision. However, there are situations in which this reward is hard to encode, and so would require a supervised approach to reinforcement learning, where a user directly types the reward on each trial. This paper proposes to use brain activity recorded by an EEG-based BCI system as reward signals. The idea is to obtain the reward from the activity generated while observing the robot solving the task. This process does not require an explicit model of the reward signal. Moreover, it is possible to capture subjective aspects which are specific to each user. To achieve this, we designed a new protocol to use brain activity related to the correct or wrong execution of the task. We showed that it is possible to detect and classify different levels of error in single trials. We also showed that it is possible to apply reinforcement learning algorithms to learn new similar tasks using the rewards obtained from brain activity.
EMPRR: A High-dimensional EM-Based Piecewise Regression Algorithm, 2003
Cited by 1 (0 self)
We propose a novel general piecewise surface regression model that allows for arbitrary functions to be used in each piece, and arbitrary boundary surfaces between pieces. We also give an EM-based algorithm for this model, EMPRR, that scales to high dimensions. We compare EMPRR’s performance with those of model trees and functional trees, two regression tree learning methods, on synthetic piecewise data and benchmark data sets. Our results show that EMPRR outperforms the other two methods on the synthetic data sets and performs competitively on the benchmark data sets while generating accurate and much more compact models.
Acknowledgements: I would like to thank my advisor Prof. Stephen Scott first of all for making me a computer scientist. He took a risk in hiring a student with a biotechnology background as a research assistant in computer science, and I hope I have justified his decision in the past two and a half years. He has been more a friend to me than an advisor, making sure that things are fine for me during my stay in Lincoln. I would also like to thank him for his invaluable guidance throughout my research and when I was writing
Model Selection in Omnivariate Decision Trees
Cited by 1 (1 self)
We propose an omnivariate decision tree architecture which contains univariate, multivariate linear or nonlinear nodes, matching the complexity of the node to the complexity of the data reaching that node. We compare the use of different model selection techniques including AIC, BIC, and CV to choose between the three types of nodes on standard datasets from the UCI repository and see that such omnivariate trees with a small percentage of multivariate nodes close to the root generalize better than pure trees with the same type of node everywhere. CV produces simpler trees than AIC and BIC without sacrificing expected error. The only disadvantage of CV is its longer training time.
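To make the comparison of criteria more concrete, here is a minimal sketch of choosing a node type by AIC or BIC. It uses the standard textbook definitions (AIC = 2k - 2 ln L, BIC = k ln n - 2 ln L); the abstract does not state how the paper computes these for tree nodes, so the helper names, the candidate dictionary, and the example figures are all assumptions for illustration.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike information criterion: lower is better."""
    return 2.0 * n_params - 2.0 * log_likelihood


def bic(log_likelihood, n_params, n_samples):
    """Bayesian information criterion: lower is better."""
    return n_params * math.log(n_samples) - 2.0 * log_likelihood


def select_node_type(candidates, n_samples, criterion="bic"):
    """Pick the candidate with the lowest criterion value.

    candidates maps a node-type name to (log_likelihood, n_params),
    both measured on the data reaching the node (hypothetical layout).
    """
    def score(item):
        log_likelihood, n_params = item[1]
        if criterion == "aic":
            return aic(log_likelihood, n_params)
        return bic(log_likelihood, n_params, n_samples)

    return min(candidates.items(), key=score)[0]


# Made-up figures for a node reached by 100 examples.
candidates = {
    "univariate": (-70.0, 2),
    "linear": (-58.0, 11),
    "nonlinear": (-55.0, 60),
}
```

With these made-up figures, AIC prefers the linear node while BIC, which penalizes parameters more heavily when only 100 examples reach the node, prefers the univariate one.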
A Cooperative Approach for Handshake Detection based on Body Sensor Networks
The handshake gesture is an important part of the social etiquette in many cultures. It lies at the core of many human interactions, either in formal or informal settings: exchanging greetings, offering congratulations, and finalizing a deal are all activities that typically either start or finish with a handshake. The automated detection of a handshake can enable a wide range of pervasive computing scenarios; in particular, different types of information can be exchanged and processed among the handshaking persons, depending on the physical/logical contexts where they are located and on their mutual acquaintance. This paper proposes a novel handshake detection system based on body sensor networks consisting of a resource-constrained wrist-wearable sensor node and a more capable base station. The system uses an effective collaboration technique among the body sensor networks of the handshaking persons, which minimizes errors associated with the application of classification algorithms and improves the overall accuracy in terms of the number of false positives and false negatives.
Workshop on Applications of Pattern Analysis: Cross-associating unlabelled timbre distributions to create expressive musical mappings
22 Evolutionary Algorithms in Decision Tree Induction
One of the biggest problems that many data analysis techniques have to deal with nowadays is Combinatorial Optimization, which, in the past, has led many methods to be set aside. The higher (though still insufficient) computing power now available makes it possible to apply such techniques within certain bounds. Since other research fields like Artificial
Development of Discriminant Analysis and Majority-Voting Based Credit Risk Assessment Classifier
This article presents research on a method for credit risk evaluation that combines an expert majority-based ensemble voting scheme with discriminant analysis as the basis for expert formation and popular machine learning techniques for classification, such as decision trees, rule-based inducers and neural networks. Both single-expert and multiple-expert evaluations were applied as the basis for forming output classes dynamically. Feature selection was applied using a correlation-based feature subset evaluator with tabu search. The experimental results form a basis for further research on similar methods.
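As a rough illustration of majority-based voting in general, the sketch below combines the class labels proposed by several experts into one decision. It is not the article's scheme: the function name, the tie-breaking rule (the earliest-listed expert among the tied labels wins), and the example labels are assumptions.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the label proposed by the most experts.

    predictions is a non-empty list of labels, one per expert.
    Ties are broken in favour of the label that appears first in the list
    (an assumption made for this sketch).
    """
    if not predictions:
        raise ValueError("at least one prediction is required")
    counts = Counter(predictions)
    top = max(counts.values())
    for label in predictions:
        if counts[label] == top:
            return label


# Three hypothetical experts assess one credit applicant.
decision = majority_vote(["good", "bad", "good"])
```

An odd number of experts avoids ties in the two-class case, which is one reason small ensembles are often built with three or five members.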

