Results 1–10 of 24
An introduction to variable and feature selection
Journal of Machine Learning Research, 2003
Cited by 688 (14 self)
Variable and feature selection have become the focus of much research in areas of application for which datasets with tens or hundreds of thousands of variables are available.
Relevance Feedback using Support Vector Machines
In Proceedings of the 18th International Conference on Machine Learning, 2001
"... We show that support vector machines ..."
A unifying framework for computational reinforcement learning theory
2009
Cited by 18 (6 self)
Computational learning theory studies mathematical models that allow one to formally analyze and compare the performance of supervised-learning algorithms, such as their sample complexity. While existing models such as PAC (Probably Approximately Correct) have played an influential role in understanding the nature of supervised learning, they have not been as successful in reinforcement learning (RL). Here, the fundamental barrier is the need for active exploration in sequential decision problems. An RL agent tries to maximize long-term utility by exploiting its knowledge about the problem, but this knowledge has to be acquired by the agent itself through exploring the problem, which may reduce short-term utility. The need for active exploration is common in many problems in daily life, engineering, and the sciences. For example, a Backgammon program strives to make good moves to maximize the probability of winning a game, but sometimes it may try novel and possibly harmful moves to discover how the opponent reacts, in the hope of discovering a better game-playing strategy. It has been known since the early days of RL that a good trade-off between exploration and exploitation is critical for the agent to learn fast (i.e., to reach near-optimal strategies ...
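The exploration/exploitation trade-off this abstract describes can be illustrated with a minimal epsilon-greedy bandit sketch. This is hypothetical code, not from the paper; the arm success probabilities, epsilon, and step count are illustrative assumptions:

```python
import random

def epsilon_greedy_bandit(true_means, epsilon=0.1, steps=2000, seed=0):
    """Run epsilon-greedy on a Bernoulli multi-armed bandit.

    With probability epsilon the agent explores (a random arm, possibly
    sacrificing short-term reward); otherwise it exploits the arm with
    the highest estimated mean reward.
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    counts = [0] * n_arms        # pulls per arm
    estimates = [0.0] * n_arms   # running mean reward per arm
    total_reward = 0.0
    for _ in range(steps):
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)                           # explore
        else:
            arm = max(range(n_arms), key=lambda a: estimates[a])  # exploit
        reward = 1.0 if rng.random() < true_means[arm] else 0.0
        counts[arm] += 1
        # Incremental update of the running mean for the pulled arm.
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        total_reward += reward
    return estimates, total_reward
```

The occasional random pull is exactly the short-term cost the abstract mentions: the agent forgoes its current best estimate to gather information that may reveal a better arm.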
On real Turing machines that toss coins
1995
Cited by 14 (8 self)
In this paper we consider real counterparts of classical probabilistic complexity classes in the framework of real Turing machines as introduced by Blum, Shub, and Smale [2]. We give an extension of the well-known "BPP ⊆ P/poly" result from discrete complexity theory to a very general setting in the real number model. This result holds for real inputs, real outputs, and random elements drawn from an arbitrary probability distribution over ℝ^m. Then we turn to the study of Boolean parts, that is, classes of languages of zero-one vectors accepted by real machines. In particular, we show that the classes BPP, PP, PH, and PSPACE are not enlarged by allowing the use of real constants and arithmetic at unit cost, provided we restrict branching to equality tests.
Using Finite Automata to Mine Execution Data for Intrusion Detection: a Preliminary Report
In Recent Advances in Intrusion Detection (RAID), 2000
Cited by 12 (1 self)
The use of program execution traces to detect intrusions has proven to be a successful strategy. Existing systems that employ this approach are anomaly detectors, meaning that they model a program's normal behavior and signal deviations from that behavior. Unfortunately, many program-based exploits of NT systems use specialized malicious executables. Anomaly detection systems cannot deal with such programs because there is no standard of "normalcy" that they deviate from.
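The anomaly-detection idea here, modeling normal traces and flagging deviations, can be sketched with a fixed-window n-gram model over system-call sequences. This is an illustrative stand-in in the spirit of sequence-based detectors, not the finite-automaton construction the paper describes; the call names and window length are made up:

```python
def train_ngrams(traces, n=3):
    """Collect the set of length-n subsequences seen in normal traces."""
    normal = set()
    for trace in traces:
        for i in range(len(trace) - n + 1):
            normal.add(tuple(trace[i:i + n]))
    return normal

def anomaly_score(trace, normal, n=3):
    """Fraction of the trace's n-grams never seen during training.

    A score of 0.0 means every window matched normal behavior; scores
    near 1.0 signal a trace unlike anything seen before.
    """
    grams = [tuple(trace[i:i + n]) for i in range(len(trace) - n + 1)]
    if not grams:
        return 0.0
    return sum(g not in normal for g in grams) / len(grams)
```

As the abstract notes, such a detector is blind to a wholly malicious executable: there is no "normal" trace for it to deviate from in the first place.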
SemiBoost: Boosting for Semi-Supervised Learning
IEEE Trans. Pattern Analysis and Machine Intelligence, 2009
Cited by 12 (2 self)
Semi-supervised learning has attracted a significant amount of attention in machine learning. Most previous studies have focused on designing special algorithms to effectively exploit the unlabeled data. Our goal is to improve the classification accuracy of any given supervised learning algorithm by using the available unlabeled examples. This problem is particularly important when we need to train a hand-crafted, supervised learning algorithm with a limited number of labeled examples and a multitude of unlabeled examples. We present a boosting framework for semi-supervised learning, termed SemiBoost. Our empirical study on 21 different datasets demonstrates that the proposed framework is effective for improving the performance of several supervised learning algorithms given a large number of unlabeled examples. We also show that our algorithm, SemiBoost, often outperforms state-of-the-art semi-supervised learning algorithms.
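SemiBoost itself combines pairwise similarity with classifier confidence; as a much simpler illustration of the same goal, improving a supervised learner with unlabeled examples, here is a self-training sketch around a one-dimensional nearest-centroid base learner. The base learner, the data, and the loop are illustrative assumptions, not the paper's algorithm:

```python
def nearest_centroid_fit(points, labels):
    """Fit a nearest-centroid classifier on 1-D points: one mean per class."""
    sums, counts = {}, {}
    for x, y in zip(points, labels):
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {y: sums[y] / counts[y] for y in sums}

def self_train(labeled_x, labeled_y, unlabeled_x, rounds=3):
    """Repeatedly pseudo-label the unlabeled points with the current
    model and refit on labeled + pseudo-labeled data (self-training,
    a simple stand-in for exploiting unlabeled examples)."""
    x, y = list(labeled_x), list(labeled_y)
    for _ in range(rounds):
        centroids = nearest_centroid_fit(x, y)
        pseudo = [min(centroids, key=lambda c: abs(u - centroids[c]))
                  for u in unlabeled_x]
        x = list(labeled_x) + list(unlabeled_x)
        y = list(labeled_y) + pseudo
    return nearest_centroid_fit(x, y)
```

With two labeled seeds and four unlabeled points, the centroids shift toward where the unlabeled mass actually lies, which is the benefit the abstract claims for exploiting unlabeled data.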
Maximum Margin Coresets for Active and Noise Tolerant Learning
Proc. of the International Joint Conference on Artificial Intelligence (IJCAI), 2006
Cited by 12 (0 self)
We study the problem of learning large margin halfspaces in various settings using coresets, showing that coresets are a widely applicable tool for large margin learning. A large margin coreset is a subset of the input data sufficient for approximating the true maximum margin solution. In this work, we provide a direct algorithm and analysis for constructing large margin coresets. We show various applications, including a novel coreset-based analysis of large margin active learning and a polynomial time (in the number of input points and the amount of noise) algorithm for agnostic learning in the presence of outlier noise. We also highlight a simple extension to multiclass classification problems and structured output learning.
Noise Injection: theoretical prospects
1997
Cited by 11 (1 self)
Noise Injection consists of adding noise to the inputs during neural network training. Experimental results suggest that it might improve the generalization ability of the resulting neural network. A justification of this improvement remains elusive: first, describing the average perturbed cost function analytically is difficult; second, controlling the fluctuations of the random perturbed cost function is hard. Hence, recent papers suggest replacing the random perturbed cost with a (deterministic) Taylor approximation of the average perturbed cost function. This paper takes a different stance: when the injected noise is Gaussian, Noise Injection is naturally connected to the action of the heat kernel. This indicates the domain of validity of traditional Taylor expansions, and shows how the quality of Taylor approximations depends on global smoothness properties of the neural networks under consideration. The connection between Noise Injection and the heat kernel also makes it possible to control the fluctuations of the random perturbed cost function. Under the previously mentioned global smoothness assumption, tools from Gaussian analysis provide bounds on the tail behavior of the perturbed cost. This finally suggests that mixing input perturbation with smoothness-based penalization might be profitable.
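The basic procedure, adding Gaussian noise to each input before the gradient step, can be sketched for a one-dimensional linear model in place of a neural network. The model, learning rate, noise level, and data are illustrative assumptions, not the paper's setup:

```python
import random

def train_with_noise_injection(xs, ys, sigma=0.1, lr=0.01, epochs=500, seed=0):
    """Fit y = w*x + b by per-sample gradient descent, injecting
    Gaussian noise into the input at every training step."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            x_noisy = x + rng.gauss(0.0, sigma)   # noise injection
            err = (w * x_noisy + b) - y           # prediction error
            w -= lr * err * x_noisy               # gradient step on w
            b -= lr * err                         # gradient step on b
    return w, b
```

On noiseless data generated by y = 2x + 1, the fit lands near (2, 1); the injected input noise slightly shrinks the slope, the kind of averaged perturbation effect the Taylor-expansion analyses try to characterize.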
Spectral Algorithms for Supervised Learning
2007
Cited by 11 (4 self)
We discuss how a large class of regularization methods, collectively known as spectral regularization and originally designed for solving ill-posed inverse problems, gives rise to regularized learning algorithms. All these algorithms are consistent kernel methods that can be easily implemented. The intuition behind their derivation is that the same principle that numerically stabilizes a matrix inversion problem ...
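The spectral view can be made concrete: each regularization method corresponds to a filter function applied to the eigenvalues of the kernel matrix, replacing the unstable inverse 1/sigma with a bounded approximation. A minimal sketch with two standard filters, Tikhonov and truncated SVD; the function names and the diagonal example are assumptions, not the paper's notation:

```python
def tikhonov_filter(sigma, lam):
    """Tikhonov (ridge) spectral filter: g(sigma) = 1 / (sigma + lam).

    Approximates 1/sigma for large eigenvalues but stays bounded by
    1/lam as sigma -> 0, which stabilizes the inversion.
    """
    return 1.0 / (sigma + lam)

def tsvd_filter(sigma, lam):
    """Truncated SVD filter: invert eigenvalues above the cutoff lam,
    discard the rest entirely."""
    return 1.0 / sigma if sigma >= lam else 0.0

def spectral_coefficients(eigvals, projections, filt, lam):
    """Regularized solution coefficients c_i = g_lam(sigma_i) * <v_i, y>,
    where sigma_i, v_i are kernel eigenpairs and projections holds the
    components of the targets y in the eigenbasis."""
    return [filt(s, lam) * p for s, p in zip(eigvals, projections)]
```

With an unregularized inverse, an eigenvalue of 1e-9 would blow a coefficient up by a factor of 1e9; the filtered version keeps it bounded by 1/lam.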