ROC graphs: notes and practical considerations for data mining researchers (2003)

by T Fawcett

Results 1 - 10 of 82

Large scale multiple kernel learning

by Sören Sonnenburg, Gunnar Rätsch, Bernhard Schölkopf - Journal of Machine Learning Research, 2006
Cited by 129 (13 self)
While classical kernel-based learning algorithms are based on a single kernel, in practice it is often desirable to use multiple kernels. Lanckriet et al. (2004) considered conic combinations of kernel matrices for classification, leading to a convex quadratically constrained quadratic program. We show that it can be rewritten as a semi-infinite linear program that can be efficiently solved by recycling the standard SVM implementations. Moreover, we generalize the formulation and our method to a larger class of problems, including regression and one-class classification. Experimental results show that the proposed algorithm works for hundred thousands of examples or hundreds of kernels to be combined, and helps for automatic model selection, improving the interpretability of the learning result. In a second part we discuss general speed up mechanism for SVMs, especially when used with sparse feature maps as appear for string kernels, allowing us to train a string kernel SVM on a 10 million real-world splice data set from computational biology. We integrated multiple kernel learning in our machine learning toolbox SHOGUN for which the source code is publicly available at

A Simple Relational Classifier

by Sofus A. Macskassy, Foster Provost - Proceedings of the Second Workshop on Multi-Relational Data Mining (MRDM-2003) at KDD-2003, 2003
Cited by 58 (13 self)
We analyze a Relational Neighbor (RN) classifier, a simple relational predictive model that predicts only based on class labels of related neighbors, using no learning and no inherent attributes. We show that it performs surprisingly well by comparing it to more complex models such as Probabilistic Relational Models and Relational Probability Trees on three data sets from published work.
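The idea of predicting only from the class labels of related neighbors can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes an undirected, unweighted edge list, and the names `rn_class_scores`, `rn_predict`, `edges`, and `labels` are hypothetical.

```python
from collections import Counter


def rn_class_scores(node, edges, labels):
    """Score each class by the fraction of the node's labeled neighbors that carry it.

    Assumption: edges are undirected and unweighted (illustrative simplification).
    """
    counts = Counter()
    for a, b in edges:
        if a == node and b in labels:
            counts[labels[b]] += 1
        elif b == node and a in labels:
            counts[labels[a]] += 1
    total = sum(counts.values())
    if total == 0:
        return {}  # no labeled neighbors, so no evidence
    return {cls: n / total for cls, n in counts.items()}


def rn_predict(node, edges, labels):
    """Return the highest-scoring class, or None when no neighbor is labeled."""
    scores = rn_class_scores(node, edges, labels)
    return max(scores, key=scores.get) if scores else None


# Toy graph: "a" has two "pos" neighbors and one "neg" neighbor.
edges = [("a", "b"), ("a", "c"), ("a", "d"), ("d", "e")]
labels = {"b": "pos", "c": "pos", "d": "neg"}
print(rn_predict("a", edges, labels))
print(rn_predict("e", edges, labels))
```

No parameters are fitted here and no attributes of the node itself are consulted, which matches the "no learning and no inherent attributes" description in the abstract.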

From uncertainty to belief: Inferring the specification within

by Ted Kremenek, Paul Twohey, Godmar Back, Andrew Ng, Dawson Engler - In Proceedings of the Seventh Symposium on Operating Systems Design and Implementation, 2006
Cited by 44 (0 self)
Automatic tools for finding software errors require a set of specifications before they can check code: if they do not know what to check, they cannot find bugs. This paper presents a novel framework based on factor graphs for automatically inferring specifications directly from programs. The key strength of the approach is that it can incorporate many disparate sources of evidence, allowing us to squeeze significantly more information from our observations than previously published techniques. We illustrate the strengths of our approach by applying it to the problem of inferring what functions in C programs allocate and release resources. We evaluated its effectiveness on five codebases: SDL, OpenSSH, GIMP, and the OS kernels for Linux and Mac OS X (XNU). For each codebase, starting with zero initially provided annotations, we observed an inferred annotation accuracy of 80-90%, with often near perfect accuracy for functions called as little as five times. Many of the inferred allocator and deallocator functions are functions for which we both lack the implementation and are rarely called, in some cases functions with at most one or two callsites. Finally, with the inferred annotations we quickly found both missing and incorrect properties in a specification used by a commercial static bug-finding tool.

Is bottom-up attention useful for object recognition

by Ueli Rutishauser, Dirk Walther, Christof Koch, Pietro Perona - In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2004
Cited by 41 (5 self)
A key problem in learning multiple objects from unlabeled images is that it is a priori impossible to tell which part of the image corresponds to each individual object, and which part is irrelevant clutter which is not associated to the objects. We investigate empirically to what extent pure bottom-up attention can extract useful information about the location, size and shape of objects from images and demonstrate how this information can be utilized to enable unsupervised learning of objects from unlabeled images. Our experiments demonstrate that the proposed approach to using bottom-up attention is indeed useful for a variety of applications.

Improving Accuracy and Cost of Two-Class and Multi-Class Probabilistic Classifiers Using ROC Curves

by Nicolas Lachiche, Peter Flach - ICML-2003, 2003
Cited by 33 (3 self)
The probability estimates of a naive Bayes classifier are inaccurate if some of its underlying independence assumptions are violated. The decision criterion for using these estimates for classification therefore has to be learned from the data. This

Properties and Benefits of Calibrated Classifiers

by Ira Cohen, Moises Goldszmidt - In 8th European Conference on Principles and Practice of Knowledge Discovery in Databases (PKDD), 2004
Cited by 22 (4 self)
A calibrated classifier provides reliable estimates of the true probability that each test sample is a member of the class of interest.
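One common way to check this property is to group predictions into probability bins and compare the mean predicted probability in each bin with the observed fraction of positives. The sketch below is illustrative only and is not taken from the paper; the function name `reliability_table`, the equal-width binning, and the toy data are assumptions.

```python
def reliability_table(probs, labels, n_bins=5):
    """Return (mean predicted probability, observed positive fraction, count) per non-empty bin.

    Assumptions: probs lie in [0, 1], labels are 0 or 1, bins have equal width.
    """
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 falls into the last bin
        bins[idx].append((p, y))
    rows = []
    for members in bins:
        if not members:
            continue
        mean_p = sum(p for p, _ in members) / len(members)
        frac_pos = sum(y for _, y in members) / len(members)
        rows.append((mean_p, frac_pos, len(members)))
    return rows


# Toy data: for a well-calibrated classifier the first two columns are close in every bin.
probs = [0.1, 0.2, 0.15, 0.8, 0.9, 0.85, 0.5, 0.55]
labels = [0, 0, 1, 1, 1, 0, 1, 0]
for mean_p, frac_pos, count in reliability_table(probs, labels, n_bins=4):
    print(f"predicted={mean_p:.2f} observed={frac_pos:.2f} n={count}")
```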

Naive Bayesian Classification of Structured Data

by Peter A. Flach, Nicolas Lachiche, 2003
Cited by 18 (0 self)
In this paper we present 1BC and 1BC2, two systems that perform naive Bayesian classification of structured individuals. The approach of 1BC is to project the individuals along first-order features. These features are built from the individual using structural predicates referring to related objects (e.g. atoms within molecules), and properties applying to the individual or one or several of its related objects (e.g. a bond between two atoms). We describe an individual in terms of elementary features consisting of zero or more structural predicates and one property; these features are treated as conditionally independent in the spirit of the naive Bayes assumption. 1BC2 represents an alternative first-order upgrade to the naive Bayesian classifier by considering probability distributions over structured objects (e.g., a molecule as a set of atoms), and estimating those distributions from the probabilities of its elements (which are assumed to be independent). We present a unifying view on both systems in which 1BC works in language space, and 1BC2 works in individual space. We also present a new, efficient recursive algorithm improving upon the original propositionalisation approach of 1BC. Both systems have been implemented in the context of the first-order descriptive learner Tertius, and we investigate the differences between the two systems both in computational terms and on artificially generated data. Finally, we describe a range of experiments on ILP benchmark data sets demonstrating the viability of our approach.

Selective visual attention enables learning and recognition of multiple objects in cluttered scenes

by Dirk Walther, Ueli Rutishauser, Christof Koch, Pietro Perona - Computer Vision and Image Understanding, 2005
Cited by 17 (0 self)

Regression Error Characteristic Curves

by Jinbo Bi, Kristin P. Bennett - Proceedings of the 20th International Conference on Machine Learning, 2003
Cited by 16 (0 self)
Receiver Operating Characteristic (ROC) curves provide a powerful tool for visualizing and comparing classification results. Regression Error Characteristic (REC) curves generalize ROC curves to regression. REC curves plot the error tolerance on the x-axis versus the percentage of points predicted within the tolerance on the y-axis. The resulting curve estimates the cumulative distribution function of the error. The REC curve visually presents commonly-used statistics. The area-over-the-curve (AOC) is a biased estimate of the expected error. The R² value can be estimated using the ratio of the AOC for a given model to the AOC for the null model. Users can quickly assess the relative merits of many regression functions by examining the relative position of their REC curves. The shape of the curve reveals additional information that can be used to guide modeling.
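The construction described in this abstract can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' code: it uses absolute error as the loss, builds the curve as a step function (the empirical distribution of the errors), and the names `rec_points` and `area_over_curve` are hypothetical. With this particular step-function construction, the area over the curve equals the sample mean of the absolute errors.

```python
def rec_points(y_true, y_pred):
    """Return (tolerance, fraction of points with error <= tolerance) pairs.

    Assumption: absolute error is the loss; tied errors collapse into one point.
    """
    errors = sorted(abs(t - p) for t, p in zip(y_true, y_pred))
    n = len(errors)
    points = []
    for i, e in enumerate(errors, start=1):
        if points and points[-1][0] == e:
            points[-1] = (e, i / n)  # tie: keep the highest accuracy at this tolerance
        else:
            points.append((e, i / n))
    return points


def area_over_curve(points):
    """Integrate (1 - accuracy) over the tolerance axis, from 0 to the largest error."""
    area = 0.0
    prev_tol, prev_acc = 0.0, 0.0
    for tol, acc in points:
        area += (tol - prev_tol) * (1.0 - prev_acc)
        prev_tol, prev_acc = tol, acc
    return area


y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.5, 2.0, 2.0, 6.0]
points = rec_points(y_true, y_pred)
print(points)
print(area_over_curve(points))
```

Plotting the tolerance on the x-axis against the accuracy on the y-axis gives the REC curve; a curve that rises faster toward 1 indicates a model with smaller errors.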

A comparative study of real-valued negative selection to statistical anomaly detection techniques

by Thomas Stibor, Jonathan Timmis, Claudia Eckert - Proceedings of the 4th International Conference on Artificial Immune Systems, volume 3627 of LNCS, 2005
Cited by 15 (8 self)
The (randomized) real-valued negative selection algorithm is an anomaly detection approach, inspired by the negative selection immune system principle. The algorithm was proposed to overcome scaling problems inherent in the Hamming shape-space negative selection algorithm. In this paper, we investigate termination behavior of the real-valued negative selection algorithm with variable-sized detectors on an artificial data set. We then undertake an analysis and comparison of the classification performance on the high-dimensional KDD data set of the real-valued negative selection, a real-valued positive selection and statistical anomaly detection techniques. Results reveal that in terms of detection rate, real-valued negative selection with variable-sized detectors is not competitive to statistical anomaly detection techniques on the KDD data set. In addition, we suggest that the termination guarantee of the real-valued negative selection with variable-sized detectors is very sensitive to several parameters.