Results 11-20 of 374
Explicitly Representing Expected Cost: An Alternative to ROC Representation
 In Proceedings of the Sixth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
, 2000
Cited by 83 (11 self)
This paper proposes an alternative to ROC representation, in which the expected cost of a classifier is represented explicitly. This expected cost representation maintains many of the advantages of ROC representation, but is easier to understand. It allows the experimenter to immediately see the range of costs and class frequencies where a particular classifier is the best and quantitatively how much better it is than other classifiers. This paper demonstrates there is a point/line duality between the two representations. A point in ROC space representing a classifier becomes a line segment spanning the full range of costs and class frequencies. This duality produces equivalent operations in the two spaces, allowing most techniques used in ROC analysis to be readily reproduced in the cost space.
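The point/line duality described in this abstract can be sketched in a few lines. This is only an illustration, assuming the usual normalized formulation in which the x-axis folds class frequency and misclassification cost into a single value between 0 and 1 (called `pcf` here, a name chosen for this sketch); the function names are hypothetical and not taken from the paper.

```python
def cost_line(fp_rate, tp_rate):
    """Map one ROC point (fp_rate, tp_rate) to its line in cost space.

    Returns (intercept, slope). The line runs from the false positive
    rate at pcf = 0 to the false negative rate at pcf = 1.
    """
    fn_rate = 1.0 - tp_rate
    return fp_rate, fn_rate - fp_rate


def normalized_expected_cost(fp_rate, tp_rate, pcf):
    """Evaluate the classifier's cost line at one operating condition.

    pcf is assumed to lie in [0, 1] and to combine the positive-class
    frequency with the relative misclassification costs.
    """
    intercept, slope = cost_line(fp_rate, tp_rate)
    return intercept + slope * pcf
```

Under these assumptions, comparing two classifiers at a given operating condition reduces to comparing the heights of their two lines at that `pcf` value.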
Tree induction vs. logistic regression: A learning-curve analysis
 CEDER WORKING PAPER #IS0102, STERN SCHOOL OF BUSINESS
, 2001
Cited by 71 (16 self)
Tree induction and logistic regression are two standard, off-the-shelf methods for building models for classification. We present a large-scale experimental comparison of logistic regression and tree induction, assessing classification accuracy and the quality of rankings based on class-membership probabilities. We use a learning-curve analysis to examine the relationship of these measures to the size of the training set. The results of the study show several remarkable things. (1) Contrary to prior observations, logistic regression does not generally outperform tree induction. (2) More specifically, and not surprisingly, logistic regression is better for smaller training sets and tree induction for larger data sets. Importantly, this often holds for training sets drawn from the same domain (i.e., the learning curves cross), so conclusions about induction-algorithm superiority on a given domain must be based on an analysis of the learning curves. (3) Contrary to conventional wisdom, tree induction is effective at producing probability-based rankings, although apparently comparatively less so for a given training-set size than at making classifications. Finally, (4) the domains on which tree induction and logistic regression are ultimately preferable can be characterized surprisingly well by a simple measure of signal-to-noise ratio.
Maximum entropy modeling of short sequence motifs with applications to RNA splicing signals
 J. Comput. Biol
, 2004
Benchmarking Anomaly-Based Detection Systems
, 2000
Cited by 54 (6 self)
Anomaly detection is a key element of intrusion-detection and other detection systems in which perturbations of normal behavior suggest the presence of intentionally or unintentionally induced attacks, faults, defects, etc. Because most anomaly detectors are based on probabilistic algorithms that exploit the intrinsic structure, or regularity, embedded in data logs, a fundamental question is whether or not such structure influences detection performance. If detector performance is indeed a function of environmental regularity, it would be critical to match detectors to environmental characteristics. In intrusion-detection settings, however, this is not done, possibly because such characteristics are not easily ascertained. This paper introduces a metric for characterizing structure in data environments, and tests the hypothesis that intrinsic structure influences probabilistic detection. In a series of experiments, an anomaly-detection algorithm was applied to a benchmark suite of 165 c...
Cost curves: an improved method for visualizing classifier performance
 Machine Learning
, 2006
Cited by 49 (7 self)
This paper introduces cost curves, a graphical technique for visualizing the performance (error rate or expected cost) of 2-class classifiers over the full range of possible class distributions and misclassification costs. Cost curves are shown to be superior to ROC curves for visualizing classifier performance for most purposes. This is because they visually support several crucial types of performance assessment that cannot be done easily with ROC curves, such as showing confidence intervals on a classifier’s performance, and visualizing the statistical significance of the difference in performance of two classifiers. A software tool supporting all the cost curve analysis described in this paper is available from the authors.
Minimum cut model for spoken lecture segmentation
 In Proceedings of the Annual Meeting of the Association for Computational Linguistics (COLING/ACL 2006)
, 2006
Cited by 47 (8 self)
We consider the task of unsupervised lecture segmentation. We formalize segmentation as a graph-partitioning task that optimizes the normalized cut criterion. Our approach moves beyond localized comparisons and takes into account long-range cohesion dependencies. Our results demonstrate that global analysis improves the segmentation accuracy and is robust in the presence of speech recognition errors.
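The normalized cut criterion named in this abstract can be illustrated on a toy graph. The sketch below assumes the standard two-way definition, Ncut(A, B) = cut(A, B)/assoc(A, V) + cut(A, B)/assoc(B, V); the function name and the similarity values are made up for illustration and do not come from the paper.

```python
def normalized_cut(weights, part_a):
    """Two-way normalized cut of a graph given as a symmetric weight matrix.

    weights[i][j] is the similarity between nodes i and j (for example,
    two sentences of a lecture); part_a is the set of node indices on one
    side of the cut. Both sides are assumed to have nonzero total weight.
    """
    nodes = range(len(weights))
    a = set(part_a)
    b = set(nodes) - a
    # Total weight of edges crossing the partition.
    cut = sum(weights[i][j] for i in a for j in b)
    # Total weight of edges attached to each side.
    assoc_a = sum(weights[i][j] for i in a for j in nodes)
    assoc_b = sum(weights[i][j] for i in b for j in nodes)
    return cut / assoc_a + cut / assoc_b


# Hypothetical similarities between four consecutive sentences.
similarity = [
    [0, 3, 1, 0],
    [3, 0, 1, 0],
    [1, 1, 0, 3],
    [0, 0, 3, 0],
]
balanced = normalized_cut(similarity, {0, 1})
lopsided = normalized_cut(similarity, {0})
```

On this toy input the boundary between sentences 1 and 2 scores lower than cutting off sentence 0 alone, which is the behavior the criterion is meant to reward: it penalizes splits that isolate a small, weakly attached piece.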
BindN: a web-based tool for efficient prediction of DNA and RNA binding sites in amino acid sequences
 Nucleic Acids Res
ROC analysis of statistical methods used in functional MRI: Individual Subjects
 NeuroImage 9
, 1999
Cited by 44 (7 self)
The complicated structure of fMRI signals and associated noise sources makes it difficult to assess the validity of various steps involved in the statistical analysis of brain activation. Most methods used for fMRI analysis assume that observations are independent and that the noise can be treated as white Gaussian noise. These assumptions are usually not true, but it is difficult to assess how severely they are violated and what their practical consequences are. In this study a direct comparison is made between the power of various analytical methods used to detect activations, without reference to estimates of statistical significance. The statistics used in fMRI are treated as metrics designed to detect activations and are not interpreted probabilistically. The receiver operating characteristic (ROC) method is used to compare the efficacy of various steps in calculating an activation map in the study of a single subject, based on optimizing the ratio of the number of detected activations to the number of false-positive findings. The main findings are as follows. Preprocessing: The removal of intensity drifts and high-pass filtering applied on the voxel time-course level is beneficial to the efficacy of analysis. Temporal normalization of the global image intensity, smoothing in the temporal domain, and low-pass filtering do not improve power of analysis. Choices of statistics: The cross-correlation coefficient and t-statistic, as well as nonparametric Mann–Whitney statistics, prove to be the most effective and are similar in performance, by our criterion. Task design: The proper design of task protocols is shown to be crucial. In an alternating block design the optimal block length is approximately 18 s. Spatial clustering: An initial spatial smoothing of images is more efficient than cluster filtering of the statistical parametric activation maps. © 1999 Academic Press
Using Rule Sets to Maximize ROC Performance
, 2001
Cited by 38 (3 self)
Rules are commonly used for classification because they are modular, intelligible and easy to learn. Existing work in classification rule learning assumes the goal is to produce categorical classifications to maximize classification accuracy. Recent work in machine learning has pointed out the limitations of classification accuracy: when class distributions are skewed, or error costs are unequal, an accuracy-maximizing rule set can perform poorly. A more flexible use of a rule set is to produce instance scores indicating the likelihood that an instance belongs to a given class. With such an ability, we can apply rule sets effectively when distributions are skewed or error costs are unequal. This paper empirically investigates different strategies for evaluating rule sets when the goal is to maximize the scoring (ROC) performance.
Methods and Statistics for Combining Motif Match Scores
 Journal of Computational Biology
, 1998
Cited by 38 (3 self)
Position-specific scoring matrices are useful for representing and searching for protein sequence motifs. A sequence family can often be described by a group of one or more motifs, and an effective search must combine the scores for matching a sequence to each of the motifs in the group. We describe three methods for combining match scores and estimating the statistical significance of the combined scores and evaluate the search quality (classification accuracy) and the accuracy of the estimate of statistical significance of each. The three methods are: 1) sum of scores, 2) sum of reduced variates, 3) product of score p-values. We show that method 3) is superior to the other two methods in both regards, and that combining motif scores indeed gives better search accuracy. The MAST sequence homology search algorithm utilizing the product of p-values scoring method is available for interactive use and downloading at URL http://www.sdsc.edu/MEME. Keywords: protein sequence motifs, profiles,...
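As a rough illustration of the third method, the significance of a product of p-values can be computed in closed form. The sketch below assumes the n p-values are independent and uniformly distributed under the null hypothesis, in which case the probability of a product at least as small as x is x times the sum of (-ln x)^i / i! for i from 0 to n-1. The function name is hypothetical, and this is not the MAST implementation.

```python
import math


def combined_p_value(p_values):
    """Significance of the product of independent p-values.

    Assumes each p-value is uniform on (0, 1] under the null hypothesis
    and that the p-values are independent of one another.
    """
    n = len(p_values)
    product = math.prod(p_values)
    if product <= 0.0:
        return 0.0
    neg_log = -math.log(product)
    # Accumulate sum_{i=0}^{n-1} neg_log**i / i! term by term.
    term = 1.0
    total = 0.0
    for i in range(n):
        total += term
        term *= neg_log / (i + 1)
    return product * total
```

With a single p-value the function returns that value unchanged; with several, the result is larger than the raw product, reflecting that a small product is easier to reach by chance when more scores are multiplied together.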