Results 1  10
of
207
Large scale multiple kernel learning
 JOURNAL OF MACHINE LEARNING RESEARCH
, 2006
"... While classical kernelbased learning algorithms are based on a single kernel, in practice it is often desirable to use multiple kernels. Lanckriet et al. (2004) considered conic combinations of kernel matrices for classification, leading to a convex quadratically constrained quadratic program. We s ..."
Abstract

Cited by 340 (20 self)
 Add to MetaCart
While classical kernelbased learning algorithms are based on a single kernel, in practice it is often desirable to use multiple kernels. Lanckriet et al. (2004) considered conic combinations of kernel matrices for classification, leading to a convex quadratically constrained quadratic program. We show that it can be rewritten as a semiinfinite linear program that can be efficiently solved by recycling the standard SVM implementations. Moreover, we generalize the formulation and our method to a larger class of problems, including regression and oneclass classification. Experimental results show that the proposed algorithm works for hundred thousands of examples or hundreds of kernels to be combined, and helps for automatic model selection, improving the interpretability of the learning result. In a second part we discuss general speed up mechanism for SVMs, especially when used with sparse feature maps as appear for string kernels, allowing us to train a string kernel SVM on a 10 million realworld splice data set from computational biology. We integrated multiple kernel learning in our machine learning toolbox SHOGUN for which the source code is publicly available at
Learning from imbalanced data
 IEEE Trans. on Knowledge and Data Engineering
, 2009
"... Abstract—With the continuous expansion of data availability in many largescale, complex, and networked systems, such as surveillance, security, Internet, and finance, it becomes critical to advance the fundamental understanding of knowledge discovery and analysis from raw data to support decisionm ..."
Abstract

Cited by 257 (6 self)
 Add to MetaCart
(Show Context)
Abstract—With the continuous expansion of data availability in many largescale, complex, and networked systems, such as surveillance, security, Internet, and finance, it becomes critical to advance the fundamental understanding of knowledge discovery and analysis from raw data to support decisionmaking processes. Although existing knowledge discovery and data engineering techniques have shown great success in many realworld applications, the problem of learning from imbalanced data (the imbalanced learning problem) is a relatively new challenge that has attracted growing attention from both academia and industry. The imbalanced learning problem is concerned with the performance of learning algorithms in the presence of underrepresented data and severe class distribution skews. Due to the inherent complex characteristics of imbalanced data sets, learning from such data requires new understandings, principles, algorithms, and tools to transform vast amounts of raw data efficiently into information and knowledge representation. In this paper, we provide a comprehensive review of the development of research in learning from imbalanced data. Our focus is to provide a critical review of the nature of the problem, the stateoftheart technologies, and the current assessment metrics used to evaluate learning performance under the imbalanced learning scenario. Furthermore, in order to stimulate future research in this field, we also highlight the major opportunities and challenges, as well as potential important research directions for learning from imbalanced data. Index Terms—Imbalanced learning, classification, sampling methods, costsensitive learning, kernelbased learning, active learning, assessment metrics. Ç
Is bottomup attention useful for object recognition
 In IEEE Conference on Computer Vision and Pattern Recognition (CVPR
, 2004
"... A key problem in learning multiple objects from unlabeled images is that it is a priori impossible to tell which part of the image corresponds to each individual object, and which part is irrelevant clutter which is not associated to the objects. We investigate empirically to what extent pure bottom ..."
Abstract

Cited by 121 (7 self)
 Add to MetaCart
(Show Context)
A key problem in learning multiple objects from unlabeled images is that it is a priori impossible to tell which part of the image corresponds to each individual object, and which part is irrelevant clutter which is not associated to the objects. We investigate empirically to what extent pure bottomup attention can extract useful information about the location, size and shape of objects from images and demonstrate how this information can be utilized to enable unsupervised learning of objects from unlabeled images. Our experiments demonstrate that the proposed approach to using bottomup attention is indeed useful for a variety of applications. 1.
A Simple Relational Classifier
 Proceedings of the Second Workshop on MultiRelational Data Mining (MRDM2003) at KDD2003
, 2003
"... We analyze a Relational Neighbor (RN) classifier, a simple relational predictive model that predicts only based on class labels of related neighbors, using no learning and no inherent attributes. We show that it performs surprisingly well by comparing it to more complex models such as Probabilist ..."
Abstract

Cited by 110 (12 self)
 Add to MetaCart
(Show Context)
We analyze a Relational Neighbor (RN) classifier, a simple relational predictive model that predicts only based on class labels of related neighbors, using no learning and no inherent attributes. We show that it performs surprisingly well by comparing it to more complex models such as Probabilistic Relational Models and Relational Probability Trees on three data sets from published work.
From uncertainty to belief: Inferring the specification within
 In ”Proceedings of the Seventh Symposium on Operating Systems Design and Implemetation
, 2006
"... Automatic tools for finding software errors require a set of specifications before they can check code: if they do not know what to check, they cannot find bugs. This paper presents a novel framework based on factor graphs for automatically inferring specifications directly from programs. The key st ..."
Abstract

Cited by 82 (0 self)
 Add to MetaCart
(Show Context)
Automatic tools for finding software errors require a set of specifications before they can check code: if they do not know what to check, they cannot find bugs. This paper presents a novel framework based on factor graphs for automatically inferring specifications directly from programs. The key strength of the approach is that it can incorporate many disparate sources of evidence, allowing us to squeeze significantly more information from our observations than previously published techniques. We illustrate the strengths of our approach by applying it to the problem of inferring what functions in C programs allocate and release resources. We evaluated its effectiveness on five codebases: SDL, OpenSSH, GIMP, and the OS kernels for Linux and Mac OS X (XNU). For each codebase, starting with zero initially provided annotations, we observed an inferred annotation accuracy of 8090%, with often near perfect accuracy for functions called as little as five times. Many of the inferred allocator and deallocator functions are functions for which we both lack the implementation and are rarely called — in some cases functions with at most one or two callsites. Finally, with the inferred annotations we quickly found both missing and incorrect properties in a specification used by a commercial static bugfinding tool. 1
Selective visual attention enables learning and recognition of multiple objects in cluttered scenes
 Computer Vision and Image Understanding
, 2005
"... multiple objects in cluttered scenes ..."
(Show Context)
Improving Accuracy and Cost of TwoClass and MultiClass Probabilistic Classifiers Using ROC Curves
 ICML2003
, 2003
"... The probability estimates of a naive Bayes classifier are inaccurate if some of its underlying independence assumptions are violated. The decision criterion for using these estimates for classification therefore has to be learned from the data. This ..."
Abstract

Cited by 54 (6 self)
 Add to MetaCart
The probability estimates of a naive Bayes classifier are inaccurate if some of its underlying independence assumptions are violated. The decision criterion for using these estimates for classification therefore has to be learned from the data. This
Assessing semantic similarity measures for the characterization of human regulatory pathways
 Bioinformatics
, 2006
"... doi:10.1093/bioinformatics/btl042 ..."
(Show Context)
Optimized cutting plane algorithm for support vector machines
 IN ICML
, 2008
"... We have developed a new Linear Support Vector Machine (SVM) training algorithm called OCAS. Its computational effort scales linearly with the sample size. In an extensive empirical evaluation OCAS significantly outperforms current state of the art SVM solvers, like SVM light, SVM perf and BMRM, achi ..."
Abstract

Cited by 43 (3 self)
 Add to MetaCart
We have developed a new Linear Support Vector Machine (SVM) training algorithm called OCAS. Its computational effort scales linearly with the sample size. In an extensive empirical evaluation OCAS significantly outperforms current state of the art SVM solvers, like SVM light, SVM perf and BMRM, achieving speedups of over 1,000 on some datasets over SVM light and 20 over SVM perf, while obtaining the same precise Support Vector solution. OCAS even in the early optimization steps shows often faster convergence than the so far in this domain prevailing approximative methods SGD and Pegasos. Effectively parallelizing OCAS we were able to train on a dataset of size 15 million examples (itself about 32GB in size) in just 671 seconds — a competing string kernel SVM required 97,484 seconds to train on 10 million examples subsampled from this dataset.
Learning causal Bayesian network structures from experimental data
, 2006
"... We propose a method for the computational inference of directed acyclic graphical structures given data from experimental interventions. Orderspace MCMC, equienergy sampling, importance weighting and streambased computation are combined to create a fast algorithm for learning causal Bayesian ne ..."
Abstract

Cited by 37 (1 self)
 Add to MetaCart
(Show Context)
We propose a method for the computational inference of directed acyclic graphical structures given data from experimental interventions. Orderspace MCMC, equienergy sampling, importance weighting and streambased computation are combined to create a fast algorithm for learning causal Bayesian network structures.