Second Order Cone Programming Approaches for Handling Missing and Uncertain Data
Journal of Machine Learning Research, 2006
Cited by 22 (6 self)
We propose a novel second order cone programming formulation for designing robust classifiers which can handle uncertainty in observations. Similar formulations are also derived for designing regression functions which are robust to uncertainties in the regression setting. The proposed formulations are independent of the underlying distribution, requiring only the existence of second order moments. These formulations are then specialized to the case of missing values in observations for both classification and regression problems. Experiments show that the proposed formulations outperform imputation.
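To make the idea of a distribution-free constraint concrete, the sketch below checks a single robust margin constraint in plain Python. It assumes the commonly used one-sided Chebyshev form: requiring the margin condition to hold with probability at least kappa for every distribution with mean mu and covariance Sigma becomes y(w.mu + b) >= 1 - xi + sqrt(kappa/(1 - kappa)) * sqrt(w' Sigma w), which is a second order cone constraint in w. The function name, argument names, and numbers are illustrative assumptions and are not taken from the paper.

```python
import math


def robust_margin_ok(w, b, mu, sigma, y, kappa, xi=0.0):
    """Check y*(w.mu + b) >= 1 - xi + gamma * sqrt(w' Sigma w).

    Assumed form: gamma = sqrt(kappa / (1 - kappa)), the one-sided
    Chebyshev multiplier for a success probability of at least kappa.
    Only the mean `mu` and covariance `sigma` of the input are used.
    """
    if not 0.0 <= kappa < 1.0:
        raise ValueError("kappa must lie in [0, 1)")
    gamma = math.sqrt(kappa / (1.0 - kappa))
    n = len(w)
    # Score of the mean observation.
    mean_score = sum(wi * mi for wi, mi in zip(w, mu)) + b
    # Quadratic form w' Sigma w, i.e. the variance of the score.
    variance = sum(w[i] * sigma[i][j] * w[j] for i in range(n) for j in range(n))
    spread = math.sqrt(max(variance, 0.0))
    return y * mean_score >= 1.0 - xi + gamma * spread


# Hypothetical 2-D example: the same classifier passes at a modest
# confidence level and fails at a stricter one.
w, b = [1.0, 0.0], 0.0
mu = [2.0, 0.0]
sigma = [[0.25, 0.0], [0.0, 1.0]]
print(robust_margin_ok(w, b, mu, sigma, y=1, kappa=0.5))
print(robust_margin_ok(w, b, mu, sigma, y=1, kappa=0.9))
```

Raising kappa enlarges the multiplier on the covariance term, so noisier observations need a larger margin before the constraint is satisfied.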
Selected topics in robust convex optimization
Math. Prog. B, this issue, 2007
Cited by 8 (2 self)
Robust Optimization is a rapidly developing methodology for handling optimization problems affected by non-stochastic “uncertain-but-bounded” data perturbations. In this paper, we overview several selected topics in this popular area, specifically, (1) recent extensions of the basic concept of robust counterpart of an optimization problem with uncertain data, (2) tractability of robust counterparts, (3) links between RO and traditional chance constrained settings of problems with stochastic data, and (4) a novel generic application of the RO methodology in Robust Linear Control.
Keywords: optimization under uncertainty · robust optimization · convex programming · chance constraints · robust linear control
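As a small illustration of what a tractable robust counterpart can look like, the sketch below treats one linear constraint a.x <= b whose coefficients are only known to lie in intervals. Under this interval (box) model, which is an assumption chosen here for simplicity and is only one of the uncertainty sets such a survey might cover, the worst case over all admissible coefficients has the closed form a_nominal.x + sum_i radius_i * |x_i|. All function and variable names are hypothetical.

```python
from itertools import product


def worst_case_lhs(x, a_nominal, a_radius):
    """Closed-form worst case of a.x when each a_i lies in
    [a_nominal_i - a_radius_i, a_nominal_i + a_radius_i]."""
    nominal = sum(a * xi for a, xi in zip(a_nominal, x))
    protection = sum(r * abs(xi) for r, xi in zip(a_radius, x))
    return nominal + protection


def brute_force_worst_case(x, a_nominal, a_radius):
    """Same quantity by enumerating the corners of the box; a linear
    function of `a` attains its maximum over a box at a corner."""
    corners = product(*[(a - r, a + r) for a, r in zip(a_nominal, a_radius)])
    return max(sum(a * xi for a, xi in zip(corner, x)) for corner in corners)


def robustly_feasible(x, a_nominal, a_radius, b):
    """True if a.x <= b holds for every coefficient vector in the box."""
    return worst_case_lhs(x, a_nominal, a_radius) <= b


# Hypothetical data: the closed form and the enumeration agree.
x = [1.0, -2.0]
a_nominal = [1.0, 1.0]
a_radius = [0.5, 0.25]
print(worst_case_lhs(x, a_nominal, a_radius))
print(brute_force_worst_case(x, a_nominal, a_radius))
```

The closed form replaces infinitely many constraints with a single convex one, which is the sense in which the robust counterpart stays tractable in this simple case.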
Coupling Feature Selection and Machine Learning Methods for Navigational Query Identification
2006
Cited by 2 (0 self)
Identifying navigational queries in Web search is important yet hard because Web queries are typically very short and carry little information. In this paper we study several machine learning methods for navigational query identification in Web search, including the naive Bayes model, the maximum entropy model, the support vector machine (SVM), and the stochastic gradient boosting tree (SGBT). To boost the performance of these machine learning techniques, we exploit several feature selection methods and propose coupling feature selection with classification approaches to achieve the best performance. Unlike most prior work, which uses a small number of features, we study the problem of identifying navigational queries with thousands of available features, extracted from major commercial search engine results, Web search user click data, query logs, and the whole Web’s relational content. A multi-level feature extraction system is constructed. Our results on real search data show that (1) among all the features we tested, user click distribution features are the most important set of features for identifying navigational queries; (2) to achieve good performance, machine learning approaches have to be coupled with good feature selection methods, and we find that gradient boosting tree coupled with linear SVM feature selection is most effective; and (3) with carefully coupled feature selection and classification approaches, navigational queries can be accurately identified with an 88.1% F1 score, which is a 33% error rate reduction compared to the best uncoupled system and a 40% error rate reduction compared to a well-tuned system without feature selection.
Robustness, Risk, and Regularization in Support Vector Machines
We consider two new formulations for classification problems in the spirit of support vector machines based on robust optimization. Our formulations are designed to build in protection against noise and control overfitting, but without being overly conservative. Our first formulation allows the noise between different samples to be correlated. We show that the standard norm-regularized support vector machine classifier is a solution to a special case of our first formulation, thus providing an explicit link between regularization and robustness in pattern classification. Our second formulation is based on a softer version of robust optimization called comprehensive robustness. We show that this formulation is equivalent to regularization by an arbitrary convex regularizer, thus extending our first equivalence result. Moreover, we explain how the connection of comprehensive robustness to convex risk measures can be used to design risk-measure constrained classifiers with robustness to the input distribution. Our formulations result in convex optimization problems that can be easily solved. Finally, we provide some empirical results that show the promise of comprehensive robust classifiers.
Robustness and Regularization of Support Vector Machines
We consider regularized support vector machines (SVMs) and show that they are precisely equivalent to a new robust optimization formulation. We show that this equivalence of robust optimization and regularization has implications for both algorithms and analysis. In terms of algorithms, the equivalence suggests more general SVM-like algorithms for classification that explicitly build in protection against noise and at the same time control overfitting. On the analysis front, the equivalence of robustness and regularization provides a robust optimization interpretation for the success of regularized SVMs. We use this new robustness interpretation of SVMs to give a new proof of consistency of (kernelized) SVMs, thus establishing robustness as the reason regularized SVMs generalize well.
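The flavor of this equivalence can be seen in a deliberately simplified single-sample setting, which is an assumption made here for illustration and not the paper's own formulation (the paper works with an uncertainty set over all samples). For a linear classifier, the worst-case hinge loss of one sample under a perturbation of Euclidean norm at most c equals max(0, 1 - y(w.x + b) + c*||w||_2), so the norm of w appears as a penalty. The helper names below are hypothetical.

```python
import math


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def hinge(w, b, x, y):
    """Ordinary hinge loss of a linear classifier on one sample."""
    return max(0.0, 1.0 - y * (dot(w, x) + b))


def worst_case_hinge(w, b, x, y, c):
    """Closed form for the largest hinge loss over perturbed inputs
    x + delta with ||delta||_2 <= c: the margin shrinks by c * ||w||_2."""
    return max(0.0, 1.0 - y * (dot(w, x) + b) + c * math.sqrt(dot(w, w)))


def worst_perturbation(w, y, c):
    """Perturbation of norm c that reduces the margin the most:
    it points against y * w."""
    norm = math.sqrt(dot(w, w))
    return [-c * y * wi / norm for wi in w]


# Hypothetical example: evaluate the hinge loss at the adversarial point
# and compare it with the closed form.
w, b = [3.0, 4.0], 0.0
x, y, c = [1.0, 0.0], 1, 0.5
delta = worst_perturbation(w, y, c)
x_adv = [xi + di for xi, di in zip(x, delta)]
print(hinge(w, b, x_adv, y), worst_case_hinge(w, b, x, y, c))
```

In this simplified view, guarding against bounded input noise and penalizing the norm of w amount to the same thing, which is the intuition the abstract's equivalence result makes precise.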
CDI TYPE I: A Communications Theory Approach to Morphogenesis and Architecture Maintenance
The assertion that biological systems are communication networks would draw no rebuke from biologists – the terms signaling, communication, and network are deeply embedded parts of the biology parlance. However, the more profound meanings of information and communication are often overlooked when considering biological systems. Information can be quantified, its flow can be measured and tight bounds exist for its representation and conveyance between transmitters and receivers in a variety of settings. Furthermore, communications theory is about efficient communication where energy is at a premium – as is often the case in organisms. But perhaps most important, information theory allows mechanism-blind bounds on decisions and information flow. That is, the physics of a system allows determination of limits that any method of information description, delivery or processing must obey. Thus, rigorous application of communication theory to complex multi-cellular biological systems seems both attractive and obvious as an organizing principle – a way to tease order from the myriad engineering solutions that comprise biological systems. Likewise, study of biological systems – engineering solutions evolved over eons – might yield new communication and computation theory. Yet so far, a communications-theoretic approach to multi-cellular biology has

