Tracking the Best Regressor
- In Proc. 11th Annu. Conf. on Comput. Learning Theory
, 1998
Cited by 17 (6 self)
In most of the on-line learning research the total on-line loss of the algorithm is compared to the total loss of the best off-line predictor u from a comparison class of predictors. We call such bounds static bounds. The interesting feature of these bounds is that they hold for an arbitrary sequence of examples. Recently some work has been done where the comparison vector u_t at each trial t is allowed to change with time, and the total on-line loss of the algorithm is compared to the sum of the losses of u_t at each trial plus the total "cost" for shifting to successive comparison vectors. This is to model situations in which the examples change over time and different predictors from the comparison class are best for different segments of the sequence of examples. We call such bounds shifting bounds. Shifting bounds still hold for arbitrary sequences of examples and also for arbitrary partitions. The algorithm does not know the off-line partition and the sequence of predictors that i...
Essential smoothness, essential strict convexity, and Legendre functions in Banach spaces
- COMM. CONTEMP. MATH
, 2001
Cited by 13 (11 self)
The classical notions of essential smoothness, essential strict convexity, and Legendreness for convex functions are extended from Euclidean to Banach spaces. A pertinent duality theory is developed and several useful characterizations are given. The proofs rely on new results on the more subtle behavior of subdifferentials and directional derivatives at boundary points of the domain. In weak Asplund spaces, a new formula allows the recovery of the subdifferential from nearby gradients. Finally, it is shown that every Legendre function on a reflexive Banach space is zone consistent, a fundamental property in the analysis of optimization algorithms based on Bregman distances. Numerous illustrating examples are provided.
Solving Multistage Stochastic Network Programs on Massively Parallel Computers
- Mathematical Programming
, 1995
Cited by 12 (7 self)
Multi-stage stochastic programs are typically extremely large, and can be prohibitively expensive to solve on the computer. In this paper we develop an algorithm for multistage programs that integrates the primal-dual row-action framework with proximal minimization. The algorithm exploits the structure of stochastic programs with network recourse, using a suitable problem formulation based on split variables, to decompose the solution into a large number of simple operations. It is therefore possible to use massively parallel computers to solve large instances of these problems. The algorithm is implemented on a Connection Machine CM-2 with up to 32K processors. We solve stochastic programs from an application from the insurance industry, as well as random problems, with up to 9 stages, and with up to 16392 scenarios, where the deterministic equivalent programs have a half million constraints and 1.3 million variables. Research partially supported by NSF grants CCR-910404...
Current Trends in Stochastic Programming Computation and Applications
, 1995
Cited by 9 (0 self)
While decisions frequently have uncertain consequences, optimal decision models often replace those uncertainties with averages or best estimates. Limited computational capability may have motivated this practice in the past. Recent computational advances have, however, greatly expanded the range of stochastic programs, optimal decision models with explicit consideration of uncertainties. This paper describes basic methodology in stochastic programming, recent developments in computation, and some practical application examples.
Alternating Directions Methods for the Parallel Solution of Large-Scale Block-Structured Optimization Problems
, 1994
Cited by 7 (2 self)
Prompted by advances in computer technology and the increasing confidence of decision makers in large-scale market models, practitioners of operations research are now tackling problems of increasing detail, complexity and size. This necessitates the development of new solution algorithms that exploit problem structure as well as the properties of the target hardware, in order to minimize turnaround time and maximize model utilization. Many models in planning and scheduling exhibit a block-angular structure that can represent spatial or temporal partial decomposability: decision variables can be broken down into largely independent blocks that correspond to first-level decisions satisfying a subset of the constraints, which may represent a time period, a geographical region, or a commodity. The blocks interact via coupling constraints related to second-level coordination of block decisions, such as shared resource allocation restrictions. In this thesis we construct three efficient decomposition algorithms for such block-angular problems. These algorithms belong to the family of alternating directions methods, and can be thought of as block Gauss-Seidel iterative schemes for an augmented Lagrangian that exploit the block structure. Alternatively, they can be thought of as Douglas-Rachford schemes for calculating a zero of the maximal monotone subgradient operator. Our algorithms are of the "fork-join" type, alternating a local and a global computation phase. In the local phase, decoupled optimization subproblems corresponding to blocks are solved. In the global phase, solution information is combined and a coordination problem is solved, the results of which are used in modifying the objective function of the subproblems. The algorithms are thus similar to price-d...
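The "fork-join" pattern described in this abstract can be illustrated with a toy example. The sketch below is not one of the three algorithms from the thesis; it is the generic textbook consensus form of an alternating directions method, applied to an assumed problem: minimize the sum of (1/2)(x_i - a_i)^2 subject to all x_i being equal to a shared value z. The function name `consensus_admm`, the penalty `rho`, and the iteration count are hypothetical choices made for illustration.

```python
def consensus_admm(targets, rho=1.0, iters=200):
    """Toy consensus alternating directions method (illustrative only).

    Minimizes sum_i 0.5 * (x_i - a_i)**2 subject to x_i == z for all i.
    The solution is z = mean(a_i).
    """
    n = len(targets)
    x = [0.0] * n   # local (block) variables
    u = [0.0] * n   # scaled multipliers, one per block
    z = 0.0         # global coordination variable

    for _ in range(iters):
        # Local phase ("fork"): each block solves its own small
        # augmented-Lagrangian subproblem in closed form.
        x = [(a + rho * (z - ui)) / (1.0 + rho) for a, ui in zip(targets, u)]

        # Global phase ("join"): combine block solutions into z.
        z = sum(xi + ui for xi, ui in zip(x, u)) / n

        # Multiplier update: changes the objective seen by each block
        # in the next local phase.
        u = [ui + xi - z for ui, xi in zip(u, x)]

    return x, z


x_blocks, z_shared = consensus_admm([1.0, 2.0, 6.0])
print(z_shared)
```

The local phase is embarrassingly parallel across blocks, which is the property the abstract points to when it mentions exploiting the target hardware.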
Dykstra's algorithm with Bregman projections: a convergence proof
- Optimization
, 1998
Cited by 6 (3 self)
Dykstra's algorithm and the method of cyclic Bregman projections are often employed to solve best approximation and convex feasibility problems, which are fundamental in mathematics and the physical sciences. Censor and Reich very recently suggested a synthesis of these methods, Dykstra's algorithm with Bregman projections, to tackle a non-orthogonal best approximation problem. They obtained convergence when each constraint is a halfspace. It is shown here that this new algorithm works for general closed convex constraints; this complements Censor and Reich's result and relates to a framework by Tseng. The proof rests on Boyle and Dykstra's original work and on strong properties of Bregman distances corresponding to Legendre functions. Special cases and observations simplifying the implementation of the algorithm are also discussed. 1991 M.R. Subject Classification. Primary 49M; Secondary 41A29, 65J05, 90C25. Key words and phrases. Best approximation, Bregman distance, Bregman projecti...
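For orientation, here is a minimal sketch of the classical Dykstra algorithm with orthogonal (Euclidean) projections, which is the special case where the Bregman distance comes from f(x) = (1/2)||x||^2. It is not the Bregman-projection variant analyzed in the paper. The helper names `project_box`, `project_halfspace`, and `dykstra`, and the two example sets, are assumptions chosen for illustration.

```python
def project_box(x, lo=0.0, hi=1.0):
    """Orthogonal projection onto the box [lo, hi]^n."""
    return [min(max(v, lo), hi) for v in x]


def project_halfspace(x, a, b):
    """Orthogonal projection onto the halfspace {y : <a, y> <= b}."""
    violation = sum(ai * xi for ai, xi in zip(a, x)) - b
    if violation <= 0:
        return list(x)
    scale = violation / sum(ai * ai for ai in a)
    return [xi - scale * ai for ai, xi in zip(a, x)]


def dykstra(x0, projections, iters=200):
    """Classical Dykstra: approximates the nearest point to x0 in the
    intersection of the closed convex sets behind `projections`."""
    x = list(x0)
    # One correction vector per set; this is what distinguishes Dykstra's
    # algorithm from plain cyclic projections.
    corrections = [[0.0] * len(x) for _ in projections]
    for _ in range(iters):
        for i, proj in enumerate(projections):
            y = [xi + ci for xi, ci in zip(x, corrections[i])]
            x = proj(y)
            corrections[i] = [yi - xi for yi, xi in zip(y, x)]
    return x


# Nearest point to (2, 2) in the unit box intersected with x1 + x2 <= 1.
nearest = dykstra(
    [2.0, 2.0],
    [project_box, lambda v: project_halfspace(v, [1.0, 1.0], 1.0)],
)
print(nearest)
```

Without the correction vectors, cyclic projections would still reach a feasible point, but in general not the one closest to the starting point.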
Bregman Monotone Optimization Algorithms
, 2002
Cited by 5 (2 self)
A broad class of optimization algorithms based on Bregman distances in Banach spaces is unified around the notion of Bregman monotonicity. A systematic investigation of this notion leads to a simplified analysis of numerous algorithms and to the development of a new class of parallel block-iterative surrogate Bregman projection schemes. Another key contribution is the introduction of a class of operators that is shown to be intrinsically tied to the notion of Bregman monotonicity and to include the operators commonly found in Bregman optimization methods. Special emphasis is placed on the viability of the algorithms and the importance of Legendre functions in this regard. Various applications are discussed.
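As background for the entries in this list that rely on Bregman distances, the standard finite-dimensional definition is D_f(x, y) = f(x) - f(y) - <grad f(y), x - y> for a differentiable convex f. The sketch below evaluates it for two common choices of f. It is restricted to vectors in R^n, not the Banach-space setting of the paper, and all function names are hypothetical.

```python
import math


def bregman_distance(f, grad_f, x, y):
    """D_f(x, y) = f(x) - f(y) - <grad f(y), x - y>."""
    inner = sum(g * (xi - yi) for g, xi, yi in zip(grad_f(y), x, y))
    return f(x) - f(y) - inner


# Choice 1: f(x) = 0.5 * ||x||^2, which gives D_f(x, y) = 0.5 * ||x - y||^2.
def half_sq_norm(x):
    return 0.5 * sum(v * v for v in x)


def half_sq_norm_grad(x):
    return list(x)


# Choice 2: f(x) = sum(x_i * log(x_i) - x_i) on positive vectors, which
# gives the generalized Kullback-Leibler divergence.
def neg_entropy(x):
    return sum(v * math.log(v) - v for v in x)


def neg_entropy_grad(x):
    return [math.log(v) for v in x]


print(bregman_distance(half_sq_norm, half_sq_norm_grad, [1.0, 2.0], [3.0, 0.0]))
print(bregman_distance(neg_entropy, neg_entropy_grad, [0.5, 0.5], [0.25, 0.75]))
```

Note that a Bregman distance is generally not symmetric in its two arguments, which is why the literature distinguishes forward and backward Bregman projections.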
Randomized Online PCA Algorithms with Regret Bounds that are Logarithmic in the Dimension
, 2007
Cited by 4 (1 self)
We design an online algorithm for Principal Component Analysis. In each trial the current instance is centered and projected into a probabilistically chosen low dimensional subspace. The regret of our online algorithm, i.e. the total expected quadratic compression loss of the online algorithm minus the total quadratic compression loss of the batch algorithm, is bounded by a term whose dependence on the dimension of the instances is only logarithmic. We first develop our methodology in the expert setting of online learning by giving an algorithm for learning as well as the best subset of experts of a certain size. This algorithm is then lifted to the matrix setting where the subsets of experts correspond to subspaces. The algorithm represents the uncertainty over the best subspace as a density matrix whose eigenvalues are bounded. The running time is O(n²) per trial, where n is the dimension of the instances.
A Practical General Approximation Criterion for Methods of Multipliers Based on Bregman Distances
, 2000
Cited by 4 (3 self)
This paper demonstrates that for generalized methods of multipliers for convex programming based on Bregman distance kernels (including the classical quadratic method of multipliers), the minimization of the augmented Lagrangian can be truncated using a simple, generally implementable stopping criterion based only on the norms of the primal iterate and the gradient (or a subgradient) of the augmented Lagrangian at that iterate. Previous results in this and related areas have required conditions that are much harder to verify, such as ε-optimality with respect to the augmented Lagrangian, or strong conditions on the convex program to be solved. Here, only existence of a KKT pair is required, and the convergence properties of the exact form of the method are preserved. The key new element in the analysis is the use of a full conjugate duality framework, as opposed to mainly examining the action of the method on the standard dual function of the convex program. An existence resul...
Iterating Bregman Retractions
Cited by 3 (2 self)
The notion of a Bregman retraction of a closed convex set in Euclidean space is introduced. Bregman retractions include backward Bregman projections, forward Bregman projections, as well as their convex combinations, and are thus quite flexible. The main result on iterating Bregman retractions unifies several convergence results on projection methods for solving convex feasibility problems. It is also used to construct new sequential and parallel algorithms.

