Results 11–20 of 39
Tracking the Best Regressor
 In Proc. 11th Annu. Conf. on Comput. Learning Theory
, 1998
Abstract

Cited by 18 (6 self)
In most of the online learning research the total online loss of the algorithm is compared to the total loss of the best offline predictor u from a comparison class of predictors. We call such bounds static bounds. The interesting feature of these bounds is that they hold for an arbitrary sequence of examples. Recently some work has been done where the comparison vector u_t at each trial t is allowed to change with time, and the total online loss of the algorithm is compared to the sum of the losses of u_t at each trial plus the total "cost" for shifting to successive comparison vectors. This is to model situations in which the examples change over time and different predictors from the comparison class are best for different segments of the sequence of examples. We call such bounds shifting bounds. Shifting bounds still hold for arbitrary sequences of examples and also for arbitrary partitions. The algorithm does not know the offline partition and the sequence of predictors that i...
Essential smoothness, essential strict convexity, and Legendre functions in Banach spaces
 COMM. CONTEMP. MATH
, 2001
Abstract

Cited by 17 (12 self)
The classical notions of essential smoothness, essential strict convexity, and Legendreness for convex functions are extended from Euclidean to Banach spaces. A pertinent duality theory is developed and several useful characterizations are given. The proofs rely on new results on the more subtle behavior of subdifferentials and directional derivatives at boundary points of the domain. In weak Asplund spaces, a new formula allows the recovery of the subdifferential from nearby gradients. Finally, it is shown that every Legendre function on a reflexive Banach space is zone consistent, a fundamental property in the analysis of optimization algorithms based on Bregman distances. Numerous illustrating examples are provided.
Solving Multistage Stochastic Network Programs on Massively Parallel Computers
 Mathematical Programming
, 1995
Abstract

Cited by 12 (7 self)
Multistage stochastic programs are typically extremely large, and can be prohibitively expensive to solve on the computer. In this paper we develop an algorithm for multistage programs that integrates the primal-dual row-action framework with proximal minimization. The algorithm exploits the structure of stochastic programs with network recourse, using a suitable problem formulation based on split variables, to decompose the solution into a large number of simple operations. It is therefore possible to use massively parallel computers to solve large instances of these problems. The algorithm is implemented on a Connection Machine CM-2 with up to 32K processors. We solve stochastic programs from an application from the insurance industry, as well as random problems, with up to 9 stages, and with up to 16392 scenarios, where the deterministic equivalent programs have a half million constraints and 1.3 million variables. Research partially supported by NSF grants CCR910404...
Current Trends in Stochastic Programming Computation and Applications
, 1995
Abstract

Cited by 11 (0 self)
While decisions frequently have uncertain consequences, optimal decision models often replace those uncertainties with averages or best estimates. Limited computational capability may have motivated this practice in the past. Recent computational advances have, however, greatly expanded the range of stochastic programs, optimal decision models with explicit consideration of uncertainties. This paper describes basic methodology in stochastic programming, recent developments in computation, and some practical application examples.
Proximal point methods for quasiconvex and convex functions with Bregman distances on Hadamard manifolds
 J. Convex Anal
, 2009
Abstract

Cited by 9 (3 self)
This paper generalizes the proximal point method using Bregman distances to solve convex and quasiconvex optimization problems on noncompact Hadamard manifolds. We prove that the sequence generated by our method is well defined and converges to an optimal solution of the problem. We also obtain the same convergence properties for the classical proximal method applied to a class of quasiconvex problems. Finally, we give some examples of Bregman distances in non-Euclidean spaces.
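For orientation, the basic proximal point iteration with a Bregman distance can be sketched in the Euclidean setting (the paper works on Hadamard manifolds, which this sketch does not attempt). The kernel here is negative entropy, whose Bregman distance is the KL divergence; all function names, step sizes, and the inexact inner solver are illustrative assumptions, not the paper's method:

```python
import numpy as np

def kl_bregman(x, y):
    # Bregman distance of negative entropy h(x) = sum x_i log x_i:
    # D_h(x, y) = sum x_i log(x_i / y_i) - x_i + y_i  (generalized KL divergence)
    return np.sum(x * np.log(x / y) - x + y)

def proximal_point(grad_f, x0, lam=1.0, outer=200, inner=50, lr=0.01):
    # Each outer step approximately minimizes f(y) + lam * D_h(y, x_k)
    # over the positive orthant, via plain gradient descent (illustrative only).
    x = x0.copy()
    for _ in range(outer):
        y = x.copy()
        for _ in range(inner):
            # grad_y D_h(y, x) = log(y) - log(x) for the negative-entropy kernel
            g = grad_f(y) + lam * (np.log(y) - np.log(x))
            y = np.clip(y - lr * g, 1e-12, None)  # keep iterates strictly positive
        x = y
    return x
```

On a toy strongly convex objective the fixed point of the outer iteration is the minimizer, so the iterates drift toward it regardless of the (positive) regularization weight `lam`.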
Randomized Online PCA Algorithms with Regret Bounds that are Logarithmic in the Dimension
, 2007
Abstract

Cited by 8 (1 self)
We design an online algorithm for Principal Component Analysis. In each trial the current instance is centered and projected into a probabilistically chosen low dimensional subspace. The regret of our online algorithm, i.e. the total expected quadratic compression loss of the online algorithm minus the total quadratic compression loss of the batch algorithm, is bounded by a term whose dependence on the dimension of the instances is only logarithmic. We first develop our methodology in the expert setting of online learning by giving an algorithm for learning as well as the best subset of experts of a certain size. This algorithm is then lifted to the matrix setting where the subsets of experts correspond to subspaces. The algorithm represents the uncertainty over the best subspace as a density matrix whose eigenvalues are bounded. The running time is O(n²) per trial, where n is the dimension of the instances.
Alternating Directions Methods for the Parallel Solution of Large-Scale Block-Structured Optimization Problems
, 1994
Abstract

Cited by 7 (2 self)
Prompted by advances in computer technology and the increasing confidence of decision makers in large-scale market models, practitioners of operations research are now tackling problems of increasing detail, complexity and size. This necessitates the development of new solution algorithms that exploit problem structure as well as the properties of the target hardware, in order to minimize turnaround time and maximize model utilization. Many models in planning and scheduling exhibit a block-angular structure that can represent spatial or temporal partial decomposability: decision variables can be broken down into largely independent blocks that correspond to first-level decisions satisfying a subset of the constraints, which may represent a time period, a geographical region, or a commodity. The blocks interact via coupling constraints related to second-level coordination of block decisions, such as shared resource allocation restrictions. In this thesis we construct three efficient decomposition algorithms for such block-angular problems. These algorithms belong to the family of alternating directions methods, and can be thought of as block Gauss-Seidel iterative schemes for an augmented Lagrangian that exploit the block structure. Alternatively, they can be thought of as Douglas-Rachford schemes for calculating a zero of the maximal monotone subgradient operator. Our algorithms are of the "fork-join" type, alternating a local and a global computation phase. In the local phase, decoupled optimization subproblems corresponding to blocks are solved. In the global phase, solution information is combined and a coordination problem is solved, the results of which are used in modifying the objective function of the subproblems. The algorithms are thus similar to priced...
Dykstra's algorithm with Bregman projections: a convergence proof
 Optimization
, 1998
Abstract

Cited by 7 (4 self)
Dykstra's algorithm and the method of cyclic Bregman projections are often employed to solve best approximation and convex feasibility problems, which are fundamental in mathematics and the physical sciences. Censor and Reich very recently suggested a synthesis of these methods, Dykstra's algorithm with Bregman projections, to tackle a nonorthogonal best approximation problem. They obtained convergence when each constraint is a halfspace. It is shown here that this new algorithm works for general closed convex constraints; this complements Censor and Reich's result and relates to a framework by Tseng. The proof rests on Boyle and Dykstra's original work and on strong properties of Bregman distances corresponding to Legendre functions. Special cases and observations simplifying the implementation of the algorithm are also discussed. 1991 M.R. Subject Classification. Primary 49M; Secondary 41A29, 65J05, 90C25. Key words and phrases. Best approximation, Bregman distance, Bregman projecti...
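For orientation, the classical (orthogonal-projection) form of Dykstra's algorithm that this paper generalizes can be sketched as follows; the paper's variant replaces the orthogonal projections with Bregman projections. The function names and toy constraint sets below are illustrative assumptions:

```python
import numpy as np

def dykstra(point, projections, iters=100):
    # Classical Dykstra: best approximation (nearest point) to `point` in the
    # intersection of closed convex sets, given their projection operators.
    # Unlike plain cyclic projection, Dykstra keeps a correction increment
    # per set so the limit is the *projection* onto the intersection.
    x = point.astype(float).copy()
    increments = [np.zeros_like(x) for _ in projections]
    for _ in range(iters):
        for i, P in enumerate(projections):
            y = P(x + increments[i])          # project the corrected point
            increments[i] = x + increments[i] - y  # update the correction
            x = y
    return x

# Toy constraint sets: the unit box [0,1]^2 and the halfspace x1 + x2 <= 1.
def project_box(x):
    return np.clip(x, 0.0, 1.0)

def project_halfspace(x):
    violation = x[0] + x[1] - 1.0
    return x - max(violation, 0.0) / 2.0 * np.ones(2)
```

Projecting [1, 1] onto the intersection of these two sets should land on the midpoint of the face x1 + x2 = 1.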
Bregman Monotone Optimization Algorithms
, 2002
Abstract

Cited by 7 (2 self)
A broad class of optimization algorithms based on Bregman distances in Banach spaces is unified around the notion of Bregman monotonicity. A systematic investigation of this notion leads to a simplified analysis of numerous algorithms and to the development of a new class of parallel block-iterative surrogate Bregman projection schemes. Another key contribution is the introduction of a class of operators that is shown to be intrinsically tied to the notion of Bregman monotonicity and to include the operators commonly found in Bregman optimization methods. Special emphasis is placed on the viability of the algorithms and the importance of Legendre functions in this regard. Various applications are discussed.
A Practical General Approximation Criterion for Methods of Multipliers Based on Bregman Distances
, 2000
Abstract

Cited by 6 (4 self)
This paper demonstrates that for generalized methods of multipliers for convex programming based on Bregman distance kernels, including the classical quadratic method of multipliers, the minimization of the augmented Lagrangian can be truncated using a simple, generally implementable stopping criterion based only on the norms of the primal iterate and the gradient (or a subgradient) of the augmented Lagrangian at that iterate. Previous results in this and related areas have required conditions that are much harder to verify, such as ε-optimality with respect to the augmented Lagrangian, or strong conditions on the convex program to be solved. Here, only existence of a KKT pair is required, and the convergence properties of the exact form of the method are preserved. The key new element in the analysis is the use of a full conjugate duality framework, as opposed to mainly examining the action of the method on the standard dual function of the convex program. An existence resul...
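The setting can be sketched with the classical quadratic method of multipliers for an equality-constrained convex program, truncating the inner minimization once the augmented-Lagrangian gradient is small relative to the iterate norm. The specific tolerance schedule below is a stand-in for the paper's implementable criterion, not its exact form; all names and constants are illustrative:

```python
import numpy as np

def method_of_multipliers(grad_f, A, b, x0, c=10.0, outer=50, tol=1e-8):
    # Classical quadratic method of multipliers for:  min f(x)  s.t.  Ax = b.
    # Augmented Lagrangian: L(x, lam) = f(x) + lam.(Ax - b) + (c/2)||Ax - b||^2.
    # The inner minimization is truncated by a gradient-norm test relative to
    # the primal iterate norm (an illustrative schedule, not the paper's rule).
    x, lam = x0.astype(float).copy(), np.zeros(A.shape[0])
    for k in range(outer):
        def grad_L(z):
            return grad_f(z) + A.T @ (lam + c * (A @ z - b))
        for _ in range(1000):  # truncated inner loop: plain gradient descent
            g = grad_L(x)
            if np.linalg.norm(g) <= 1e-3 / (k + 1) * (1 + np.linalg.norm(x)):
                break
            x = x - 0.01 * g
        lam = lam + c * (A @ x - b)   # standard multiplier update
        if np.linalg.norm(A @ x - b) < tol:
            break
    return x, lam
```

On min ||x||^2 subject to x1 + x2 = 1, the primal solution is (0.5, 0.5) with multiplier -1, and the truncated inner loops do not destroy convergence.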