## Online Kernel PCA with entropic matrix updates (2007)


### Download Links

- [www.cse.ucsc.edu]
- [imls.engr.oregonstate.edu]
- [www.machinelearning.org]
- DBLP

### Other Repositories/Bibliography

Venue: ICML

Citations: 6 (3 self)

### BibTeX

@INPROCEEDINGS{Kuzmin07onlinekernel,
  author    = {Dima Kuzmin and Manfred K. Warmuth},
  title     = {Online Kernel PCA with entropic matrix updates},
  booktitle = {Proceedings of the 24th International Conference on Machine Learning (ICML)},
  year      = {2007},
  publisher = {ACM Press}
}


### Abstract

A number of updates for density matrices have been developed recently that are motivated by relative entropy minimization problems. The updates involve a softmin calculation based on matrix logarithms and matrix exponentials. We show that these updates can be kernelized. This is important because the bounds provable for these algorithms are logarithmic in the feature dimension (provided that the 2-norm of the feature vectors is bounded by a constant). The main problem we focus on is the kernelization of an online PCA algorithm which belongs to this family of updates.
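The update family the abstract refers to replaces an additive gradient step with a multiplicative one on a density matrix: exponentiate (matrix log of the old parameter minus η times the instance) and renormalize the trace to one. A minimal sketch of this style of update, not the paper's kernelized algorithm; the function name, step size, and the eigendecomposition route to logm/expm are choices made for this illustration:

```python
import numpy as np

def entropic_matrix_update(W, X, eta):
    """One multiplicative (softmin) step on a density matrix:
    W' = exp(log W - eta * X) / tr(exp(log W - eta * X)).
    logm/expm are computed via eigendecompositions, which is
    valid because W and X are symmetric."""
    d, U = np.linalg.eigh(W)                  # W = U diag(d) U^T
    log_W = U @ np.diag(np.log(d)) @ U.T      # matrix logarithm of W
    d2, U2 = np.linalg.eigh(log_W - eta * X)
    exp_A = U2 @ np.diag(np.exp(d2)) @ U2.T   # matrix exponential
    return exp_A / np.trace(exp_A)            # renormalize trace to one

rng = np.random.default_rng(0)
n = 4
W = np.eye(n) / n                             # start at the uniform density matrix
x = rng.standard_normal(n)
x /= np.linalg.norm(x)                        # unit-norm instance, as in the bounds
W = entropic_matrix_update(W, np.outer(x, x), eta=0.5)
```

Because the result is symmetric with non-negative eigenvalues and unit trace, the parameter remains a density matrix after every step.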

### Citations

4017 | Convex Optimization
- Boyd, Vandenberghe
- 2004

Citation Context: ... F^t(W). We would like to lower bound the per-trial drop of the potential, P^{t+1} − P^t, in terms of the loss of the algorithm (3). We use strong duality (Boyd & Vandenberghe, 2004) to compute the potential as the maximum of the dual problem. The purpose of going to the dual problem is only to help us analyze the algorithm; the algorithm itself is fully determined by the optimiza... |

2470 | A decision-theoretic generalization of online learning and an application to boosting
- Freund, Schapire
- 1997
Citation Context: ... P^{T+1} ≤ Δ(V, (1/n)I) + η tr(V C^{T+1}), the bound follows. We now set η as a function of N, r, the maximum 2-norm Q of the instances φ(x_t), and an upper bound L̂ on the loss of the best r-projection (Freund & Schapire, 1997): η = √(N/(2r) · ln r · ln(1 + L̂)) / Q². (6) This tuning of η results in the following bound for the Online Kernel PCA Algorithm, which holds for all sequences with the 2-norm of the expanded instances φ(x_t) boun... |

136 | Additive versus exponentiated gradient updates for linear prediction
- Kivinen, Warmuth
- 1997

Citation Context: ...m of the parameter vector is a linear combination of the instances. Family one is based on regularizing with the squared Euclidean distance and family two uses the relative entropy as the regularization (Kivinen & Warmuth, 1997). Both families have their advantages. Family one can be kernelized, i.e. the instances x can be expanded into a feature vector φ(x) and the algorithm can be computed efficiently provided the kernel ... |
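The two families contrasted in this snippet fit in a few lines; a generic illustration of the additive (gradient descent) versus exponentiated-gradient updates for a weight vector on the probability simplex, with function names and the step size chosen for the example:

```python
import numpy as np

def gd_update(w, grad, eta):
    # Family one: additive step, regularized by squared Euclidean distance.
    return w - eta * grad

def eg_update(w, grad, eta):
    # Family two: multiplicative step, regularized by relative entropy;
    # the weights stay positive and are renormalized to sum to one.
    v = w * np.exp(-eta * grad)
    return v / v.sum()

w = np.ones(3) / 3                       # uniform starting weights
grad = np.array([0.1, -0.2, 0.05])       # example gradient of the loss
w_eg = eg_update(w, grad, eta=1.0)       # mass shifts toward low-gradient components
```

Note how the EG step stays on the simplex automatically, which is the property the density-matrix updates above generalize to unit-trace matrices.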

64 | Path kernels and multiplicative updates
- Takimoto, Warmuth
Citation Context: ...ns has generated great interest (see e.g. the discussion in (Warmuth & Vishwanathan, 2005)). Only very few cases have been found for applying kernels with entropic updates when the instances are vectors (Takimoto & Warmuth, 2003). In this paper we show how the matrix generalizations of these updates can be kernelized when the instance matrices are of the form φ(x_t)φ(x_t)^T. We chose the online PCA problem as our example prob... |

63 | A combinatorial, primal-dual approach to semidefinite programs
- Arora, Kale
- 2007
Citation Context: ... generalized to the case when the instances are symmetric matrices X and the parameter is a density matrix (Tsuda et al., 2005; Warmuth & Kuzmin, 2006a; Warmuth & Kuzmin, 2006b; Arora & Kale, 2007). The regularization is now the quantum relative entropy for density matrices instead of the regular relative entropy for probability vectors, and the matrix logarithm of the density matrix parameter... |

60 | Tracking a small set of experts by mixing past posteriors
- Bousquet, Warmuth
Citation Context: ...ces. In this paper we only considered bounds compared to the best fixed off-line comparator. However, these online algorithms can be adapted to the case when the comparator shifts with time (see e.g. (Bousquet & Warmuth, 2002)). A thorough experimental analysis of these extensions would be useful. Figure 1. Our synthetic dataset is a sample of a 3-dimensional cone embedded in R^20 with added Gaussian noise. We depict the ... |

59 | Averaging expert predictions
- Kivinen, Warmuth
- 1999
Citation Context: ...o solve a PCA-like problem, where the approximation error is measured by something other than a 2-norm. Online learning techniques exist for dealing with very many different loss functions (see e.g. (Kivinen & Warmuth, 1999)), but only for the case of linear loss will the expectation of the losses equal the loss of the expected parameter. Thus, for other loss functions the algorithm might have good bounds, but it won't ... |

55 | Tracking the best linear predictor
- Herbster, Warmuth
Citation Context: ...ctor that does not sum to one anymore. More details on the Decomposition Algorithm 4 are provided in (Warmuth & Kuzmin, 2006b), and a linear-time implementation of the Capping Algorithm 5 is given in (Herbster & Warmuth, 2001). The Online PCA Algorithm 3 gives a summary of all the steps. Here a corner is a diagonal matrix with n − r of the diagonal elements set to 1/(n − r) and the rest set to zero. An important property of t... |
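The "corner" described in this snippet is easy to construct explicitly. A small illustrative helper; the function name and the random choice of indices are assumptions made for the example (the paper's decomposition derives its corners deterministically from the capped density matrix):

```python
import numpy as np

def random_corner(n, r, rng):
    """A corner: a diagonal density matrix with n - r diagonal
    entries equal to 1/(n - r) and the remaining r entries zero."""
    idx = rng.choice(n, size=n - r, replace=False)  # which entries are "on"
    d = np.zeros(n)
    d[idx] = 1.0 / (n - r)
    return np.diag(d)

C = random_corner(n=5, r=2, rng=np.random.default_rng(0))
```

Each corner has unit trace, so a convex combination of corners is again a density matrix, which is what makes the decomposition-and-sample step of the online PCA algorithm work.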

24 | Online variance minimization
- Warmuth, Kuzmin
- 2006
Citation Context: ... generalized to the case when the instances are symmetric matrices X and the parameter is a density matrix (Tsuda et al., 2005; Warmuth & Kuzmin, 2006a; Warmuth & Kuzmin, 2006b; Arora & Kale, 2007). The regularization is now the quantum relative entropy for density matrices instead of the regular relative entropy for probability vectors, and the ma... |

18 | Randomized PCA algorithms with regret bounds that are logarithmic in the dimension
- Warmuth, Kuzmin
Citation Context ...Learning, Corvallis, OR, 2007. Copyright 2007 by the author(s)/owner(s). generalized to the case when the instances are symmetric matrices X and the parameter is a density matrix (Tsuda et al., 2005; =-=Warmuth & Kuzmin, 2006-=-a; Warmuth & Kuzmin, 2006b; Arora & Kale, 2007). The regularization is now the quantum relative entropy for density matrices instead of the regular relative entropy for probability vectors, and the ma... |