## Information geometry, the embedding principle, and document classification (2005)

Venue: | Proceedings of the 2nd International Symposium on Information Geometry and its Applications |

Citations: | 7 - 0 self |

### BibTeX

```bibtex
@INPROCEEDINGS{Lebanon05informationgeometry,
  author    = {Guy Lebanon},
  title     = {Information geometry, the embedding principle, and document classification},
  booktitle = {Proceedings of the 2nd International Symposium on Information Geometry and its Applications},
  year      = {2005}
}
```

### Abstract

High dimensional structured data such as text and images is often poorly understood and misrepresented in statistical modeling. Typical approaches to modeling such data involve, either explicitly or implicitly, arbitrary geometric assumptions. In this paper, we review a framework introduced by Lebanon and Lafferty that is based on Čencov’s theorem for obtaining a coherent geometry for data. The framework enables adaptation of popular models to the new geometry and, in the context of text classification, yields a lower classification error rate on held-out data. The framework demonstrates how information geometry may be applied to modeling high dimensional structured data and points to new directions for future research.
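The geometry the abstract alludes to can be made concrete with a small sketch. Under the Fisher information metric, the multinomial simplex is isometric (up to a constant) to a portion of a sphere via the square-root embedding, so the geodesic distance between two normalized term-frequency vectors has a closed form. The snippet below is an illustrative implementation of that standard formula, not code from the paper; the toy document vectors are invented for the example.

```python
import numpy as np

def fisher_geodesic_distance(p, q):
    """Geodesic distance between two points on the multinomial simplex
    under the Fisher information metric.

    The square-root embedding theta -> sqrt(theta) maps the simplex onto
    the positive orthant of a sphere, where the geodesic distance is
    d(p, q) = 2 * arccos(sum_i sqrt(p_i * q_i)).
    """
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    # Bhattacharyya coefficient; clip guards against rounding slightly above 1.
    bc = np.clip(np.sum(np.sqrt(p * q)), 0.0, 1.0)
    return 2.0 * np.arccos(bc)

# Two toy "documents" as normalized term-frequency vectors (hypothetical data).
doc_a = np.array([3.0, 1.0, 0.0, 1.0]); doc_a /= doc_a.sum()
doc_b = np.array([1.0, 2.0, 2.0, 0.0]); doc_b /= doc_b.sum()

print(fisher_geodesic_distance(doc_a, doc_a))  # identical documents -> 0.0
print(fisher_geodesic_distance(doc_a, doc_b))
```

Distances produced this way range from 0 (identical distributions) to π (disjoint supports), which is the substitution for Euclidean distance that the reviewed framework builds on.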

### Citations

510 | Distance metric learning with application to clustering with side-information - Xing, Ng, et al. - 2002 |

277 | Methods of Information Geometry - Amari, Nagaoka - 2000 |

Citation Context: ...the spherical law of cosines the margin d(x, ι−1(S + m ∩ E)) may be efficiently approximated [9]. Auto-parallelism may also be enforced with respect to other connections, for example α-connections [1]. We concentrate here on the metric connection as in this case the motivation from Čencov’s theorem is strongest. To apply the above to logistic r... |

87 | Diffusion kernels on statistical manifolds - Lafferty, Lebanon - 2005 |

Citation Context: ...pically a hard task, even for experts, to specify such a geometry. Alternatively, the geometry can be adapted based on a known data set, as in [7, 10], or on domain-dependent axiomatic arguments as in [4, 5]. In this paper we review a framework described in a series of papers by Lebanon and Lafferty [9, 4, 6] for obtaining domain-dependent geometry and adapting existing classification models for this geo... |

81 | Boosting and maximum likelihood for exponential models - Lebanon, Lafferty - 2002 |

Citation Context: ...statement will be partially motivated in Section 3). While SVM and boosting are, strictly speaking, not conditional distributions, we view them as non-normalized conditional models. See for example [8] for more details. The assumption of Euclidean geometry mentioned above is rather arbitrary. There is no reason to believe that word frequencies in documents or pixel brightness values... |

58 | Statistical decision rules and optimal inference - Čencov - 1981 |

Citation Context: ...4 concludes the paper with a brief discussion. 2. The Embedding Principle. As mentioned in the introduction, we would like to avoid arbitrary assumptions on the geometry of the data. Čencov’s theorem [3] (see also its extensions in [2, 5]) provides a theoretical motivation for the use of the Fisher information metric on a manifold Θ of distributions. At first glance it is not clear how this can contr... |

11 | An extended Čencov characterization of the information metric - Campbell - 1986 |

Citation Context: ...ief discussion. 2. The Embedding Principle. As mentioned in the introduction, we would like to avoid arbitrary assumptions on the geometry of the data. Čencov’s theorem [3] (see also its extensions in [2, 5]) provides a theoretical motivation for the use of the Fisher information metric on a manifold Θ of distributions. At first glance it is not clear how this can contribute towards obtaining a well-moti... |
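The Fisher information metric that Čencov's theorem singles out in these excerpts has a simple closed form for elementary families, which makes its definition easy to verify numerically. The sketch below (an illustration, not material from the paper) checks the Bernoulli case, where I(θ) = 1/(θ(1−θ)), against the defining expectation E[(∂/∂θ log p(x; θ))²].

```python
def bernoulli_fisher_closed_form(theta):
    """Closed-form Fisher information of the Bernoulli family."""
    return 1.0 / (theta * (1.0 - theta))

def bernoulli_fisher_from_definition(theta):
    """Fisher information computed directly from its definition:
    I(theta) = sum_x p(x; theta) * (d/dtheta log p(x; theta))^2,
    summing over the two outcomes x in {0, 1}.
    """
    total = 0.0
    for x in (0, 1):
        p = theta if x == 1 else 1.0 - theta
        # Score function of the Bernoulli log-likelihood.
        score = x / theta - (1 - x) / (1.0 - theta)
        total += p * score ** 2
    return total

theta = 0.3
print(bernoulli_fisher_closed_form(theta))     # 1 / (0.3 * 0.7)
print(bernoulli_fisher_from_definition(theta))
```

The two values agree for any θ in (0, 1), which is the one-parameter analogue of the metric tensor that the embedding principle pulls back onto the data space.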

11 | Axiomatic geometry of conditional models - Lebanon |

9 | Riemannian Geometry and Statistical Machine Learning - Lebanon - 2005 |

Citation Context: ...e adapted based on a known data set, as in [7, 10], or on domain-dependent axiomatic arguments as in [4, 5]. In this paper we review a framework described in a series of papers by Lebanon and Lafferty [9, 4, 6] for obtaining domain-dependent geometry and adapting existing classification models for this geometry. Section 2 discusses the embedding principle for obtaining the geometry. Section 3 describes adap... |

5 | Hyperplane margin classifiers on the multinomial manifold - Lebanon, Lafferty - 2004 |

Citation Context: ...e adapted based on a known data set, as in [7, 10], or on domain-dependent axiomatic arguments as in [4, 5]. In this paper we review a framework described in a series of papers by Lebanon and Lafferty [9, 4, 6] for obtaining domain-dependent geometry and adapting existing classification models for this geometry. Section 2 discusses the embedding principle for obtaining the geometry. Section 3 describes adap... |

1 | Metric learning for text classification - Lebanon |