## Algebraic statistical models

Venue: | Statistica Sinica |

Citations: | 15 - 4 self |

### BibTeX

@ARTICLE{Drton_algebraicstatistical,
  author = {Mathias Drton and Seth Sullivant},
  title = {Algebraic statistical models},
  journal = {Statistica Sinica},
  year = {}
}

### Abstract

Many statistical models are algebraic in that they are defined in terms of polynomial constraints, or in terms of polynomial or rational parametrizations. The parameter spaces of such models are typically semi-algebraic subsets of the parameter space of a reference model with nice properties, such as a regular exponential family. This observation leads to the definition of an ‘algebraic exponential family’. This new definition provides a unified framework for the study of statistical models with algebraic structure. In this paper we review the ingredients of this definition and illustrate in examples how computational algebraic geometry can be used to solve problems arising in statistical inference in algebraic models.

Key words and phrases: Algebraic statistics, computational algebraic geometry, exponential family, maximum likelihood estimation, model invariants, singularities.
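As a small illustration of what "defined in terms of polynomial constraints, or in terms of polynomial or rational parametrizations" can mean, the sketch below uses the standard independence model for two discrete random variables. It is not taken from the paper; the function names `independence_table` and `two_by_two_minors` are hypothetical and chosen only for this example. The model is parametrized by the polynomial map p_ij = a_i * b_j, and every table in its image satisfies the polynomial constraints p_ij * p_kl - p_il * p_kj = 0.

```python
from fractions import Fraction
from itertools import product


def independence_table(a, b):
    """Polynomial parametrization p_ij = a_i * b_j of a two-way table (illustrative helper)."""
    return [[ai * bj for bj in b] for ai in a]


def two_by_two_minors(p):
    """Return all 2x2 minors p_ij * p_kl - p_il * p_kj of the table p (illustrative helper)."""
    rows, cols = len(p), len(p[0])
    return [
        p[i][j] * p[k][l] - p[i][l] * p[k][j]
        for i, k in product(range(rows), repeat=2) if i < k
        for j, l in product(range(cols), repeat=2) if j < l
    ]


# Marginal distributions chosen arbitrarily for the example; exact rational arithmetic
# avoids floating-point noise when testing whether a polynomial vanishes.
a = [Fraction(1, 4), Fraction(3, 4)]
b = [Fraction(1, 3), Fraction(1, 6), Fraction(1, 2)]

p = independence_table(a, b)
minors = two_by_two_minors(p)

# A joint distribution with perfectly dependent variables violates the constraint.
dependent = [[Fraction(1, 2), Fraction(0)], [Fraction(0), Fraction(1, 2)]]
dependent_minors = two_by_two_minors(dependent)

print("minors on the model:", minors)
print("minors off the model:", dependent_minors)
```

The same model is thus described twice, once by a parametrization and once by vanishing polynomials, which is the kind of dual description the abstract refers to.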

### Citations

3967 | Convex analysis - ROCKAFELLAR - 1970 |

1190 | Graphical models - Lauritzen - 1996 |

231 | Algebraic Geometry - Bochnak, Coste, et al. - 1992 |

Citation Context: ...y set. In our introduction, parametrically specified statistical models were claimed to be algebraic statistical models. This non-trivial claim holds due to the famous Tarski-Seidenberg theorem (e.g. Bochnak et al., 1998), which says that the image of a semi-algebraic set under any nice enough mapping is again a semi-algebraic set. To make this precise we need to define the class of mappings of interest. Let ψ1 = f1/...

198 | Algebraic algorithms for sampling from conditional distributions - Diaconis, Sturmfels - 1995 |

Citation Context: ...sed to study statistical models and inference problems. This use of computational algebraic geometry was initiated in work on exact tests of conditional independence hypotheses in contingency tables (Diaconis and Sturmfels, 1998). Another line of work in experimental design led to the monograph by Pistone et al. (2001). ‘Algebraic statistics’, the buzz word in the titles of this monograph and the more recent book by Pachter ...

187 | Information and exponential families in statistical theory - Barndorff-Nielsen - 1978 |

175 | Fundamentals of Statistical Exponential Families with - Brown - 1986 |

Citation Context: ...ular exponential family is unique and if the same family is represented using two different canonical sufficient statistics then those two statistics are non-singular affine transforms of each other (Brown, 1986, Thm. 1.9). 2.1. Examples Regular exponential families comprise families of discrete distributions, which were the subject of much of the work on algebraic statistics. Example 4 (Discrete data). Let ...

86 | Real algebraic and semi-algebraic sets - Benedetti, Risler - 1990 |

82 | Geometrical Foundations of Asymptotic Inference - Kass, Vos - 1997 |

80 | Algebraic Statistics for Computational Biology - Pachter, Sturmfels - 2005 |

63 | Algebraic geometry of Bayesian networks - Garcia, Stillman, et al. |

Citation Context: ...fels and Sullivant, 2005), maximum likelihood estimation under multinomial sampling (Catanese et al., 2006; Hoşten et al., 2005), reliability theory (Giglio and Wynn, 2004), and Bayesian networks (Garcia et al., 2005). A special issue of the Journal of Symbolic Computation emphasizing the algebraic side emerged following the 2003 Workshop on Computational Algebraic Statistics at the American Institute of Mathemat...

59 | Stratified exponential families: graphical models and model selection. The Annals of Statistics 29:505–529 - Geiger, Heckerman, et al. - 2001 |

Citation Context: ...oint union of the two one-dimensional smooth manifolds obtained by taking $\mu_1 < 0$ and $\mu_1 > 0$, and the zero-dimensional smooth manifold given by the origin. These manifolds form a stratification of $C_1$ (Geiger et al., 2001, p. 513), and thus the model $P_{C_1}$ constitutes a stratified exponential family. In Figure 1, we plot three of the sets $\sqrt{n}\,C_1$ for the choices $n = 100, 100^2, 100^3$. The range of the plot is restricted ...

56 | Toric ideals of phylogenetic invariants - Sturmfels, Sullivant - 2005 |

Citation Context: ...tics has considered contingency table analysis (Aoki and Takemura, 2005; Dobra and Sullivant, 2004; Takemura and Aoki, 2005), phylogenetic tree models (Allman and Rhodes, 2003; Eriksson et al., 2005; Sturmfels and Sullivant, 2005), maximum likelihood estimation under multinomial sampling (Catanese et al., 2006; Hoşten et al., 2005), reliability theory (Giglio and Wynn, 2004), and Bayesian networks (Garcia et al., 2005). A ...

42 | Phylogenetic invariants for the general Markov model of sequence mutation - Allman, Rhodes - 2003 |

Citation Context: ...ms a part. Other recent work in algebraic statistics has considered contingency table analysis (Aoki and Takemura, 2005; Dobra and Sullivant, 2004; Takemura and Aoki, 2005), phylogenetic tree models (Allman and Rhodes, 2003; Eriksson et al., 2005; Sturmfels and Sullivant, 2005), maximum likelihood estimation under multinomial sampling (Catanese et al., 2006; Hoşten et al., 2005), reliability theory (Giglio and Wynn, 20...

39 | On the toric algebra of graphical models - Geiger, Meek, et al. - 2006 |

Citation Context: ...ariables for discrete or jointly Gaussian random variables. Note that some work in algebraic statistics has focused on discrete distributions corresponding to the boundary of the probability simplex (Geiger et al., 2006). These distributions can be included in an extension of the regular exponential family corresponding to the interior of the probability simplex; see Barndorff-Nielsen (1978, pp. 154ff), Brown (1986,...

36 | Conditional independence for statistical operations - Dawid - 1980 |

Citation Context: ... reflects the well-known fact that $[X_1 \perp X_2 \wedge X_1 \perp X_2 \mid X_3] \iff [X_1 \perp (X_2, X_3) \vee X_2 \perp (X_1, X_3)]$, which holds for the multivariate normal distribution but also when $X_3$ is a binary variable; compare (Dawid, 1980, Thm. 8.3). By Proposition 23 the singular locus of $A$ is the intersection $A_{\mathrm{sing}} = A_{13} \cap A_{23} = \{(\mu, \Sigma) \in A \mid \sigma_{12} = \sigma_{13} = \sigma_{23} = 0\}$, which corresponds to diagonal covariance matrices $\Sigma$, or in other words...

31 | Solving the likelihood equations - Hoşten, Khetan, et al. |

23 | The maximum likelihood degree - Catanese, Hoşten, et al. |

20 | A Divide-and-conquer algorithm for generating Markov bases of multi-way tables - Dobra, Sullivant - 2000 |

Citation Context: ...s Institute led to the Statistica Sinica theme topic, of which this article forms a part. Other recent work in algebraic statistics has considered contingency table analysis (Aoki and Takemura, 2005; Dobra and Sullivant, 2004; Takemura and Aoki, 2005), phylogenetic tree models (Allman and Rhodes, 2003; Eriksson et al., 2005; Sturmfels and Sullivant, 2005), maximum likelihood estimation under multinomial sampling (Catanese...

19 | Markov chain Monte Carlo exact tests for incomplete two-way contingency tables - Aoki, Takemura - 2005 |

Citation Context: ...ld at the Clay Mathematics Institute led to the Statistica Sinica theme topic, of which this article forms a part. Other recent work in algebraic statistics has considered contingency table analysis (Aoki and Takemura, 2005; Dobra and Sullivant, 2004; Takemura and Aoki, 2005), phylogenetic tree models (Allman and Rhodes, 2003; Eriksson et al., 2005; Sturmfels and Sullivant, 2005), maximum likelihood estimation under mul...

15 | Gröbner bases of ideals of minors of a symmetric matrix - Conca - 1994 |

Citation Context: ...the model is positive definite and, hence, each principal minor is invertible. The fact that the indicated ideal comprises all model invariants can be derived from a result in commutative algebra (Conca, 1994). In the discrete case, the polynomials we introduced in Example 16 generate the ideal of model invariants for the model induced by X1 ⊥ X2 | X3. For models induced by collections of independence sta...

12 | Algebraic Statistics - Pistone, Riccomagno, et al. - 2001 |

11 | What is a statistical model - McCullagh |

7 | Phylogenetic algebraic geometry - Eriksson, Ranestad, et al. - 2005 |

Citation Context: ...ork in algebraic statistics has considered contingency table analysis (Aoki and Takemura, 2005; Dobra and Sullivant, 2004; Takemura and Aoki, 2005), phylogenetic tree models (Allman and Rhodes, 2003; Eriksson et al., 2005; Sturmfels and Sullivant, 2005), maximum likelihood estimation under multinomial sampling (Catanese et al., 2006; Hoşten et al., 2005), reliability theory (Giglio and Wynn, 2004), and Bayesian netwo...

5 | Stochastic factorizations, sandwiched simplices and the topology of the space of explanations - Mond, Smith, et al. |

3 | Closure of exponential families - Csiszár, Matúš - 2005 |

3 | Group representations in probability and statistics - Diaconis - 1988 |

3 | Monomial Ideals and the Scarf Complex for Coherent Systems in Reliability Theory, The Annals of Statistics 32 - Giglio, Wynn - 2004 |

Citation Context: ...an and Rhodes, 2003; Eriksson et al., 2005; Sturmfels and Sullivant, 2005), maximum likelihood estimation under multinomial sampling (Catanese et al., 2006; Hoşten et al., 2005), reliability theory (Giglio and Wynn, 2004), and Bayesian networks (Garcia et al., 2005). A special issue of the Journal of Symbolic Computation emphasizing the algebraic side emerged following the 2003 Workshop on Computational Algebraic S...

3 | Distance-reducing Markov bases for sampling from a discrete sample space - Takemura, Aoki - 2005 |

Citation Context: ...istica Sinica theme topic, of which this article forms a part. Other recent work in algebraic statistics has considered contingency table analysis (Aoki and Takemura, 2005; Dobra and Sullivant, 2004; Takemura and Aoki, 2005), phylogenetic tree models (Allman and Rhodes, 2003; Eriksson et al., 2005; Sturmfels and Sullivant, 2005), maximum likelihood estimation under multinomial sampling (Catanese et al., 2006; Hoşten et...

1 | Algebraic techniques for Gaussian models. In Prague Stochastics (Edited by M. Hušková and - Drton - 2006 |