Results 1-10 of 56
Statistical challenges with high dimensionality: feature selection in knowledge discovery
, 2006
The identifiability of tree topology for phylogenetic models, including covarion and mixture models
, 2005
Cited by 31 (12 self)
For a model of molecular evolution to be useful for phylogenetic inference, the topology of evolutionary trees must be identifiable. That is, from a joint distribution the model predicts, it must be possible to recover the tree parameter. We establish tree identifiability for a number of phylogenetic models, including a covarion model and a variety of mixture models with a limited number of classes. The proof is based on the introduction of a more general model, allowing more states at internal nodes of the tree than at leaves, and the study of the algebraic variety formed by the joint distributions to which it gives rise. Tree identifiability is first established for this general model through the use of certain phylogenetic invariants.
Algebraic statistical models
 Statistica Sinica
Cited by 14 (4 self)
Abstract: Many statistical models are algebraic in that they are defined in terms of polynomial constraints, or in terms of polynomial or rational parametrizations. The parameter spaces of such models are typically semialgebraic subsets of the parameter space of a reference model with nice properties, such as for example a regular exponential family. This observation leads to the definition of an ‘algebraic exponential family’. This new definition provides a unified framework for the study of statistical models with algebraic structure. In this paper we review the ingredients to this definition and illustrate in examples how computational algebraic geometry can be used to solve problems arising in statistical inference in algebraic models. Key words and phrases: Algebraic statistics, computational algebraic geometry, exponential family, maximum likelihood estimation, model invariants, singularities.
Performance of a New Invariants Method on Homogeneous and Nonhomogeneous Quartet Trees
, 2006
Quartets and parameter recovery for the general Markov model of sequence mutation
 AMRX Appl. Math. Res. Express
, 2004
Cited by 12 (3 self)
Methods of inference of the evolutionary history leading to currently extant species, or taxa, have been transformed in recent years by the ready availability of biological sequence data such as that from DNA. While many approaches to this inference problem have been developed, some of the methods most appealing theoretically are so computa
The strand symmetric model
, 2005
Cited by 12 (7 self)
This chapter is devoted to the study of strand symmetric Markov models on trees from the standpoint of algebraic statistics. By a strand symmetric Markov model, we mean one whose mutation structure reflects the symmetry induced by the double-stranded structure of DNA. In particular, a strand
Catalog of small trees
, 2005
Cited by 11 (4 self)
This chapter is concerned with the description of the Small Trees website which can be found at the following web address:
Conjunctive Bayesian networks
 Bernoulli
, 2007
Cited by 8 (2 self)
Conjunctive Bayesian networks (CBNs) are graphical models that describe the accumulation of events which are constrained in the order of their occurrence. A CBN is given by a partial order on a (finite) set of events. CBNs generalize the oncogenetic tree models of Desper et al. (1999) by allowing the occurrence of an event to depend on more than one predecessor event. The present paper studies the statistical and algebraic properties of CBNs. We determine the maximum likelihood parameters and present a combinatorial solution to the model selection problem. Our method performs well on two datasets where the events are HIV mutations associated with drug resistance. Concluding with a study of the algebraic properties of CBNs, we show that CBNs are toric varieties after a coordinate transformation and that their ideals possess a quadratic Gröbner basis.