Hierarchical Dirichlet processes
Journal of the American Statistical Association, 2004
Cited by 328 (44 self)
The authors wish to acknowledge helpful discussions with Lancelot James and Jim Pitman and the referees for useful comments. We consider problems involving groups of data, where each observation within a group is a draw from a mixture model, and where it is desirable to share mixture components between groups. We assume that the number of mixture components is unknown a priori and is to be inferred from the data. In this setting it is natural to consider sets of Dirichlet processes, one for each group, where the well-known clustering property of the Dirichlet process provides a nonparametric prior for the number of mixture components within each group. Given our desire to tie the mixture models in the various groups, we consider a hierarchical model, specifically one in which the base measure for the child Dirichlet processes is itself distributed according to a Dirichlet process. Such a base measure being discrete, the child Dirichlet processes necessarily share atoms. Thus, as desired, the mixture models in the different groups necessarily share mixture components. We discuss representations of hierarchical Dirichlet processes in terms of
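The atom-sharing argument in this abstract can be illustrated with a small simulation. The following is only a sketch under stated assumptions, not the authors' construction or inference procedure: it uses a truncated stick-breaking approximation, a standard normal base distribution, and hypothetical names (`stick_breaking`, `sample_hdp`, `gamma`, `alpha`, `truncation`) that do not appear in the source.

```python
import random

def stick_breaking(concentration, num_atoms, rng):
    """Truncated stick-breaking weights, renormalized to sum to 1."""
    weights, remaining = [], 1.0
    for _ in range(num_atoms):
        frac = rng.betavariate(1.0, concentration)
        weights.append(remaining * frac)
        remaining *= 1.0 - frac
    total = sum(weights)
    return [w / total for w in weights]

def sample_hdp(num_groups, gamma=1.0, alpha=1.0, truncation=50, seed=0):
    """Draw a truncated global measure and one child measure per group."""
    rng = random.Random(seed)
    # Global measure: atoms from an assumed N(0, 1) base distribution,
    # with stick-breaking weights. This measure is discrete.
    global_atoms = [rng.gauss(0.0, 1.0) for _ in range(truncation)]
    global_weights = stick_breaking(gamma, truncation, rng)
    groups = []
    for _ in range(num_groups):
        # Child measure: its atoms are drawn from the discrete global
        # measure, so every child atom is one of the global atoms.
        atoms = rng.choices(global_atoms, weights=global_weights, k=truncation)
        weights = stick_breaking(alpha, truncation, rng)
        groups.append((atoms, weights))
    return global_atoms, groups
```

Because each group's atoms are sampled from the same discrete global measure, every group draws its mixture components from a common pool, which is the sharing property the abstract describes.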
Mathematical challenges from genomics and molecular biology
Notices of the American Mathematical Society, 2002
Cited by 9 (0 self)
A fundamental goal of biology is to understand how living cells function. This understanding is the foundation for all higher levels of explanation, including physiology, anatomy, behavior, ecology, and the study of populations. The field of molecular biology analyzes the functioning of cells and the processes of inheritance principally in terms of interactions among three crucially important classes of macromolecules: DNA, RNA, and proteins. Proteins are the molecules that enable and execute most of the processes within a cell. DNA is the carrier of hereditary information in the form of genes and directs the production of proteins. RNA is a key intermediary between DNA and proteins. Molecular biology and genetics are undergoing revolutionary changes. These changes are guided by a view of a cell as a collection of interrelated subsystems, each involving the interaction among many genes and proteins. Emphasis has shifted from the study of individual genes and proteins to the exploration of the entire genome of an organism and the study of networks of genes and proteins. As the level of aspiration rises and the amount of available data grows by orders of magnitude, the field becomes increasingly dependent on mathematical modeling, mathematical analysis, and computation. In the sections that follow we give an introduction to the mathematical and computational challenges that arise in this field, with an emphasis on discrete Richard M. Karp is a member of the International Computer
Acquiring evolvability through adaptive representations
In Proc. of Genetic and Evolutionary Computation Conference, 2007
Cited by 6 (2 self)
Adaptive representations allow evolution to explore the space of phenotypes by choosing the most suitable set of genotypic parameters. Although such an approach is believed to be efficient on complex problems, few empirical studies have been conducted in such domains. In this paper, three neural network representations (a direct encoding, a complexifying encoding, and an implicit encoding capable of adapting the genotype-phenotype mapping) are compared on Nothello, a complex game-playing domain from the AAAI General Game Playing Competition. Implicit encoding makes the search more efficient and uses several times fewer parameters. Random mutation leads to highly structured phenotypic variation that is acquired during the course of evolution rather than built into the representation itself. Thus, adaptive representations learn to become evolvable, and furthermore do so in a way that makes search efficient on difficult coevolutionary problems.
Time-Varying Dynamic Bayesian Networks
Cited by 3 (0 self)
Directed graphical models such as Bayesian networks are a favored formalism for modeling the dependency structures in complex multivariate systems such as those encountered in biology and neural science. When a system is undergoing dynamic transformation, temporally rewiring networks are needed for capturing the dynamic causal influences between covariates. In this paper, we propose time-varying dynamic Bayesian networks (TV-DBN) for modeling the structurally varying directed dependency structures underlying non-stationary biological/neural time series. This is a challenging problem due to the non-stationarity and sample scarcity of time series data. We present a kernel reweighted ℓ1-regularized auto-regressive procedure for this problem which enjoys nice properties such as computational efficiency and provable asymptotic consistency. To our knowledge, this is the first practical and statistically sound method for structure learning of TV-DBNs. We applied TV-DBNs to time series measurements during yeast cell cycle and brain response to visual stimuli. In both cases, TV-DBNs reveal interesting dynamics underlying the respective biological systems.
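The phrase "kernel reweighted ℓ1-regularized auto-regressive procedure" can be made concrete with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the Gaussian kernel, the coordinate-descent solver, the default `bandwidth` and `lam` values, and all function names are hypothetical. For a chosen time point `t_star`, each variable at time t is regressed on all variables at time t-1, with observations near `t_star` weighted more heavily and an ℓ1 penalty encouraging sparse edges.

```python
import math

def kernel_weights(num_transitions, t_star, bandwidth):
    """Gaussian kernel weights over transitions t -> t+1, normalized to sum to 1."""
    raw = [math.exp(-0.5 * ((t + 1 - t_star) / bandwidth) ** 2)
           for t in range(num_transitions)]
    total = sum(raw)
    return [r / total for r in raw]

def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (the lasso proximal step)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0

def weighted_lasso(X, y, w, lam, sweeps=200):
    """Minimize sum_t w[t] * (y[t] - X[t].b)^2 + lam * ||b||_1 by coordinate descent."""
    p = len(X[0])
    coef = [0.0] * p
    for _ in range(sweeps):
        for j in range(p):
            rho = z = 0.0
            for x_row, target, weight in zip(X, y, w):
                partial = sum(x_row[k] * coef[k] for k in range(p) if k != j)
                rho += weight * x_row[j] * (target - partial)
                z += weight * x_row[j] ** 2
            coef[j] = soft_threshold(rho, lam / 2.0) / z if z > 0 else 0.0
    return coef

def tvdbn_at(series, t_star, bandwidth=5.0, lam=0.01):
    """Estimate a sparse matrix A at t_star; A[i][j] links variable j at t-1 to i at t."""
    X, nxt = series[:-1], series[1:]
    w = kernel_weights(len(X), t_star, bandwidth)
    return [weighted_lasso(X, [row[i] for row in nxt], w, lam)
            for i in range(len(series[0]))]
```

Calling `tvdbn_at` at several values of `t_star` yields one coefficient matrix per time point, and the nonzero entries of each matrix can be read as the directed edges active around that time.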
BayCis: A Bayesian Hierarchical HMM for Cis-Regulatory Module Decoding in Metazoan Genomes
Cited by 2 (0 self)
The transcriptional regulatory sequences in metazoan genomes often consist of multiple cis-regulatory modules (CRMs). Each CRM contains locally enriched occurrences of binding sites (motifs) for a certain array of regulatory proteins, capable of integrating, amplifying or attenuating multiple regulatory signals via combinatorial interaction with these proteins. The architecture of CRM organizations is reminiscent of the grammatical rules underlying a natural language, and presents a particular challenge to computational motif and CRM identification in metazoan genomes. In this paper, we present BayCis, a Bayesian hierarchical HMM that attempts to capture the stochastic syntactic rules of CRM organization. Under the BayCis model, all candidate sites are evaluated based on a posterior probability measure that takes into consideration their similarity to known binding sites, their contrasts against local genomic context, their first-order dependencies on upstream sequence elements, as well as priors reflecting general knowledge of CRM structure. We compare our approach to five existing methods for the discovery of CRMs, and demonstrate competitive or superior prediction results evaluated against experimentally based annotations on a comprehensive selection of Drosophila regulatory regions. The software, database and Supplementary Materials will be available at
Small World and Scale-Free Network Topologies in an Artificial Regulatory Network
2004
Cited by 2 (1 self)
Small world and scale-free network topologies commonly exist in natural and artificial systems. Many mechanisms for producing these topologies have been presented in the literature.
Increasing Feasibility of Optimal Gene Network Estimation
Genome Informatics, 2004
Cited by 1 (0 self)
Disentangling networks of regulation of gene expression is a major challenge in the field of computational biology. Harvesting the information contained in microarray data sets is a promising approach towards this challenge. We propose an algorithm for the optimal estimation of Bayesian networks from microarray data, which reduces the CPU time and memory consumption of previous algorithms. We prove that the space complexity can be reduced from O(n ) to O(2 ), and that the expected calculation time can be reduced from O(n ) to O(n ), where n is the number of genes. We make intrinsic use of a limitation of the maximal number of regulators of each gene, which has biological as well as statistical justifications. The improvements are significant for some applications in research.
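To see why limiting the number of regulators per gene matters, one can count the candidate regulator sets a single gene can have. The snippet below is only a counting illustration under that assumption. It is not the paper's algorithm and does not reproduce its complexity bounds, and the function name and parameters are hypothetical.

```python
from math import comb

def candidate_parent_sets(num_genes, max_regulators=None):
    """Count the possible regulator sets for one gene among num_genes genes.

    With no limit, any subset of the other genes is allowed. With a limit k,
    only subsets of size 0..k are counted.
    """
    others = num_genes - 1
    limit = others if max_regulators is None else min(max_regulators, others)
    return sum(comb(others, size) for size in range(limit + 1))
```

For 20 genes, the unrestricted count is 2^19 = 524288 sets per gene, while a limit of three regulators leaves 1 + 19 + 171 + 969 = 1160.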
Divergence of Conserved Non-Coding Sequences: Rate Estimates and Relative Rate Tests
"... Author for correspondence ..."
Practical Computational Methods for Regulatory Genomics: A cisGRN-Lexicon and cisGRN-Browser for Gene Regulatory Networks
Cited by 1 (1 self)
The CYRENE Project focuses on the study of cis-regulatory genomics and gene regulatory networks (GRN) and has three components: a cisGRN-Lexicon, a cisGRN-Browser, and the Virtual Sea Urchin software system. The project has been done in collaboration with Eric Davidson and is deeply inspired by his experimental work in genomic regulatory systems and gene regulatory networks. The current CYRENE cisGRN-Lexicon contains the regulatory architecture of 200 transcription-factor-encoding genes and 100 other regulatory genes in eight species: human, mouse, fruit fly, sea urchin, nematode, rat, chicken, and zebrafish, with higher priority on the first five species. The only regulatory genes included in the cisGRN-Lexicon (CYRENE genes) are those whose regulatory architecture is validated by what we call the Davidson Criterion: they contain sites functionally authenticated by site-specific mutagenesis, conducted in vivo, and followed by gene transfer and functional test. This is recognized as the most stringent experimental validation criterion to date for such a genomic regulatory architecture. The CYRENE cisGRN-Browser is a full genome browser tailored for cis-regulatory annotation and investigation. It began as a branch of the Celera Genome Browser (available as open source at

