Identifying Target Sites for Cooperatively Binding Factors
, 2001
Cited by 98 (3 self)
Motivation: Transcriptional activation in eukaryotic organisms normally requires combinatorial interactions of multiple transcription factors. Though several methods exist for identification of individual protein binding site patterns in DNA sequences, there are few methods for discovery of binding site patterns for cooperatively acting factors. Here we present an algorithm, Co-Bind (for COoperative BINDing), for discovering DNA target sites for cooperatively acting transcription factors. The method utilizes a Gibbs sampling strategy to model the cooperativity between two transcription factors and defines position weight matrices for the binding sites. Sequences from both the training set and the entire genome are taken into account, in order to discriminate against commonly occurring patterns in the genome and produce patterns which are significant only in the training set. Results: We have tested Co-Bind on semi-synthetic and real data sets to show it can efficiently identify DNA target site patterns for cooperatively binding transcription factors. In cases where binding site patterns are weak and cannot be identified by other available methods, Co-Bind, by virtue of modeling the cooperativity between factors, can identify those sites efficiently. Though developed to model protein-DNA interactions, the scope of Co-Bind may be extended to combinatorial, sequence-specific interactions in other macromolecules. Availability: The program is available upon request from the authors or may be downloaded from http://ural.wustl.edu. Contact: dg@genetics.wustl.edu; stormo@genetics.wustl.edu
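The Gibbs sampling strategy mentioned in this abstract can be illustrated with a minimal single-motif site sampler. This is not Co-Bind: it does not model cooperativity between two factors, and the function names (`build_pwm`, `gibbs_site_sampler`), the pseudocount, and the fixed `background` frequencies are assumptions made for illustration. It shows only the core loop: hold one sequence out, build a position weight matrix from the sites in the other sequences, and resample the held-out site in proportion to its motif-versus-background likelihood ratio.

```python
import random

BASES = "ACGT"

def build_pwm(sites, pseudocount=1.0):
    """Column-wise base probabilities estimated from equal-length sites."""
    width = len(sites[0])
    pwm = []
    for j in range(width):
        counts = {b: pseudocount for b in BASES}
        for site in sites:
            counts[site[j]] += 1
        total = sum(counts.values())
        pwm.append({b: counts[b] / total for b in BASES})
    return pwm

def gibbs_site_sampler(seqs, width, background, iters=200, seed=0):
    """Sample one site start per sequence (needs at least two sequences).

    background maps each base to its assumed genome-wide frequency, so
    patterns that are common everywhere receive lower weights.
    """
    rng = random.Random(seed)
    pos = [rng.randrange(len(s) - width + 1) for s in seqs]
    for _ in range(iters):
        for i, seq in enumerate(seqs):
            # Build the motif model from every sequence except the held-out one.
            others = [s[p:p + width]
                      for k, (s, p) in enumerate(zip(seqs, pos)) if k != i]
            pwm = build_pwm(others)
            # Likelihood ratio (motif vs. background) for each candidate start.
            weights = []
            for start in range(len(seq) - width + 1):
                w = 1.0
                for j in range(width):
                    base = seq[start + j]
                    w *= pwm[j][base] / background[base]
                weights.append(w)
            # Draw the new site in proportion to its weight.
            pos[i] = rng.choices(range(len(weights)), weights=weights)[0]
    return pos
```

Calling `gibbs_site_sampler(seqs, 6, {b: 0.25 for b in "ACGT"})` returns one sampled start position per sequence. Because the sampler is stochastic, a practical run would typically be repeated from several seeds.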
Gibbs recursive sampler: finding transcription factor binding sites
- Nucleic Acids Res
, 2003
Cited by 92 (7 self)
The Gibbs Motif Sampler is a software package for locating common elements in collections of biopolymer sequences. In this paper we describe a new variation of the Gibbs Motif Sampler, the Gibbs Recursive Sampler, which has been developed specifically for locating multiple transcription factor binding sites for multiple transcription factors simultaneously in unaligned DNA sequences that may be heterogeneous in DNA composition. Here we describe the basic operation of the web-based version of this sampler. The sampler may be accessed at
STAMP: a web tool for exploring DNA-binding motif similarities
- Nucleic Acids Res, 35(Web Server issue)
, 2007
"... doi:10.1093/nar/gkm272 ..."
Modeling within-motif dependence for transcription factor binding site predictions
- Bioinformatics
, 2004
Cited by 70 (6 self)
Motivation: The position-specific weight matrix (PWM) model, which assumes that each position in the DNA site contributes independently to the overall protein–DNA interaction, has been the primary means to describe transcription factor binding site motifs. Recent biological experiments, however, suggest that there exists interdependence among positions in the binding sites. In order to exploit this interdependence to aid motif discovery, we extend the PWM model to include pairs of correlated positions and design a Markov chain Monte Carlo algorithm to sample in the model space. We then combine the model sampling step with the Gibbs sampling framework for de novo motif discoveries. Results: Testing on experimentally validated binding sites, we find that about 25% of the transcription factor binding motifs show significant within-site position correlations, and 80% of these motif models can be improved by considering the correlated positions. Using both simulated data and real promoter sequences, we show that the new de novo motif-finding algorithm can infer the true correlated position pairs accurately and is more precise in finding putative transcription factor binding sites than the standard Gibbs sampling algorithms.
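One simple way to see what "within-site position correlations" means is to measure the mutual information between two columns of a set of aligned binding sites. The sketch below is an illustration only, not the Markov chain Monte Carlo model-sampling procedure the abstract describes; the function names `mutual_information` and `rank_position_pairs` are hypothetical. A value near zero is what the independence assumption of a plain PWM predicts, while larger values suggest a correlated pair.

```python
import math
from collections import Counter
from itertools import combinations

def mutual_information(sites, i, j):
    """Mutual information (in bits) between columns i and j of aligned sites."""
    n = len(sites)
    count_i = Counter(s[i] for s in sites)
    count_j = Counter(s[j] for s in sites)
    count_ij = Counter((s[i], s[j]) for s in sites)
    mi = 0.0
    for (a, b), c in count_ij.items():
        # p(a,b) * log2( p(a,b) / (p(a) * p(b)) ), written with raw counts.
        mi += (c / n) * math.log2(c * n / (count_i[a] * count_j[b]))
    return mi

def rank_position_pairs(sites):
    """All column pairs, sorted from most to least dependent."""
    width = len(sites[0])
    pairs = [((i, j), mutual_information(sites, i, j))
             for i, j in combinations(range(width), 2)]
    return sorted(pairs, key=lambda item: item[1], reverse=True)
```

With small samples, empirical mutual information is biased upward, so a real analysis would need a significance test or a correction before declaring a pair correlated.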
From Promoter Sequence to Expression: A Probabilistic Framework
, 2001
Cited by 69 (8 self)
We present a probabilistic framework that models the process by which transcriptional binding explains the mRNA expression of different genes. Our joint probabilistic model unifies the two key components of this process: the prediction of gene regulation events from sequence motifs in the gene’s promoter region, and the prediction of mRNA expression from combinations of gene regulation events in different settings. Our approach has several advantages. By learning promoter sequence motifs that are directly predictive of expression data, it can improve the identification of binding site patterns. It is also able to identify combinatorial regulation via interactions of different transcription factors. Finally, the general framework allows us to integrate additional data sources, including data from the recent binding localization assays. We demonstrate our approach on the cell cycle data of Spellman et al., combined with the binding localization information of Simon et al. We show that the learned model predicts expression from sequence, and that it identifies coherent co-regulated groups with significant transcription factor motifs. It also provides valuable biological insight into the domain via these co-regulated “modules” and the combinatorial regulation effects that govern their behavior.
A hybrid micro-macroevolutionary approach to gene tree reconstruction
- J. Comput. Biol
, 2006
Cited by 64 (4 self)
Gene family evolution is determined by microevolutionary processes (e.g., point mutations) and macroevolutionary processes (e.g., gene duplication and loss), yet macroevolutionary considerations are rarely incorporated into gene phylogeny reconstruction methods. We present a dynamic program to find the most parsimonious gene family tree with respect to a macroevolutionary optimization criterion, the weighted sum of the number of gene duplications and losses. The existence of a polynomial delay algorithm for duplication/loss phylogeny reconstruction stands in contrast to most formulations of phylogeny reconstruction, which are NP-complete. We next extend this result to obtain a two-phase method for gene tree reconstruction that takes both micro- and macroevolution into account. In the first phase, a gene tree is constructed from sequence data, using any of the previously known algorithms for gene phylogeny construction. In the second phase, the tree is refined by rearranging regions of the tree that do not have strong support in the sequence data to minimize the duplication/loss cost. Components of the tree with strong support are left intact. This hybrid approach incorporates both micro- and macroevolutionary considerations, yet its computational requirements are modest in practice because the two-phase approach constrains the search space. Our hybrid algorithm can
A Simple Hyper-Geometric Approach for Discovering Putative Transcription Factor Binding Sites
- Algorithms in Bioinformatics: Proc. First International Workshop, number 2149 in LNCS
, 2001
Cited by 59 (6 self)
A central issue in molecular biology is understanding the regulatory mechanisms that control gene expression. The recent flood of genomic and post-genomic data opens the way for computational methods elucidating the key components that play a role in these mechanisms.
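The abstract excerpt does not spell out the method, so the following is only a generic sketch of a hypergeometric enrichment test of the kind the title suggests, not the authors' procedure. It assumes a setup in which `N` promoters are considered in total, `K` of them contain a candidate pattern, and a selected group of `n` promoters contains the pattern `k` times; the function name and parameter names are assumptions. The tail probability P(X >= k) then indicates how surprising that overlap would be by chance.

```python
from math import comb

def hypergeom_tail(N, K, n, k):
    """P(X >= k) for X ~ Hypergeometric(population N, K marked, n drawn)."""
    total = comb(N, n)
    upper = min(K, n)
    # comb() returns 0 when more unmarked items are requested than exist,
    # so impossible outcomes contribute nothing to the sum.
    hits = sum(comb(K, x) * comb(N - K, n - x) for x in range(k, upper + 1))
    return hits / total
```

For example, `hypergeom_tail(10, 5, 5, 5)` is the chance that all 5 drawn items are among the 5 marked ones, which is 1 in 252. When many candidate patterns are scored this way, the resulting p-values need a multiple-testing correction.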
Computational identification of transcriptional regulatory elements in DNA sequence
, 2006
Cited by 55 (0 self)
Identification and annotation of all the functional elements in the genome, including genes and the regulatory sequences, is a fundamental challenge in genomics and computational biology. Since regulatory elements are frequently short and variable, their identification and discovery using computational algorithms is difficult. However, significant advances have been made in the computational methods for modeling and detection of DNA regulatory elements. The availability of complete genome sequence from multiple organisms, as well as mRNA profiling and high-throughput experimental methods for mapping protein-binding sites in DNA, have contributed to the development of methods that utilize these auxiliary data to inform the detection of transcriptional regulatory elements. Progress is also being made in the identification of cis-regulatory modules and higher order structures of the regulatory sequences, which is essential to the understanding of transcription regulation in the metazoan genomes. This article reviews the computational approaches for modeling and identification of genomic regulatory elements, with an emphasis on the recent developments, and current challenges.
NestedMICA: sensitive inference of over-represented motifs in nucleic acid sequence
- Nucleic Acids Res
Cited by 55 (1 self)
NestedMICA is a new, scalable, pattern-discovery system for finding transcription factor binding sites and similar motifs in biological sequences. Like several previous methods, NestedMICA tackles this problem by optimizing a probabilistic mixture model to fit a set of sequences. However, the use of a newly developed inference strategy called Nested Sampling means NestedMICA is able to find optimal solutions without the need for a problematic initialization or seeding step. We investigate the performance of NestedMICA in a range of scenarios, on synthetic data and a well-characterized set of muscle regulatory regions, and compare it with the popular MEME program. We show that the new method is significantly more sensitive than MEME: in one case, it successfully extracted a target motif from background sequence four times longer than could be handled by the existing program. It also performs robustly on synthetic sequences containing multiple significant motifs. When tested on a real set of regulatory sequences, NestedMICA produced motifs which were good predictors for all five abundant classes of annotated binding sites.
The COOH-terminal domain of Myo2p, a yeast myosin V, has a direct role in secretory vesicle targeting
- J. Cell
, 1999
Cited by 51 (7 self)
Abstract. MYO2 encodes a type V myosin heavy chain needed for the targeting of vacuoles and secretory vesicles to the growing bud of yeast. Here we describe new myo2 alleles containing conditional lethal mutations in the COOH-terminal tail domain. Within 5 min of shifting to the restrictive temperature, the polarized distribution of secretory vesicles is abolished without affecting the distribution of actin or the mutant Myo2p, showing that the tail has a direct role in vesicle targeting. We also show that the actin cable–dependent translocation of Myo2p to growth sites does not require secretory vesicle cargo. Although a fusion protein containing the Myo2p tail also concentrates at growth sites, this accumulation depends on the polarized delivery of secretory vesicles, implying that the Myo2p tail binds to secretory vesicles. Most of the new mutations alter a region of the Myo2p tail conserved with vertebrate myosin Vs but divergent from Myo4p, the myosin V involved in mRNA transport, and genetic data suggest that the tail interacts with Smy1p, a kinesin homologue, and Sec4p, a vesicle-associated Rab protein. The data support a model in which the Myo2p tail tethers secretory vesicles, and the motor transports them down polarized actin cables to the site of exocytosis. Key words: cell polarity • myosin V • MYO2 gene product • exocytosis • Saccharomyces cerevisiae