Computational Identification of Cis-regulatory Elements Associated with Groups of Functionally Related Genes in . . .
- J. MOL. BIOL
, 2000
Cited by 153 (7 self)
... runs on randomly selected sets of genes and on sets of genes whose upstream regions contain known transcription factor binding sites serve as controls.
A Comprehensive Library of DNA-binding Site Matrices for 55 Proteins Applied to the Complete Escherichia coli K-12 Genome
- J. Mol. Biol
, 1998
Cited by 64 (1 self)
Introduction: Sequence-specific DNA-binding proteins perform a multitude of roles in a living cell and regulate a variety of processes including transcription. Escherichia coli contains at least 240 proteins that are known or predicted to be DNA-binding proteins (Robison, 1997). Known binding sites for a DNA-binding protein can be used to identify additional sites for that protein, and thereby identify further genes regulated by that protein (Wasserman & Fickett, 1998; Tronche et al., 1997; Fondrat & Kalogeropoulos, 1996; Goodrich et al., 1990; Lewis et al., 1994; Ramseier et al., 1995; Stormo, 1990; Verbeek et al., 1990). A number of approaches have been used to search for additional sites, including searches using consensus sequences, and searches using position weight matrices. Fondrat & Kalogeropoulos (1996) used a precise set of rules and constraints together with a degenerate co
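To illustrate the position weight matrix searches this abstract mentions, here is a minimal Python sketch. It is not the authors' procedure: the toy sites, the pseudocount of 1, the uniform 0.25 background, and the function names `build_pwm` and `scan` are all assumptions made for the example.

```python
import math

def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Build a log-odds position weight matrix from equal-length aligned sites."""
    length = len(sites[0])
    total = len(sites) + 4 * pseudocount
    pwm = []
    for i in range(length):
        column = [site[i] for site in sites]
        pwm.append({
            base: math.log2((column.count(base) + pseudocount) / total / background)
            for base in "ACGT"
        })
    return pwm

def scan(sequence, pwm):
    """Return (start, score) for every window of the sequence."""
    width = len(pwm)
    return [
        (start, sum(pwm[i][sequence[start + i]] for i in range(width)))
        for start in range(len(sequence) - width + 1)
    ]

# Hypothetical toy sites, invented for illustration only.
toy_sites = ["TTGACA", "TTGACA", "TTGATA", "TTTACA"]
pwm = build_pwm(toy_sites)
hits = scan("GGCTTGACAGG", pwm)
best_position, best_score = max(hits, key=lambda hit: hit[1])
```

In practice a score threshold would decide which windows count as candidate sites. How to choose that threshold is outside this sketch.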
Finding composite regulatory patterns in DNA sequences
- Bioinformatics
, 2002
Cited by 59 (3 self)
Pattern discovery in unaligned DNA sequences is a fundamental problem in computational biology with important applications in finding regulatory signals. Current approaches to pattern discovery focus on monad patterns that correspond to relatively short contiguous strings. However, many of the actual regulatory signals are composite patterns that are groups of monad patterns that occur near each other. A difficulty in discovering composite patterns is that one or both of the component monad patterns in the group may be “too weak”. Since the traditional monad-based motif finding algorithms usually output one (or a few) high scoring patterns, they often fail to find composite regulatory signals consisting of weak monad parts. In this paper, we present a MITRA (MIsmatch TRee Algorithm) approach for discovering composite signals. We demonstrate that MITRA performs well for both monad and composite patterns by presenting experiments over biological and synthetic data. Availability: MITRA is available at
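The idea of a composite pattern, two monad patterns occurring near each other with some mismatches allowed, can be sketched with a brute-force search. This is not the MITRA algorithm, which the abstract does not describe in detail; the function names, the toy sequence, and the gap and mismatch parameters are hypothetical.

```python
def hamming(a, b):
    """Count mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))

def find_monad(sequence, pattern, max_mismatches):
    """Return start positions where the pattern matches within the mismatch limit."""
    k = len(pattern)
    return [
        i for i in range(len(sequence) - k + 1)
        if hamming(sequence[i:i + k], pattern) <= max_mismatches
    ]

def find_composite(sequence, left, right, min_gap, max_gap, max_mismatches=1):
    """Return (left_start, right_start) pairs where the right monad begins
    min_gap..max_gap bases after the left monad ends."""
    right_starts = set(find_monad(sequence, right, max_mismatches))
    pairs = []
    for i in find_monad(sequence, left, max_mismatches):
        left_end = i + len(left)
        for gap in range(min_gap, max_gap + 1):
            if left_end + gap in right_starts:
                pairs.append((i, left_end + gap))
    return pairs

# Hypothetical toy sequence, invented for illustration only.
toy_sequence = "AATTGACAGGGGGTATAATCC"
matches = find_composite(toy_sequence, "TTGACA", "TATAAT", min_gap=3, max_gap=6)
```

This sketch searches for known monads. Discovering unknown weak monads, the problem the abstract addresses, requires a pattern-discovery algorithm rather than a scan.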
Identifying Target Sites for Cooperatively Binding Factors
, 2001
Cited by 54 (3 self)
Motivation: Transcriptional activation in eukaryotic organisms normally requires combinatorial interactions of multiple transcription factors. Though several methods exist for identification of individual protein binding site patterns in DNA sequences, there are few methods for discovery of binding site patterns for cooperatively acting factors. Here we present an algorithm, Co-Bind (for COperative BINDing), for discovering DNA target sites for cooperatively acting transcription factors. The method utilizes a Gibbs sampling strategy to model the cooperativity between two transcription factors and defines position weight matrices for the binding sites. Sequences from both the training set and the entire genome are taken into account, in order to discriminate against commonly occurring patterns in the genome, and produce patterns which are significant only in the training set. Results: We have tested Co-Bind on semi-synthetic and real data sets to show it can efficiently identify DNA target site patterns for cooperatively binding transcription factors. In cases where binding site patterns are weak and cannot be identified by other available methods, Co-Bind, by virtue of modeling the cooperativity between factors, can identify those sites efficiently. Though developed to model protein-DNA interactions, the scope of Co-Bind may be extended to combinatorial, sequence-specific interactions in other macromolecules. Availability: The program is available upon request from the authors or may be downloaded from http://ural.wustl.edu. Contact: dg@genetics.wustl.edu; stormo@genetics.wustl.edu
Eukaryotic Promoter Recognition
- Genome Res
, 1997
Cited by 53 (0 self)
http://gnomic.stanford.edu/~chris/GENSCANW.html). Because the signals that control the start and stop of transcription and translation, and the location of splicing, are still not very well understood, it is not uncommon for a gene-finding algorithm to confuse internal with initial and terminal exons, thus wrongly partitioning the exons. The problem is compounded by our incomplete understanding of alternative splicing control elements. Another line of development in gene identification is based on homology (e.g., Gish and States 1993; Gelfand et al. 1996). If there is a close homolog in the databases to one of the genes in the sequence under analysis, sequence similarity will usually group the exons for this gene correctly. Still, in many cases there is no close homolog and no guarantee when there is some homolog that the encoded protein lacks insertions/deletions. Clearly, some means of recognizing the beginnings of genes, probably via the promoter, or the ends, probabl
Weeder Web: discovery of transcription factor binding sites in a set of sequences from co-regulated genes
- Nucleic Acids Res
, 2004
Cited by 46 (1 self)
One of the greatest challenges facing modern molecular biology is understanding the complex mechanisms that regulate gene expression. A fundamental step in this process is the characterization of regulatory motifs that play key roles in the regulation of gene expression at the transcriptional and post-transcriptional levels. In particular, transcription is modulated by the interaction of transcription factors with their corresponding binding sites. Weeder Web is a web interface to Weeder, an algorithm for the automatic discovery of conserved motifs in a set of related regulatory DNA sequences. The motifs found are in turn likely to be instances of binding sites for some transcription factor. Besides providing access to the program, the interface has been designed to make the program as simple as possible to use, and to require very little prior knowledge about the length and the conservation of the motifs to be found. In fact, the interface automatically starts different runs of the program, each one with different parameters, and provides the user with an overall summary of the results as well as some ‘advice’ on which motifs look more interesting according to their statistical significance and some simple considerations. The web interface is available at the address www.pesolelab.it by following the ‘Tools’ link.
Identification of sparsely distributed clusters of cis-regulatory elements in sets of co-expressed genes
, 2004
A Structure-Based Approach for Prediction of Protein Binding Sites in Gene Upstream Regions
- Pac. Symp. Biocomput
, 2001
Cited by 17 (0 self)
Introduction: Sequences upstream of transcription start positions play a major role in the regulation of gene expression. They are recognized by regulatory proteins which act upon binding as transcription repressors or activators, controlling the rate of transcription initiation. The identification of such sequences upstream from a specific gene is therefore essential for understanding its transcription regulation. Traditionally, the identification of DNA regulatory sequences and of the base pairs that play a role in specific binding has been carried out by a variety of experimental methods. These include mutation analysis and direct binding measurements (e.g. Takeda et al., 1989), selection experiments by phage display libraries (e.g. Choo & Klug, 1994), and co-crystallization of the protein-DNA complex (e.g. Kim & Burley, 1994). Presently, with the accumulation of many new gene sequences due to the large-scale genomic sequencing projects, we are faced by the challenge o
Data Mining for Regulatory Elements in Yeast Genome
, 1997
Cited by 12 (3 self)
We have examined methods and developed a general software tool for finding and analyzing combinations of transcription factor binding sites that occur relatively often in gene upstream regions (putative promoter regions) in the yeast genome. Such frequently occurring combinations may be essential parts of possible promoter classes. The regions upstream of all genes were first isolated from the yeast genome database MIPS using the information in the annotation files of the database. The ones that do not overlap with coding regions were chosen for further studies. Next, all occurrences of the yeast transcription factor binding sites, as given in the IMD database, were located in the genome and in the selected regions in particular. Finally, by using general-purpose data mining software in combination with our own software, which parametrizes the search, we can find the combinations of binding sites that occur in the upstream regions more frequently than would be expected on the basis o...
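The final step described above, comparing how often binding-site combinations co-occur in upstream regions against an expectation, can be sketched for pairs of sites as follows. The abstract is truncated before it states the baseline, so the independence assumption used here is a guess. The region names, site names, and the function `pair_enrichment` are hypothetical and do not come from the MIPS or IMD databases.

```python
from collections import Counter
from itertools import combinations

def pair_enrichment(region_sites):
    """region_sites maps a region name to the set of site names found in it.

    Returns {(site_a, site_b): (observed, expected)}, where expected is the
    number of regions containing both sites if they occurred independently
    (an assumed baseline, not one taken from the abstract).
    """
    n_regions = len(region_sites)
    single = Counter()
    pairs = Counter()
    for sites in region_sites.values():
        single.update(sites)
        pairs.update(combinations(sorted(sites), 2))
    return {
        pair: (observed, single[pair[0]] * single[pair[1]] / n_regions)
        for pair, observed in pairs.items()
    }

# Hypothetical toy data, invented for illustration only.
toy_regions = {
    "gene1": {"A", "B"},
    "gene2": {"A", "B"},
    "gene3": {"C"},
    "gene4": {"C"},
}
result = pair_enrichment(toy_regions)
```

Here sites A and B co-occur in two regions against an expected count of one. A real analysis would also need a significance test and larger combinations than pairs.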

