A Gibbs sampler for identification of symmetrically structured, spaced DNA motifs with improved estimation of the signal length
, 2005
Integrating genomic data to predict transcription factor binding
- Genome Inform. Ser. Workshop Genome Inform
, 2005
Cited by 8 (1 self)
Transcription factor binding sites (TFBS) in gene promoter regions are often predicted by using position specific scoring matrices (PSSMs), which summarize sequence patterns of experimentally determined TF binding sites. Although PSSMs are more reliable than simple consensus string matching in predicting a true binding site, they generally result in high numbers of false positive hits. This study attempts to reduce the number of false positive matches and generate new predictions by integrating various types of genomic data by two methods: a Bayesian allocation procedure, and support vector machine classification. Several methods will be explored to strengthen the prediction of a true TFBS in the Saccharomyces cerevisiae genome: binding site degeneracy, binding site conservation, phylogenetic profiling, TF binding site clustering, gene expression profiles, GO functional annotation, and k-mer counts in promoter regions. Binding site degeneracy (or redundancy) refers to the number of times a particular transcription factor’s binding motif is discovered in the upstream region of a gene. Phylogenetic conservation takes into account the number of orthologous upstream regions in other genomes that contain a particular binding site. Phylogenetic profiling refers to the presence or
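To make the PSSM idea in the abstract above concrete, here is a minimal sketch of log-odds scoring over a sliding window. It is an illustration only, not the authors' pipeline: the function names, the uniform background, the pseudocount, the toy sites, and the threshold are all assumptions chosen for the example.

```python
import math

ALPHABET = "ACGT"

def build_pssm(sites, background=None, pseudocount=1.0):
    """Build a log-odds matrix from aligned, equal-length binding sites.

    The uniform background and the pseudocount are illustrative assumptions.
    """
    background = background or {base: 0.25 for base in ALPHABET}
    pssm = []
    for pos in range(len(sites[0])):
        counts = {base: pseudocount for base in ALPHABET}
        for site in sites:
            counts[site[pos]] += 1
        total = sum(counts.values())
        pssm.append({base: math.log2((counts[base] / total) / background[base])
                     for base in ALPHABET})
    return pssm

def scan(sequence, pssm, threshold):
    """Return (start, window, score) for every window scoring >= threshold."""
    width = len(pssm)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = sum(pssm[i][base] for i, base in enumerate(window))
        if score >= threshold:
            hits.append((start, window, score))
    return hits

# Toy data (hypothetical, not taken from any study listed here).
toy_sites = ["TGACTC", "TGACTA", "TGAGTC", "TGACTC"]
toy_pssm = build_pssm(toy_sites)
toy_hits = scan("AATGACTCGGTGAGTAAC", toy_pssm, threshold=4.0)
```

Lowering the threshold admits more windows, which is the false-positive problem the abstract describes; the additional genomic evidence it lists is meant to filter such hits.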
SeqVISTA: a new module of integrated computational tools for studying transcriptional regulation. Nucleic Acids Res. 2004 Jul 1;32(Web Server
Cited by 2 (0 self)
Transcriptional regulation is one of the most basic regulatory mechanisms in the cell. The accumulation of multiple metazoan genome sequences and the advent of high-throughput experimental techniques have motivated the development of a large number of bioinformatics methods for the detection of regulatory motifs. The regulatory process is extremely complex and individual computational algorithms typically have very limited success in genome-scale studies. Here, we argue the importance of integrating multiple computational algorithms and present an infrastructure that integrates eight web services covering key areas of transcriptional regulation. We have adopted the client-side integration technology and built a consistent input and output environment with a versatile visualization tool named SeqVISTA. The infrastructure will allow for easy integration of gene regulation analysis software that is scattered over the Internet. It will also enable bench biologists to perform an arsenal of analysis using cutting-edge methods in a familiar environment and bioinformatics researchers to focus on developing new algorithms without the need to invest substantial effort on complex pre- or post-processors. SeqVISTA is freely available to academic users and can be launched online at
Genome-wide Analysis of Functions Regulated by Sets of Transcription Factors
Cited by 1 (0 self)
We present a pipeline for inferring biological functions regulated by a combinatorial interaction of transcription factors. Using a robust statistical method, the pipeline intersects the presence of transcription factor binding sites in gene upstream sequences with Gene Ontology terms associated with these genes. Positional frequency matrices for the transcription factors constitute the input of the pipeline, and significantly enriched biological processes are reported as the output. We demonstrate the usage of the pipeline using two groups of transcription factors: a cell-cycle related family of E2F factors and an NFAT/AP-1 pair involved in immune response. In both cases the reported results agree well with experimental knowledge. Furthermore, for the NFAT/AP-1 composite element, novel functions are predicted.
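The abstract does not name its statistical method, so the following is only a sketch of one common way to test whether a Gene Ontology term is over-represented among genes carrying a binding site: a one-sided hypergeometric tail probability. The choice of test, the function names, and the toy gene sets are assumptions for illustration, not the pipeline described above.

```python
from math import comb

def hypergeom_tail(population, annotated, sample, overlap):
    """P(X >= overlap) when `sample` genes are drawn without replacement
    from `population` genes, of which `annotated` carry the GO term."""
    upper = min(annotated, sample)
    favourable = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(overlap, upper + 1)
    )
    return favourable / comb(population, sample)

def go_enrichment(genes_with_sites, go_annotations, all_genes):
    """Map each GO term to its enrichment p-value among genes_with_sites."""
    p_values = {}
    for term, members in go_annotations.items():
        members = members & all_genes
        overlap = len(genes_with_sites & members)
        p_values[term] = hypergeom_tail(
            len(all_genes), len(members), len(genes_with_sites), overlap
        )
    return p_values

# Toy data (hypothetical gene identifiers and annotations).
all_genes = {f"g{i}" for i in range(20)}
genes_with_sites = {"g0", "g1", "g2", "g3", "g4"}
go_annotations = {
    "cell cycle": {"g0", "g1", "g2", "g3", "g10"},
    "transport": {"g4", "g11", "g12", "g13", "g14"},
}
p_values = go_enrichment(genes_with_sites, go_annotations, all_genes)
```

In a genome-wide run many terms are tested at once, so a multiple-testing correction would normally be applied to these p-values before reporting enriched processes.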
DISCOVER: a feature-based discriminative method for motif search in complex genomes
Vol. 25 ISMB 2009, pages i321–i329 doi:10.1093/bioinformatics/btp230
USING TRANSCRIPTION FACTOR BINDING SITE CO-OCCURRENCE TO PREDICT REGULATORY REGIONS
Transcription factors (TFs) bind to the regulatory regions of genes in a cooperative manner. This article describes a method to detect pairs of transcription factor binding sites that co-occur in known regulatory regions more often than expected from the frequencies of the individual binding sites alone. We determine frequently co-occurring TF pairs and evaluate the method using known TF interactions. Furthermore, we use co-occurrence scores to assess the regulatory potential of a sequence region by calculating a graph-based score. We show results for the score on known regulatory regions.
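The notion of co-occurring "more often than expected" can be sketched as a log-ratio of observed pair counts to the count expected if the two sites appeared independently. This is an assumed scoring scheme for illustration; the abstract does not give the article's actual score or its graph-based extension, and the function name and toy regions below are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations

def cooccurrence_scores(regions):
    """Score each TF pair by log2(observed / expected) co-occurrence.

    `regions` is a list of sets, each holding the TFs with a predicted
    site in one regulatory region. The expected count assumes the two
    TFs occur independently across regions.
    """
    n_regions = len(regions)
    single = Counter()
    pair = Counter()
    for tfs in regions:
        single.update(tfs)
        pair.update(combinations(sorted(tfs), 2))
    scores = {}
    for (a, b), observed in pair.items():
        expected = single[a] * single[b] / n_regions
        scores[(a, b)] = math.log2(observed / expected)
    return scores

# Toy data (hypothetical TF labels, not real binding data).
toy_regions = [
    {"TF_A", "TF_B"},
    {"TF_A", "TF_B"},
    {"TF_A", "TF_B", "TF_C"},
    {"TF_C"},
    {"TF_C", "TF_D"},
    {"TF_D"},
    {"TF_C"},
    {"TF_C", "TF_D"},
]
toy_scores = cooccurrence_scores(toy_regions)
best_pair = max(toy_scores, key=toy_scores.get)
```

Positive scores mark pairs seen together more often than independence predicts, negative scores mark pairs seen together less often. Pairs that never co-occur are simply absent from the result.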
Supervised by
The massive advancements in genomics over recent years have not only provided scope to examine what is shared between the genomes of multiple species but also a unique opportunity to investigate what is responsible for the differences between species of interest. By comparing the proteomes of two species, certain genes can be clustered and defined as ‘inparalogs’: duplicated genes that are unique to each of the species in question, having arisen after the speciation event that separated the two lineages. Here I report on several analyses that make use of inparalogous genes identified in the mouse genome with reference to its close relative, the rat. Firstly, I describe the implementation of a novel investigative procedure that identifies regions of intragenomic conservation within the upstream sequences of inparalogous gene clusters and report upon the level of resolution that this approach offers with respect to identifying regulatory elements in genomic sequences. In addition to this study, I also describe an investigation into the density of interspersed repeat elements observed in the neighbourhood of

