Results 1 - 8 of 8
A hypothesis-based approach for identifying the binding specificity of regulatory proteins from chromatin immunoprecipitation data
- Bioinformatics, 2006
Cited by 6 (1 self)
Motivation: Genome-wide chromatin-immunoprecipitation (ChIP-chip) detects binding of transcriptional regulators to DNA in vivo at low resolution. Motif discovery algorithms can be used to discover sequence patterns in the bound regions that may be recognized by the immunoprecipitated protein. However, the discovered motifs often do not agree with the binding specificity of the protein, when it is known. Results: We present a powerful approach to analyzing ChIP-chip data, called THEME, that tests hypotheses concerning the sequence specificity of a protein. Hypotheses are refined using constrained local optimization. Cross-validation provides a principled standard for selecting the optimal weighting of the hypothesis and the ChIP-chip data and for choosing the best refined hypothesis. We demonstrate how to derive hypotheses for proteins from 36 domain families. Using THEME together with these hypotheses, we analyze ChIP-chip datasets for fourteen human and mouse proteins. In all cases the identified motifs are consistent with published data regarding the binding specificity of the proteins. Availability: THEME is freely available for download.
Extracting sequence features to predict protein-DNA interactions: A comparative study
- Nucleic Acids Research, 2008
Cited by 4 (3 self)
Predicting how and where proteins, especially transcription factors (TFs), interact with DNA is an important problem in biology. We present here a systematic study of predictive modeling approaches to the TF-DNA binding problem, which have frequently been shown to be more efficient than methods based only on position-specific weight matrices (PWMs). In these approaches, a statistical relationship between genomic sequences and gene expression or ChIP-binding intensities is inferred through a regression framework, and influential sequence features are identified by variable selection. We examine a few state-of-the-art learning methods including stepwise linear regression, multivariate adaptive regression splines (MARS), neural networks, support vector machines, boosting, and Bayesian additive regression trees (BART). These methods are applied to both simulated datasets and two whole-genome ChIP-chip datasets on the TFs Oct4 and Sox2, respectively, in human embryonic stem cells. We find that, with proper learning methods, predictive modeling approaches can significantly improve the predictive power and identify more biologically interesting features, such as TF-TF interactions, than the PWM approach. In particular, BART and boosting show the best and the most robust overall performance among all the methods.
Predicting transcription factor binding sites using structural knowledge
- Proceedings of the Ninth International Conference on Research in Computational Molecular Biology: Lecture Notes in Computer Science, Volume 3500, 2005
Cited by 3 (1 self)
Current approaches for identification and detection of transcription factor binding sites rely on an extensive set of known target genes. Here we describe a novel structure-based approach applicable to transcription factors with no prior binding data. Our approach combines sequence data and structural information to infer context-specific amino acid-nucleotide recognition preferences. These are used to predict binding sites for novel transcription factors from the same structural family. We apply our approach to the Cys2His2 Zinc Finger protein family, and show that the learned DNA-recognition preferences are compatible with various experimental results. To demonstrate the potential of our algorithm, we use the learned preferences to predict binding site models for novel proteins from the same family. These models are then used in genomic scans to find putative binding sites of the novel proteins.
Structure-based prediction of C2H2 zinc-finger binding specificity: sensitivity to docking geometry
- 2006
doi:10.1093/nar/gkl1155
A graph-based motif detection algorithm models complex nucleotide dependencies in transcription factor binding sites
- 2006
“Doctor of Philosophy”
All cells of a living organism share the same DNA. Yet, they differ in structure, activities and interactions. These differences arise through a tight regulatory system which activates different genes and pathways to fit the cell’s specialization, condition, and requirements. Deciphering the regulatory mechanisms underlying a living cell is one of the fundamental challenges in biology. Such knowledge will allow us to better understand how cells work, how they respond to external stimuli, what goes wrong in diseases like cancer (which often involves disruption of gene regulation), and how it can be fought. In my PhD, I focus on regulation of gene expression from three perspectives. First, I present an innovative algorithm for identifying the target genes of novel transcription factors, based on their protein sequence (Chapter 1). Second, I consider how several transcription factors cooperate to process external stimuli and alter the behavior of the cell (Chapter 2). Finally, I study how the genomic position of nucleosomes and their covalent modifications modulate the accessibility of DNA to transcription factors, thus adding a fascinating dimension to transcriptional regulation (Chapters 3 and 4).
A hypothesis-based approach for identifying the binding specificity of regulatory proteins from chromatin immunoprecipitation data

