## Regulatory Motif Discovery: from Decoding to Meta-Analysis

### BibTeX

@MISC{Zhou_regulatorymotif,

author = {Qing Zhou and Mayetri Gupta},

title = {Regulatory Motif Discovery: from Decoding to Meta-Analysis},

year = {}

}

### Abstract

Gene transcription is regulated by interactions between transcription factors and their target binding sites in the genome. A motif is the sequence pattern recognized by a transcription factor to mediate such interactions. With the availability of high-throughput genomic data, computational identification of transcription factor binding motifs has become a major research problem in computational biology and bioinformatics. In this chapter, we present a series of Bayesian approaches to motif discovery. We start from a basic statistical framework for motif finding, extend it to the identification of cis-regulatory modules, and then discuss methods that combine motif finding with phylogenetic footprinting, gene expression or ChIP-chip data, and nucleosome positioning information. Simulation studies and applications to biological data sets are presented to illustrate the utility of these methods.
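As a rough illustration of what a "sequence pattern" means in practice: a motif is commonly summarized as a position weight matrix (PWM), and candidate sites are scored against a background model. The sketch below is a minimal Python example, not the chapter's Bayesian model. The names `EXAMPLE_PWM`, `UNIFORM_BACKGROUND`, `log_odds_score`, and `best_site` are hypothetical, the probabilities are made up, and the background is assumed to be independent and uniform.

```python
import math

# Hypothetical 3-column PWM (made-up numbers): one dict of nucleotide
# probabilities per motif position. It favors the word "TGA".
EXAMPLE_PWM = [
    {"A": 0.05, "C": 0.05, "G": 0.05, "T": 0.85},
    {"A": 0.05, "C": 0.05, "G": 0.85, "T": 0.05},
    {"A": 0.85, "C": 0.05, "G": 0.05, "T": 0.05},
]
# Assumed independent, uniform background model.
UNIFORM_BACKGROUND = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}


def log_odds_score(site, pwm, background):
    """Sum over positions of log(motif probability / background probability)."""
    if len(site) != len(pwm):
        raise ValueError("site length must equal motif width")
    return sum(math.log(pwm[j][c] / background[c]) for j, c in enumerate(site))


def best_site(sequence, pwm, background):
    """Return (score, start) of the highest-scoring window in the sequence."""
    width = len(pwm)
    if len(sequence) < width:
        raise ValueError("sequence is shorter than the motif")
    scored = [
        (log_odds_score(sequence[i:i + width], pwm, background), i)
        for i in range(len(sequence) - width + 1)
    ]
    return max(scored)


score, start = best_site("CCTGACC", EXAMPLE_PWM, UNIFORM_BACKGROUND)
```

Scanning with a fixed PWM is only the "decoding" half of the problem; in de novo discovery the matrix itself is unknown and has to be estimated along with the site positions.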

### Citations

8844 | Maximum likelihood from incomplete data via the EM algorithm
- Dempster, Laird, et al.
- 1977
Citation Context ...s a probabilistic model to describe a fuzzy word. An early motif-finding approach was CONSENSUS, an information theory-based progressive alignment procedure [42]. Other methods included an EM algorithm =-=[11]-=- based on a missing-data formulation [24], and a Gibbs sampling algorithm [23]. Later generalizations that allowed for a variable number of motif sites per sequence were a Gibbs sampler [28, 33] and a... |

4556 | A tutorial on hidden Markov models and selected applications in speech recognition
- Rabiner
- 1989
Citation Context ...arrays pose considerable challenges for data analysis. These arrays involve short overlapping probes covering the genome, which induces a spatial data structure. Although hidden Markov models or HMMs =-=[35]-=- may be used to accommodate such spatial structure, they induce an exponentially decaying distribution of state lengths, and are not directly appropriate for assessing structural features such as nucl... |
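The "exponentially decaying distribution of state lengths" mentioned in this context follows directly from the Markov property. As a short worked derivation (the symbols $a$ and $D$ are introduced here for illustration and do not come from the chapter): if a state has self-transition probability $a$, the number of consecutive positions $D$ spent in that state is geometrically distributed.

```latex
P(D = d) = a^{\,d-1}(1 - a), \quad d = 1, 2, \dots, \qquad
\mathbb{E}[D] = \frac{1}{1 - a}.
```

The probability is largest at $d = 1$ and shrinks by a factor $a$ for each additional position, so a single plain HMM state cannot favor a characteristic length, whatever value of $a$ is chosen.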

1095 |
Evolutionary trees from DNA sequences: a maximum likelihood approach
- Felsenstein
- 1981
Citation Context ...s defined as the background mutation rate. ZW assume independent evolution for each position (column) of a motif under the nucleotide substitution model of Felsenstein =-=[13]-=-. Suppose the weight vector of a particular position in the motif is θ. The ancestral nucleotide, denoted by Z, is assumed to follow a discrete distribution with the probability vector θ on {A, C, G, ... |
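The substitution model referred to here can be sketched in a few lines. The Python example below is a minimal illustration, assuming the standard Felsenstein (1981) form in which a nucleotide is kept with probability exp(-u) and otherwise redrawn from the equilibrium vector θ. The function names, the branch-length parameter `u`, and the star-shaped tree (all descendants evolving independently from the ancestor Z) are assumptions made for this sketch; the chapter's exact parametrization is not visible in the excerpt.

```python
import math

NUCLEOTIDES = "ACGT"


def f81_transition(theta, u):
    """4x4 matrix P[x][y]: keep x with prob exp(-u), else redraw y from theta."""
    keep = math.exp(-u)
    return [
        [keep * (1.0 if x == y else 0.0) + (1.0 - keep) * theta[y]
         for y in range(4)]
        for x in range(4)
    ]


def column_likelihood(theta, u, observed):
    """Probability of the observed descendant nucleotides at one motif column.

    Assumes a star tree: the ancestor Z is drawn from theta and every
    descendant evolves independently from Z over the same branch length u.
    """
    P = f81_transition(theta, u)
    total = 0.0
    for z in range(4):  # sum over the unobserved ancestral nucleotide
        prob = theta[z]
        for nucleotide in observed:
            prob *= P[z][NUCLEOTIDES.index(nucleotide)]
        total += prob
    return total


theta = [0.7, 0.1, 0.1, 0.1]  # made-up weight vector for one motif column
conserved = column_likelihood(theta, 0.2, "AAA")
diverged = column_likelihood(theta, 0.2, "ACG")
```

With a short branch length, a conserved column such as `"AAA"` gets a much higher likelihood than a diverged one, which is the signal that phylogenetic motif finders exploit.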

625 | Biological Sequence Analysis - Durbin, Eddy, et al. - 1998 |

617 |
The calculation of posterior distributions by data augmentation
- Tanner, Wong
- 1987
Citation Context ...ewton-Raphson optimization procedure. The dictionary model was later extended to include “stochastic” words in order to account for variations in the motif sites [16, 36] and a data augmentation (DA) =-=[43]-=- procedure introduced for finding such words. Recent approaches to motif discovery have improved upon the previous methods in at least two primary ways: (i) improving and sensitizing the basic model t... |

609 | Fitting a mixture model by expectation maximization to discover motifs in biopolymers
- Bailey, Elkan
- 1994
Citation Context ..., and a Gibbs sampling algorithm [23]. Later generalizations that allowed for a variable number of motif sites per sequence were a Gibbs sampler [28, 33] and an EM algorithm for finite mixture models =-=[2]-=-. Another class of methods approach the motif discovery problem from a “segmentation” perspective. MobyDick [6] treats the motifs as “words” used by nature to construct the “sentences” of DNA and esti... |

554 | Hidden Markov models in computational biology: Applications to protein modeling
- Krogh, Brown, et al.
- 1994
Citation Context ...ereafter) to sample from the joint posterior distribution of all the unknown parameters and missing data. To consider the uncertainty in multiple alignment, they adopt an HMM-based multiple alignment =-=[3, 22]-=- conditional on the current parameter values. This is achieved by adding a Metropolis-Hastings step in the Gibbs sampler to update these alignments dynamically according to the current sampled paramet... |

539 |
Detecting subtle sequence signals: a Gibbs sampling strategy for multiple alignment
- Lawrence, Altschul, et al.
- 1993
Citation Context ...ach was CONSENSUS, an information theory-based progressive alignment procedure [42]. Other methods included an EM algorithm [11] based on a missing-data formulation [24], and a Gibbs sampling algorithm =-=[23]-=-. Later generalizations that allowed for a variable number of motif sites per sequence were a Gibbs sampler [28, 33] and an EM algorithm for finite mixture models [2]. Another class of methods approac... |

483 | Multivariate Adaptive Regression Splines
- Friedman
- 1991
Citation Context ...ied BART to two recently published ChIP-chip data sets of the TFs Oct4 and Sox2 in human embryonic stem (ES) cells [5]. The performance of BART was compared with that of linear regression [9], MARS =-=[14, 10]-=-, and neural networks, based on ten-fold cross-validation. The DNA microarray used in [5] covers −8 kb to +2 kb of ∼17,000 annotated human genes. A Sox-Oct composite motif (Figure 8) was... |

384 | Sequence logos: a new way to display consensus sequences. Nucleic Acids Res
- Schneider, Stephens
- 1990
Citation Context ...enty modules, each of 100 bps and containing one binding site of each of the three TFs, were randomly placed in these sequences. TFBS’s were simulated from their known weight matrices with logo plots =-=[38]-=- shown in Figure 3. Then based on the choices of the background mutation rate µb (with α = 3β in equation 4.3) and the motif mutation rate µf, they generated sequences of three descendant species acco... |

338 |
Sequencing and comparison of yeast species to identify genes and regulatory elements
- Kellis
- 2003
Citation Context ... data sources, such as gene expression microarrays, ChIP-chip data, phylogenetic information and the physical structure of DNA (Chapter 8 Regulatory Motif Discovery: from Decoding to Meta-Analysis 181) =-=[9, 21, 52, 18]-=-. In the following section we will discuss the general framework of de-novo methods for discovering uncharacterized motifs in biological sequences, focusing especially on the Bayesian approach. 2 A Ba... |

287 |
JASPAR: an open access database for eukaryotic transcription factor binding profiles
- Sandelin
- 2004
Citation Context ...o the GC content. The motif features are extracted from a compiled set of motifs, each parameterized by a PWM. The compiled set includes known motifs from TF databases such as TRANSFAC [46] or JASPAR =-=[37]-=-, and new motifs found from the positive ChIP sequences in the data set of interest using a de novo motif search tool. ZL fit a segment-wise homogeneous first-order Markov chain as the background sequ... |

283 |
Regulatory element detection using correlation with expression
- Bussemaker, Li, et al.
- 2001
Citation Context ...for more details. 5 Motif learning on ChIP-chip data In recent years, a number of computational approaches have been developed to combine motif discovery with gene expression or ChIP-chip data, e.g., =-=[7, 9, 10, 20]-=-. These approaches identify a group of motifs, and then correlate expression values (or ChIP-intensity) to the identified motifs via linear or other regression techniques. The use of ChIP-chip data ha... |

268 | BioProspector: discovering conserved DNA motifs in upstream regulatory regions of coexpressed genes
- Liu, Brutlag, et al.
- 2001
Citation Context ...g and sensitizing the basic model to reflect realistic biological phenomena, such as multiple motif types in the same sequence, “gapped” motifs, and clustering of motif sites (cis-regulatory modules) =-=[30, 51, 17]-=-, and (ii) using auxiliary data sources, such as gene expression microarrays, ChIP-chip data, phylogenetic information and the physical structure of DNA... |

215 |
TRANSFAC: an integrated system for gene expression regulation. Nucleic Acids Res
- Wingender, Chen, et al.
- 2000
Citation Context ...is equivalent to the GC content. The motif features are extracted from a compiled set of motifs, each parameterized by a PWM. The compiled set includes known motifs from TF databases such as TRANSFAC =-=[46]-=- or JASPAR [37], and new motifs found from the positive ChIP sequences in the data set of interest using a de novo motif search tool. ZL fit a segment-wise homogeneous first-order Markov chain as the ... |

207 | Covariance structure of the Gibbs sampler with applications to the comparisons of estimators and augmentation schemes - Liu, Wong, et al. - 1994 |

161 |
Hidden Markov models of biological primary sequence information
- Baldi, Chauvin, et al.
- 1994
Citation Context ...ereafter) to sample from the joint posterior distribution of all the unknown parameters and missing data. To consider the uncertainty in multiple alignment, they adopt an HMM-based multiple alignment =-=[3, 22]-=- conditional on the current parameter values. This is achieved by adding a Metropolis-Hastings step in the Gibbs sampler to update these alignments dynamically according to the current sampled paramet... |

140 |
Core transcriptional regulatory circuitry in human embryonic stem cells
- Boyer, Lee, et al.
- 2005
Citation Context ...ayesian model average. 5.3 Application to human ChIP-chip data Zhou and Liu [50] applied BART to two recently published ChIP-chip data sets of the TFs Oct4 and Sox2 in human embryonic stem (ES) cells =-=[5]-=-. The performance of BART was compared with those of linear regressions [9], MARS [14, 10], and neural networks, respectively, based on ten-fold cross validations. The DNA microarray used in [5] cover... |

121 |
Integrating regulatory motif discovery and genome-wide expression analysis
- Conlon, Liu, et al.
- 2003
Citation Context ... data sources, such as gene expression microarrays, ChIP-chip data, phylogenetic information and the physical structure of DNA =-=[9, 21, 52, 18]-=-. In the following section we will discuss the general framework of de-novo methods for discovering uncharacterized motifs in biological sequences, focusing especially on the Bayesian approach. 2 A Ba... |

120 |
Bayesian models for multiple local sequence alignment and Gibbs sampling strategies
- Liu, Neuwald, et al.
- 1995
Citation Context ...algorithm [11] based on a missing-data formulation [24], and a Gibbs sampling algorithm [23]. Later generalizations that allowed for a variable number of motif sites per sequence were a Gibbs sampler =-=[28, 33]-=- and an EM algorithm for finite mixture models [2]. Another class of methods approach the motif discovery problem from a “segmentation” perspective. MobyDick [6] treats the motifs as “words” used by n... |

119 |
Identifying protein-binding sites from unaligned DNA fragments
- Stormo, Hartzell
- 1989
Citation Context ...lings” can be tolerated. Thus, one also needs a probabilistic model to describe a fuzzy word. An early motif-finding approach was CONSENSUS, an information theory-based progressive alignment procedure =-=[42]-=-. Other methods included an EM algorithm [11] based on a missing-data formulation [24], and a Gibbs sampling algorithm [23]. Later generalizations that allowed for a variable number of motif sites per ... |

116 | Gibbs motif sampling: detection of bacterial outer membrane protein repeats
- Neuwald, Liu, et al.
- 1995
Citation Context ...algorithm [11] based on a missing-data formulation [24], and a Gibbs sampling algorithm [23]. Later generalizations that allowed for a variable number of motif sites per sequence were a Gibbs sampler =-=[28, 33]-=- and an EM algorithm for finite mixture models [2]. Another class of methods approach the motif discovery problem from a “segmentation” perspective. MobyDick [6] treats the motifs as “words” used by n... |

107 |
Combining phylogenetic data with coregulated genes to identify regulatory motifs
- Wang, Stormo
- 2003
Citation Context ...allows the use of information from the evolutionary conservation of TFBS’s in related species. Several recent methods employ such information to enhance the power of cis-regulatory analysis. PhyloCon =-=[45]-=- builds multiple alignments among orthologs and extends these alignments to identify motif profiles. CompareProspector [31] biases motif search to more conserved regions based on conservation scores. ... |

100 |
Genome-scale identification of nucleosome positions in S. cerevisiae
- Yuan, Liu, et al.
- 2005
Citation Context ...ional space. Biological evidence [32] shows that much of DNA consists of repeats of regions of about 147 bp wrapped around nucleosomes, separated by stretches of DNA called linkers. Recent techniques =-=[47]-=- based on high density genome tiling arrays have been used to experimentally measure genomic positions of nucleosomes, in which the measurement “intensities” indicate how likely that locus is to be nu... |

98 |
Discovery of novel transcription factor binding sites by statistical overrepresentation
- Sinha, Tompa
- 2002
Citation Context ...n been observed to occur in spatial clusters, or cis-regulatory modules (Figure 1). One approach to locating cis-regulatory modules (CRMs) is by predicting novel motifs and looking for co-occurrences =-=[41]-=-. However, since individual motifs in the cluster may not be well-conserved, such an approach often leads to a large number of false negatives. Here, we describe a strategy to first use existing de no... |

90 | Modeling dependencies in protein-DNA binding sites
- Barash, Elidan, et al.
- 2003
Citation Context ...s or adds a pair of correlated columns at each iteration. Other proposed models are a Bayesian tree-like network modeling the possible correlation structure among all the positions within a motif model =-=[4]-=-, and a permuted Markov model in which the assumption is that an unobserved permutation has acted on the positions of all the motif sites and that the original ordered positions can be described by a ... |

90 | PhyloGibbs: a Gibbs sampling motif finder that incorporates phylogeny
- Siddharthan, Siggia, et al.
- 2005
Citation Context ... CompareProspector [31] biases motif search to more conserved regions based on conservation scores. With a given alignment of orthologs and a phylogenetic tree, EMnEM [34], PhyME [40], and PhyloGibbs =-=[39]-=- detect motifs based on more comprehensive evolutionary models for TFBS’s. When evolutionary distances among the genomes are too large for the orthologous sequences to be reliably aligned, Li and Wong... |

85 |
PhyME: a probabilistic algorithm for finding motifs in sets of orthologous sequences
- Sinha, Blanchette, et al.
- 2004
Citation Context ...ntify motif profiles. CompareProspector [31] biases motif search to more conserved regions based on conservation scores. With a given alignment of orthologs and a phylogenetic tree, EMnEM [34], PhyME =-=[40]-=-, and PhyloGibbs [39] detect motifs based on more comprehensive evolutionary models for TFBS’s. When evolutionary distances among the genomes are too large for the orthologous sequences to be reliably... |

75 |
Building a dictionary for genomes: identification of presumptive regulatory sites by statistical analysis
- Bussemaker, Li, et al.
Citation Context ...per sequence were a Gibbs sampler [28, 33] and an EM algorithm for finite mixture models [2]. Another class of methods approach the motif discovery problem from a “segmentation” perspective. MobyDick =-=[6]-=- treats the motifs as “words” used by nature to construct the “sentences” of DNA and estimates word frequencies using a Newton-Raphson optimization procedure. The dictionary model was later extended t... |

69 |
De novo cis-regulatory module elicitation for eukaryotic genomes
- Gupta, Liu
- 2005
Citation Context ...g and sensitizing the basic model to reflect realistic biological phenomena, such as multiple motif types in the same sequence, “gapped” motifs, and clustering of motif sites (cis-regulatory modules) =-=[30, 51, 17]-=-, and (ii) using auxiliary data sources, such as gene expression microarrays, ChIP-chip data, phylogenetic information and the physical structure of DNA... |

56 |
Bayesian inference on biopolymer models
- Liu, Lawrence
- 1999
Citation Context ...otifs found from the positive ChIP sequences in the data set of interest using a de novo motif search tool. ZL fit a segment-wise homogeneous first-order Markov chain as the background sequence model =-=[27]-=-, which helps to account for the heterogeneous nature of genomic sequences such as regions of low complexity (e.g., GC/AT rich). Intuitively, this model assumes that the sequence under consideration can ... |

55 | CisModule: de novo discovery of cis-regulatory modules by hierarchical mixture modeling - Zhou, Wong - 2004 |

47 |
Modeling within-motif dependence for transcription factor binding site predictions
- Zhou, Liu
- 2004
Citation Context ...at all columns of a weight matrix are independent; however, it has been observed that about 25% of experimentally validated motifs show statistically significant positional correlations. Zhou and Liu =-=[49]-=- extend the independent weight matrix model to include one or more correlated column pairs, under the restriction that no two pairs of correlated columns can share a column in common. A MetropolisHa... |

46 | Phylogenetic motif detection by expectation-maximization on evolutionary mixtures
- Moses, Chiang, et al.
- 2004
Citation Context ...ments to identify motif profiles. CompareProspector [31] biases motif search to more conserved regions based on conservation scores. With a given alignment of orthologs and a phylogenetic tree, EMnEM =-=[34]-=-, PhyME [40], and PhyloGibbs [39] detect motifs based on more comprehensive evolutionary models for TFBS’s. When evolutionary distances among the genomes are too large for the orthologous sequences to... |

44 | Eukaryotic regulatory element conservation analysis and identification using comparative genomics
- Liu, Liu, et al.
- 2004
Citation Context ...y such information to enhance the power of cis-regulatory analysis. PhyloCon [45] builds multiple alignments among orthologs and extends these alignments to identify motif profiles. CompareProspector =-=[31]-=- biases motif search to more conserved regions based on conservation scores. With a given alignment of orthologs and a phylogenetic tree, EMnEM [34], PhyME [40], and PhyloGibbs [39] detect motifs base... |

41 |
Identification of regulatory elements using a feature selection method
- Keles, Laan, et al.
- 2002
Citation Context ...for more details. 5 Motif learning on ChIP-chip data In recent years, a number of computational approaches have been developed to combine motif discovery with gene expression or ChIP-chip data, e.g., =-=[7, 9, 10, 20]-=-. These approaches identify a group of motifs, and then correlate expression values (or ChIP-intensity) to the identified motifs via linear or other regression techniques. The use of ChIP-chip data ha... |

33 |
Algorithms for the optimal identification of segment neighborhoods
- Auger, Lawrence
- 1989
Citation Context ...0 | . Under the model specified above, it is also possible to implement a “partition-based” data augmentation (DA) approach [16] that is motivated by the recursive algorithm used in Auger and Lawrence =-=[1]-=-. The DA approach samples A jointly according to the conditional distribution $P(A \mid \Theta, S) = \prod_{i=1}^{N} P(A_{iL_i} \mid \Theta, S) \prod_{j=1}^{L_i - 1} P(A_{ij} \mid A_{i,j+1}, \dots, A_{iL_i}, S, \Theta)$. At a position j, the current knowledge of... |

32 | Interacting models of cooperative gene regulation
- Das, Banerjee, et al.
- 2004
Citation Context ...for more details. 5 Motif learning on ChIP-chip data In recent years, a number of computational approaches have been developed to combine motif discovery with gene expression or ChIP-chip data, e.g., =-=[7, 9, 10, 20]-=-. These approaches identify a group of motifs, and then correlate expression values (or ChIP-intensity) to the identified motifs via linear or other regression techniques. The use of ChIP-chip data ha... |

27 | Evolutionary Monte Carlo: applications to cp model sampling and change point problem. Statistica Sinica
- Liang, Wong
- 2000
Citation Context ...rameters including PWMs. 3.1.1 Evolutionary Monte Carlo for module selection It has been demonstrated that the EMC method is effective for sampling and optimization with functions of binary variables =-=[26]-=-. Conceptually, we should be able to apply EMC directly to select motifs comprising the CRM, but a complication here is that there are many continuous parameters such as the Θj’s, λ, and τ that vary i... |

25 |
Discovery of conserved sequence patterns using a stochastic dictionary model
- Gupta, Liu
- 2003
Citation Context ...d estimates word frequencies using a Newton-Raphson optimization procedure. The dictionary model was later extended to include “stochastic” words in order to account for variations in the motif sites =-=[16, 36]-=- and a data augmentation (DA) [43] procedure introduced for finding such words. Recent approaches to motif discovery have improved upon the previous methods in at least two primary ways: (i) improving... |

24 |
Combining phylogenetic motif discovery and motif clustering to predict co-regulated genes
- Jensen
- 2005
Citation Context ...e too large for the orthologous sequences to be reliably aligned, Li and Wong [25] proposed an ortholog sampler that finds motifs in multiple species independent of ortholog alignments. Jensen et al. =-=[19]-=- used a Bayesian clustering approach to combine TF binding motifs from promoters of multiple orthologs. Table 1: Error rates for module prediction methods. Method MEF MYF SP1 SRF Total SENS SPEC TSpec... |

24 |
Finding short DNA motifs using permuted markov models
- Zhao, Huang, et al.
- 2005
Citation Context ...ted Markov model in which the assumption is that an unobserved permutation has acted on the positions of all the motif sites and that the original ordered positions can be described by a Markov chain =-=[48]-=-. Mathematically, the model [49] is a sub-case of [48], which is, in turn, a sub-case of [4]. 3 Discovery of regulatory modules Motif predictions for higher eukaryotic genomes are more challenging tha... |

23 |
Sampling motifs on phylogenetic trees
- Li, Wong
- 2005
Citation Context ...detect motifs based on more comprehensive evolutionary models for TFBS’s. When evolutionary distances among the genomes are too large for the orthologous sequences to be reliably aligned, Li and Wong =-=[25]-=- proposed an ortholog sampler that finds motifs in multiple species independent of ortholog alignments. Jensen et al. [19] used a Bayesian clustering approach to combine TF binding motifs from promote... |

19 | BART: Bayesian additive regression trees
- Chipman, George, et al.
- 2010
Citation Context ...frequencies, and a set of motif scores derived from both known motifs documented in biological databases and motifs discovered de novo. Then, they apply the Bayesian additive regression trees (BARTs) =-=[8]-=- to learn the relationship between ChIP-intensity and these sequence features. As the sum of a set of trees, the BART model is flexible enough to approximate almost any complex relationship between re... |

17 |
Reversible jump MCMC and Bayesian model determination. Biometrika, 82:711–732
- Green
- 1995
Citation Context ...er. Jointly sampling from the posterior distribution of (A, Θ, w) is difficult as the dimensionality of Θ changes with w. One way to update (w, Θ) jointly would be through a reversible jump procedure =-=[15]-=-. However, note that we can integrate out Θ from the posterior distribution to avoid a dimensionality change during the updating. By placing an appropriate prior distribution p(w) on w (a possible cho... |

17 | Dynamic nucleosomes
- Luger
- 2006
Citation Context ...me positioning information in motif discovery Generally TF-DNA binding is represented as a one-dimensional process; however, in reality, binding occurs in three dimensional space. Biological evidence =-=[32]-=- shows that much of DNA consists of repeats of regions of about 147 bp wrapped around nucleosomes, separated by stretches of DNA called linkers. Recent techniques [47] based on high density genome til... |

11 |
Decoding human regulatory circuits. Genome Research
- Thompson, Palumbo, et al.
- 2004
Citation Context ... j=1 Aijk. If we have reason to believe that motif occurrences are not independent, but occur as clusters (as in regulatory modules), we can instead adopt a prior Markovian model for motif occurrence =-=[17, 44]-=- which is discussed further in Section 3. 2.1 Markov chain Monte Carlo computation Under the model described in (2.1), it is straightforward to implement a Gibbs sampling (GS) scheme to iteratively up... |

10 |
An expectation-maximization (EM) algorithm for the identification and characterization of common sites in biopolymer sequences
- Lawrence, Reilly
- 1990
Citation Context ...zy word. An early motif-finding approach was CONSENSUS, an information theory-based progressive alignment procedure [42]. Other methods included an EM algorithm [11] based on a missing-data formulation =-=[24]-=-, and a Gibbs sampling algorithm [23]. Later generalizations that allowed for a variable number of motif sites per sequence were a Gibbs sampler [28, 33] and an EM algorithm for finite mixture models ... |

5 | Coupling hidden Markov models for the discovery of cis-regulatory modules in multiple species
- Zhou, Wong
- 2007
Citation Context ... data sources, such as gene expression microarrays, ChIP-chip data, phylogenetic information and the physical structure of DNA =-=[9, 21, 52, 18]-=-. In the following section we will discuss the general framework of de-novo methods for discovering uncharacterized motifs in biological sequences, focusing especially on the Bayesian approach. 2 A Ba... |

4 | Extracting sequence features to predict protein-DNA interactions: a comparative study
- Zhou, Liu
Citation Context ...h resolution TF binding regions, but also give quantitative measures of the binding activity (ChIP-enrichment) for such regions. In this section, we introduce a new approach developed by Zhou and Liu =-=[50]-=- (ZL hereafter) for motif learning from ChIP-chip data, to illustrate the general framework of this type of method. In contrast to many approaches that directly build generative statistical models in... |