Integrating Genomic Homology into Gene Structure Prediction
, 2001
Cited by 137 (6 self)
TWINSCAN is a new gene-structure prediction system that directly extends the probability model of GENSCAN, allowing it to exploit homology between two related genomes. Separate probability models are used for conservation in exons, introns, splice sites, and UTRs, reflecting the differences among their patterns of evolutionary conservation. TWINSCAN is specifically designed for the analysis of high-throughput genomic sequences containing an unknown number of genes. In experiments on high-throughput mouse sequences, using homologous sequences from the human genome, TWINSCAN shows notable improvement over GENSCAN in exon sensitivity and specificity and dramatic improvement in exact gene sensitivity and specificity. This improvement can be attributed entirely to modeling the patterns of evolutionary conservation in genomic sequence.
The Conserved Exon Method for Gene Finding
, 2000
Cited by 33 (0 self)
A new approach to gene finding, called the "Conserved Exon Method" (CEM), is introduced. It is based on the idea of looking for conserved protein sequences by comparing pairs of DNA sequences, identifying putative exon pairs based on conserved regions and splice junction signals, and then chaining pairs of putative exons together. It simultaneously predicts gene structures in both human and mouse genomic sequences (or in other pairs of sequences at the appropriate evolutionary distance). Experimental results indicate the potential usefulness of this approach.
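The chaining step described in this abstract can be illustrated with a small dynamic program. This is only a minimal sketch under stated assumptions, not the CEM authors' algorithm: the `ExonPair` record, its `score` field, and the `best_chain` function are hypothetical names, and the sketch ignores splice-frame compatibility and any other constraints the actual method may apply. It assumes each candidate exon pair already has a score and picks the highest-scoring set of pairs that appear in the same order, without overlap, in both sequences.

```python
from typing import List, NamedTuple, Tuple


class ExonPair(NamedTuple):
    """Hypothetical candidate: one putative exon in each of two sequences."""
    start_a: int
    end_a: int
    start_b: int
    end_b: int
    score: float


def best_chain(pairs: List[ExonPair]) -> Tuple[float, List[ExonPair]]:
    """Return the maximum-score chain of pairs that is colinear in both sequences."""
    if not pairs:
        return 0.0, []
    # Any valid predecessor ends before its successor starts in sequence A,
    # so sorting by start_a places predecessors earlier in the list.
    order = sorted(pairs, key=lambda p: (p.start_a, p.start_b))
    best = [p.score for p in order]   # best chain score ending at each pair
    prev = [-1] * len(order)          # back-pointers for reconstruction
    for j, cur in enumerate(order):
        for i in range(j):
            cand = order[i]
            if cand.end_a < cur.start_a and cand.end_b < cur.start_b:
                if best[i] + cur.score > best[j]:
                    best[j] = best[i] + cur.score
                    prev[j] = i
    k = max(range(len(order)), key=best.__getitem__)
    total = best[k]
    chain = []
    while k != -1:
        chain.append(order[k])
        k = prev[k]
    chain.reverse()
    return total, chain


# Illustrative, made-up candidates (coordinates and scores are not real data).
candidates = [
    ExonPair(0, 10, 0, 12, 5.0),
    ExonPair(20, 30, 25, 35, 3.0),
    ExonPair(5, 25, 14, 20, 4.0),
    ExonPair(40, 50, 50, 60, 2.0),
]
total_score, chain = best_chain(candidates)
```

The quadratic loop keeps the sketch short; a real gene finder working on long genomic sequences would likely need a faster chaining scheme.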
Comparative Genome Analysis Delimits a Chromosomal Domain and Identifies Key Regulatory Elements in the ... Globin Cluster
, 2001
Cited by 14 (2 self)
corresponding region in man. This has defined a small segment (135-155 kb) of synteny and conserved gene order, which may contain all of the elements required to fully regulate globin gene expression from its natural chromosomal environment. Comparing human and mouse sequences using previously described methods failed to identify the known regulatory elements. However, refining these methods by ranking identity scores of noncoding sequences, we found conserved sequences including the previously characterized globin major regulatory element. In chicken and pufferfish, regions that may correspond to this element were found by analysing the distribution of transcription factor binding sites. Regions identified in this way act as strong enhancer elements in expression assays. In addition to delimiting the globin chromosomal domain, this study has enabled us to develop a more sensitive and accurate routine for identifyi
Methods in comparative genomics: genome correspondence, gene identification and regulatory motif discovery
- J Comput Biol
, 2004
Cited by 2 (0 self)
In Kellis et al. (2003), we reported the genome sequences of S. paradoxus, S. mikatae and S. bayanus and compared these three yeast species to their close relative, S. cerevisiae. Genome-wide comparative analysis allowed the identification of functionally important sequences, both coding and non-coding. In this companion paper we describe the mathematical and algorithmic results underpinning the analysis of these genomes. We present methods for the automatic determination of genome correspondence. The algorithms enabled the automatic identification of orthologs for more than 90% of genes and intergenic regions across the four species despite the large number of duplicated genes in the yeast genome. The remaining ambiguities in the gene correspondence revealed recent gene family expansions in regions of rapid genomic change. We present methods for the identification of protein-coding genes based on their patterns of nucleotide conservation across related species. We observed the pressure to conserve the reading frame of functional proteins and developed a test for gene identification with high sensitivity and specificity. We used this test to revisit the genome of S. cerevisiae, reducing the overall gene count by 500 genes (10% of previously annotated genes) and refining the gene structure of hundreds of genes. We present novel methods for the systematic de novo identification of regulatory motifs. The methods do not rely on previous knowledge of gene function and in that way differ from the current literature on computational motif discovery. Based
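The idea of testing whether a reading frame is conserved across species can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' published test: the function name `reading_frame_conservation`, the pairwise gapped-alignment input, and the scoring rule (the share of aligned bases that fall in the most common frame offset) are all hypothetical choices made for this example. The intuition is that gaps whose length is a multiple of three preserve the frame, while other gaps shift it.

```python
def reading_frame_conservation(aln_ref: str, aln_other: str, gap: str = "-") -> float:
    """Fraction of aligned bases sharing the dominant frame offset.

    Both arguments are rows of one pairwise alignment (equal length,
    gaps marked with `gap`). A value near 1.0 means indels rarely
    shift the reading frame; lower values mean frequent frame shifts.
    """
    if len(aln_ref) != len(aln_other):
        raise ValueError("alignment rows must have equal length")
    counts = [0, 0, 0]          # aligned columns seen at each frame offset
    pos_ref = pos_other = 0     # ungapped positions consumed so far
    for r, o in zip(aln_ref, aln_other):
        if r != gap and o != gap:
            counts[(pos_ref - pos_other) % 3] += 1
        if r != gap:
            pos_ref += 1
        if o != gap:
            pos_other += 1
    aligned = sum(counts)
    return max(counts) / aligned if aligned else 0.0


# Made-up toy alignments: a 3-base gap keeps the frame, a 1-base gap shifts it.
in_frame = reading_frame_conservation("ATGAAACCCGGG", "ATGAAA---GGG")
shifted = reading_frame_conservation("ATGAAACCCGGG", "ATGA-ACCCGGG")
```

In this toy case the three-base deletion leaves every aligned base in frame, whereas the single-base deletion puts the downstream bases into a different frame and lowers the score.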
Btk expression is controlled by Oct and BOB.1/OBF.1
, 2005
Cited by 1 (0 self)
BOB.1/OBF.1 is a lymphocyte-restricted transcriptional coactivator. It binds together with the Oct1 and Oct2 transcription factors to DNA and enhances their transactivation potential. Mice deficient for the transcriptional coactivator BOB.1/OBF.1 show several defects in differentiation, function and signaling of B cells. In search of BOB.1/OBF.1-regulated genes we identified Btk, a cytoplasmic tyrosine kinase, as a direct target of BOB.1/OBF.1. Analyses of the human as well as murine Btk promoters revealed a nonconsensus octamer site close to the start site of transcription. Here we show that Oct proteins together with BOB.1/OBF.1 are able to form ternary complexes on these sites in vitro and in vivo. This in turn leads to the induction of Btk promoter activity in synergism with the transcription factor PU.1. Btk, like BOB.1/OBF.1, plays a critical role in B cell development and B cell receptor signaling. Therefore the downregulation of Btk expression in BOB.1/OBF.1-deficient B cells could be related to the functional and developmental defects observed in BOB.1/OBF.1-deficient mice.
Mammalian Genomes Ease the Location of Human Transcription Factor Binding Sites but Do Not Ease Their Description
, 2004
Comparisons of multiple related genomes have already produced a number of interesting findings, and sequencing resources are available to obtain the genomes of many more species. For studies of human disease, there is naturally a strong interest in the genomes of vertebrates, especially mammals. Decisions concerning the particular species to sequence depend on a number of important factors. While much useful and constructive discussion about these choices has ensued, there have been few quantitative analyses addressing this issue. Here we consider two of these factors: 1) pattern discovery of functional elements, such as transcription factor binding site models, and 2) identification of unusually conserved sequence fragments. To address these issues, we examined data from seven mammals (dog, cow, pig, rat, cat, baboon, and chimpanzee) which are being sequenced in the NISC Comparative Sequencing Program. We find that, taken together, the data from human, mouse, and the seven additional mammals are only 1.5 times as effective for pattern identification as the data from human and mouse alone. Contrastingly, they are 3.5 times as effective for identification of conserved fragments. For many reasons, the sequencing of these mammalian genomes is, and will continue to be, a valuable endeavor, but our results suggest that its contribution to the identification of the patterns of functional sites in DNA sequence will be limited. Interestingly, our results are less pessimistic about its contribution to the identification of sequence conservation, and they suggest that the availability of additional sequences will contribute significantly to such an endeavor.
unknown title
Jareborg et al. ... compiled from the EMBL nucleotide database (Stoesser et al. 1998) release 55, as outlined in Methods. The lower size limit for the mouse genomic sequences was arbitrarily set to 7 kb. The genes range in size from 994 bp to 41.8 kb for mouse (average length = 7902 bp, S.D. = 6391) and from 1148 bp to 37.7 kb for human (average

