Hmmstr: a hidden markov model for local sequence-structure correlations in proteins
- Journal of Molecular Biology
, 2000
Prediction of local structure in proteins using a library of sequence-structure motifs
- J. MOL. BIOL
, 1998
Efficient Remote Homology Detection Using Local Structure
- BIOINFORMATICS
, 2003
- Cited by 18 (2 self)
Motivation: The function of an unknown biological sequence can often be accurately inferred if we are able to map this unknown sequence to its corresponding homologous family. At present, discriminative methods such as SVM-Fisher and SVM-pairwise, which combine support vector machine and sequence similarity, are recognized as the most accurate methods, with SVM-pairwise being the most accurate. However, these methods typically encode sequence information into their feature vectors and ignore the structure information. They are also computationally inefficient. Based on these observations, we present an alternative method for SVM-based protein classification. Our proposed method, SVM-I-sites, utilizes structure similarity for remote homology detection. Result:
Mining residue contacts in proteins using local structure predictions
- In IEEE Int. Symposium on Bioinformatics and Biomedical Engineering
, 2000
- Cited by 17 (7 self)
In this paper we develop data mining techniques to predict 3D contact potentials among protein residues (or amino acids) based on the hierarchical nucleation-propagation model of protein folding. We apply a hybrid approach, using a Hidden Markov Model to extract folding initiation sites, and then apply association mining to discover contact potentials. The new hybrid approach achieves accuracy results better than those reported previously.
Striped sheets and protein contact prediction
- Bioinformatics
, 2004
"... Accepted for oral presentation at ISMB/ECCB 2004 ..."
Blind Predictions of Local Protein Structure in CASP2 Targets Using the I-Sites Library
- Proteins: Structure, Function and Genetics, Suppl
, 1997
- Cited by 13 (5 self)
Blind predictions of the local structure of nine CASP2 targets were made using the I-sites library of short sequence-structure motifs, revealing strengths and weaknesses in this new knowledge-based method. Many turns between secondary structural elements were accurately predicted. Estimates of the confidence of prediction correlated well with the accuracy over the whole set. Bias toward structures used to develop the library was minimal, probably because of the extensive use of cross-validation. However, helix positions were better predicted by the PHD program. The method is likely to be sensitive to the quality of the sequence alignment. A general measure for evaluating local structure predictions is suggested. Proteins, Suppl. 1:167-171, 1997. © 1998 Wiley-Liss, Inc. Key words: sequence profiles building-blocks; secondary helix; strand turn knowledge-based
Detection of Protein Coding Sequences Using a Mixture Model for Local Protein Amino Acid Sequence
- BIOKDD01: Workshop on Data Mining in Bioinformatics (with SIGKDD01 Conference)
, 2000
- Cited by 3 (0 self)
Locating protein coding regions in genomic DNA is a critical step in accessing the information generated by large scale sequencing projects. Current methods for gene detection depend on statistical measures of content differences between coding and noncoding DNA in addition to the recognition of promoters, splice sites, and other regulatory sites. Here we explore the potential value of recurrent amino acid sequence patterns 3-19 amino acids in length as a content statistic for use in gene finding approaches. A finite mixture model incorporating these patterns can partially discriminate protein sequences which have no (detectable) known homologs from randomized versions of these sequences, and from short ( 50 amino acids) non-coding segments extracted from the S. cerevisiae genome. The mixture-model-derived scores for a collection of human exons were not correlated with the GENSCAN scores, suggesting that the addition of our protein pattern recognition module to current gene recognition programs may improve their performance.
A Unified Sequence-Structure Classification of Protein Sequences: Combining Sequence and Structure in a Map of the Protein Space.
, 2000
- Cited by 1 (1 self)
We analyze all known protein sequences in search for a global map of protein space that is consistent in terms of both sequence and structure. Our goal is to define clusters of homologous protein domains, beyond those detected by sequence-based methods alone, and then to build a three-dimensional (3D) model for each of the sequences that are homologous to sequences of known 3D structure. This analysis uses both sequence and structure based metrics in the analysis of all protein sequences in a non-redundant (NR) database, comprising all major sequence databases. The analysis starts from the sequences of the SCOP database domains, which have known three-dimensional structures. These sequences are clustered first into families based on sequence similarity alone, without incorporating any information from the SCOP classification. Each sequence-based family is represented by a profile, and this profile is used to search the NR database, using PSI-BLAST. Since PSI-BLAST can lead to false similar...
VOT 74017 PROTEIN SECONDARY STRUCTURE PREDICTION FROM AMINO ACID SEQUENCE USING ARTIFICIAL INTELLIGENCE TECHNIQUE
, 2007
Large genome sequencing projects generate a huge number of protein sequences in their primary structures, and it is difficult for conventional biological techniques to determine their corresponding 3D structures and then their functions. Protein secondary structure prediction is a prerequisite step in determining the 3D structure of a protein. In this research, a method for predicting protein secondary structure is proposed and implemented together with other known accurate methods in this domain. The method is discussed and presented in a comparative analysis progression to allow easy comparison and clear conclusions. A benchmark data set is used to train and test the methods under the same hardware, platforms, and environments. The newly developed method uses the knowledge of the GORV information theory and the power of the neural network to classify a novel protein sequence into one of its three secondary structure classes. NN-GORV-I is developed and implemented to predict protein secondary structure using the biological information conserved in neighboring residues and related
FOR THE RECORD Three-dimensional structures and contexts associated
, 1997
Three-dimensional structures and contexts associated with recurrent amino acid sequence patterns

