Mining Protein Contact Maps
, 2002
Cited by 17 (3 self)
in a symmetrical, square, boolean matrix of pairwise, inter-residue contacts, or "contact map". The contact map provides a host of useful information about the protein's structure. In this paper we describe how data mining can be used to extract valuable information from contact maps. For example, clusters of contacts represent certain secondary structures, and also capture non-local interactions, giving clues to the tertiary structure.
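The contact-map representation described in this abstract can be illustrated with a minimal sketch. The 8 Å distance cutoff, the toy hairpin coordinates, the sequence-separation filter, and the function names `contact_map` and `nonlocal_contacts` are assumptions made for illustration; none of them are taken from the paper.

```python
import math

def contact_map(coords, threshold=8.0):
    """Build a symmetric, square, boolean matrix of pairwise residue contacts.

    coords: one (x, y, z) position per residue, e.g. C-alpha atoms.
    threshold: distance cutoff in angstroms (assumed value, not from the paper).
    The diagonal is left False, i.e. self-contacts are ignored (an assumption).
    """
    n = len(coords)
    cmap = [[False] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(coords[i], coords[j]) <= threshold:
                cmap[i][j] = cmap[j][i] = True  # keep the matrix symmetric
    return cmap

def nonlocal_contacts(cmap, min_separation=4):
    """List contacts between residues far apart in sequence (hypothetical filter)."""
    n = len(cmap)
    return [(i, j)
            for i in range(n)
            for j in range(i + min_separation, n)
            if cmap[i][j]]

# Toy hairpin: two antiparallel runs of four residues, 3.8 angstroms apart.
coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0),
          (11.4, 3.8, 0.0), (7.6, 3.8, 0.0), (3.8, 3.8, 0.0), (0.0, 3.8, 0.0)]

cmap = contact_map(coords)
far_pairs = nonlocal_contacts(cmap)
print(far_pairs)
```

In this toy example the first and last residues are distant in sequence but adjacent in space, which is the kind of non-local interaction the abstract says a contact map captures.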
Prediction of Contact Maps Using Support Vector Machines
- In Proc. of the IEEE Symposium on BioInformatics and BioEngineering
, 2003
Cited by 14 (0 self)
Contact map prediction is of great interest for its application in fold recognition and protein 3D structure determination. In this paper we present a contact-map prediction algorithm that employs Support Vector Machines as the machine learning tool and incorporates various features such as sequence profiles and their conservation, correlated mutation analysis (CMA) based on various amino acid physicochemical properties, and secondary structure. In addition, we evaluated the effectiveness of the different features on contact map prediction for different fold classes. On average, our predictor achieved a prediction accuracy of 0.2238, an improvement over a random predictor by a factor of 11.7, which is better than the results of previously reported studies. Our study showed that predicted secondary structure features play an important role for proteins containing beta structures. Models based on secondary structure features and CMA features produce different sets of predictions. Our study also suggests that models learned separately for different protein fold families may achieve better performance than a unified model.
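The two figures reported in this abstract can be related by simple arithmetic. The sketch below assumes that the improvement factor is the ratio of the predictor's accuracy to the accuracy of a random predictor, which is a common convention in contact prediction but is not spelled out in the abstract; the variable names are illustrative.

```python
# Figures quoted in the abstract.
reported_accuracy = 0.2238
reported_factor = 11.7

# Under the assumed definition factor = accuracy / random_accuracy,
# the implied accuracy of the random baseline follows directly.
implied_random_accuracy = reported_accuracy / reported_factor

def improvement_over_random(accuracy, random_accuracy):
    """Ratio of a predictor's accuracy to that of a random baseline (assumed definition)."""
    return accuracy / random_accuracy

print(f"implied random accuracy: {implied_random_accuracy:.4f}")
print(f"recovered factor: {improvement_over_random(reported_accuracy, implied_random_accuracy):.1f}")
```

Under that assumption the random baseline would be correct for roughly 2% of its predicted contacts, which gives a sense of how sparse true contacts are among all residue pairs.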
A bi-recursive neural network architecture for the prediction of protein coarse contact maps
- In 1st IEEE Computer Society Bioinformatics Conference (CSB’02)
, 2002
Cited by 3 (1 self)
Prediction of contact maps may be seen as a strategic step towards the solution of fundamental open problems in structural genomics. In this paper we focus on coarse-grained maps that describe the spatial neighborhood relation between secondary structure elements (helices, strands, and coils) of a protein. We introduce a new machine learning approach for scoring candidate contact maps. The method combines a specialized noncausal recursive connectionist architecture and a heuristic graph search algorithm. The network is trained using candidate graphs generated during search. We show how the process of selecting and generating training examples is important for tuning the precision of the predictor.
Predicting ranked SCOP domains by mining associations of visual contents in distance matrices
- in Proc. of The Fourth Asia Pacific Bioinformatics Conference, 2006
A fast SCOP fold classification system using content-based E-Predict algorithm
- BMC Bioinformatics
, 2006
Cited by 2 (2 self)
Background: Domain experts manually construct the Structural Classification of Proteins (SCOP) database to categorize and compare protein structures. Although the SCOP database is believed to be more reliable than classification results from other methods, building it is labor-intensive. To mimic human classification processes, we develop an automatic SCOP fold classification system to assign possible known SCOP folds and recognize novel folds for newly discovered proteins. Results: With a sufficient amount of ground truth data, our system is able to assign the known folds for newly discovered proteins in the latest SCOP v1.69 release with 92.17% accuracy. Our system also recognizes the novel folds with 89.27% accuracy using 10-fold cross-validation. The average response time to complete the classification process for proteins with 500 and 1409 amino acids is 4.1 and 17.4 seconds, respectively. Compared with several structural alignment algorithms, our approach outperforms previous methods in both classification accuracy and efficiency. Conclusions: In this paper, we build an advanced, non-parametric classifier to accelerate the manual classification processes of SCOP. With satisfactory ground truth data from the SCOP database, our approach identifies relevant domain knowledge and yields reasonably accurate classifications. Our system is publicly accessible at
Machine Learning in Structural Genomics
Proteins are polymer chains composed of twenty simpler molecules, called amino acids, that carry out most of the molecular functions in living organisms. Although a protein can be first characterized by its amino acid sequence, or primary sequence, most proteins fold into three-dimensional
Mining Protein Contact Maps
- In The 3rd ACM SIGKDD Workshop on Data Mining in Bioinformatics (BIOKDD)
, 2003
We discuss some novel mining tasks for protein contact maps (two-dimensional representations of the three-dimensional structure of proteins). We show that using contact maps and a hybrid mining approach, we can construct "contact rules" to predict the structure of an unknown protein. Furthermore, we mine a model that discriminates physical from non-physical maps using frequent dense patterns and heuristic rules of physicality.

