
Towards a comprehensive collection of diagnostic patterns for protein sequence classification (2002)

by B Olsson, K Laurio
Venue: Information Sciences

Results 1 - 3 of 3

Bio-support vector machines for computational proteomics (2004)

by Zheng Rong Yang, Kuo-chen Chou - Bioinformatics
Cited by 2 (2 self)
One of the most important issues in computational proteomics is to produce a prediction model for classifying or annotating the biological function of novel protein sequences. To improve prediction accuracy, much attention has been paid to improving the performance of the algorithms used, but little to the fundamental issue of amino acid encoding, which arises because most existing pattern recognition algorithms cannot recognise amino acids in protein sequences. Importantly, the most commonly used amino acid encoding method has a flaw that leads to large computational cost and recognition bias. Results: By replacing the kernel functions of support vector machines (SVMs) with amino acid similarity measurement matrices, we have modified SVMs, a new type of pattern recognition algorithm, for analysing protein sequences, particularly for proteolytic cleavage activity prediction. We refer to the modified SVMs as bio-support vector machines (bSVMs). When applied to the prediction of HIV protease cleavage sites, the new method has shown a remarkable advantage in reducing model complexity and enhancing model robustness.
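The idea of building a kernel from an amino acid similarity matrix can be illustrated with a minimal sketch. This is not the authors' bSVM implementation: the grouping in `GROUPS`, the scores in `similarity`, and the function names are all hypothetical, chosen only to show how per-residue similarities might be summed over aligned positions of equal-length peptides instead of encoding residues as numeric vectors.

```python
# Hypothetical toy similarity: 1.0 for identical residues, 0.5 for residues in
# the same (assumed) physico-chemical group, 0.0 otherwise. A published
# substitution matrix could be swapped in; this grouping is illustrative only.
GROUPS = ["AVLIMC", "FWY", "STNQ", "KRH", "DE", "GP"]
GROUP_OF = {aa: index for index, group in enumerate(GROUPS) for aa in group}


def similarity(a, b):
    """Return the toy similarity score between two amino acid letters."""
    if a == b:
        return 1.0
    return 0.5 if GROUP_OF[a] == GROUP_OF[b] else 0.0


def peptide_kernel(x, y):
    """Sum residue similarities over aligned positions of two peptides."""
    if len(x) != len(y):
        raise ValueError("peptides must have the same length")
    return sum(similarity(a, b) for a, b in zip(x, y))


def gram_matrix(peptides):
    """Build the full kernel (Gram) matrix for a list of peptides."""
    return [[peptide_kernel(p, q) for q in peptides] for p in peptides]


if __name__ == "__main__":
    # Example octapeptides, used here purely as sample input.
    samples = ["SQNYPIVQ", "SQNYPIVN", "GGGGGGGG"]
    for row in gram_matrix(samples):
        print(row)
```

Because this toy score is a sum of an identity term and a same-group term, the resulting matrix is symmetric and positive semidefinite, so it could in principle be handed to an SVM library that accepts precomputed kernels. Whether a given real similarity matrix has that property would need to be checked separately.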

unknown title

by unknown authors
Motivation: One of the most important issues in computational proteomics is to produce a prediction model for classifying or annotating the biological function of novel protein sequences. To improve prediction accuracy, much attention has been paid to improving the performance of the algorithms used, but little to the fundamental issue of amino acid encoding, which arises because most existing pattern recognition algorithms cannot recognize amino acids in protein sequences. Importantly, the most commonly used amino acid encoding method has a flaw that leads to large computational cost and recognition bias. Results: By replacing the kernel functions of support vector machines (SVMs) with amino acid similarity measurement matrices, we have modified SVMs, a new type of pattern recognition algorithm, for analysing protein sequences, particularly for proteolytic cleavage site prediction. We refer to the modified SVMs as bio-support vector machines. When applied to the prediction of HIV protease cleavage sites, the new method has shown a remarkable advantage in reducing model complexity and enhancing model robustness.

A Data Warehouse Approach to Maintenance of Integrated Biological Data

by Henrik Engström, Kjartan Asthorsson
A data warehouse can be described as a collection of materialised views over distributed, heterogeneous, and autonomous sources. Although most data warehouse research efforts have focused on business-oriented decision support, many of the general principles apply to other areas. In this paper we analyse how previous work on data warehouse maintenance can be applied to the maintenance of biological data collected from web sources. We have studied the widely used protein sequence database SWISS-PROT and the related classification database PROSITE. The results of this analysis show that these sources, although unsophisticated from a database perspective, provide a rich set of capabilities to support automatic maintenance. Moreover, the complex computations required to combine this type of data imply that incremental maintenance methods are almost always beneficial. This result contrasts with some previous findings reported in the database literature.
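The contrast between recomputation and incremental maintenance can be sketched as follows. This is an illustrative assumption, not the method from the paper: the view maps each sequence accession to the set of patterns it matches, patterns are written as plain regular expressions rather than in PROSITE syntax, and every name (`match_patterns`, `full_recompute`, `apply_delta`, the sample accessions and patterns) is hypothetical.

```python
import re


def match_patterns(sequence, patterns):
    """Return the names of all patterns that occur in the sequence."""
    return {name for name, regex in patterns.items() if re.search(regex, sequence)}


def full_recompute(sequences, patterns):
    """Rebuild the whole materialised view by scanning every sequence."""
    return {acc: match_patterns(seq, patterns) for acc, seq in sequences.items()}


def apply_delta(view, patterns, upserts, deletes):
    """Incrementally maintain the view: rescan only new or changed sequences."""
    for acc in deletes:
        view.pop(acc, None)
    for acc, seq in upserts.items():
        view[acc] = match_patterns(seq, patterns)
    return view


if __name__ == "__main__":
    # Hypothetical patterns and sequences, for illustration only.
    patterns = {"PAT_A": r"N[^P][ST]", "PAT_B": r"C{3}"}
    sequences = {"P1": "AANGSAA", "P2": "CCCC"}
    view = full_recompute(sequences, patterns)
    # One source entry changes and one is removed; only P2 is rescanned.
    view = apply_delta(view, patterns, {"P2": "CCNASG"}, ["P1"])
    print(view)
```

The more expensive the per-sequence matching step is, the more the incremental path saves, since its cost scales with the size of the change rather than the size of the source.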