Gene feature selection
Abstract - Cited by 1 (0 self)
This chapter presents an overview of the classes of methods available for feature selection, paying special attention to the problems typical of microarray data processing, where the number of measured genes (factors) is extremely large, on the order of thousands, and the number of relevant factors is much smaller. The main ingredients needed to select an optimal feature set are the search procedures, the underlying optimality criteria, and the procedures for performance evaluation. We discuss here some of the major classes of procedures, which are apparently very different in nature and goals: a typical Bayesian framework, several deterministic settings, and finally information-theoretic methods. Due to space constraints, only the major issues are covered, with the intent to clarify the basic principles and the main options when choosing one of the many existing feature selection methods.
Fast Greedy Searching Informative Genes via Redundancy Bound Bootstrapping
Abstract
The identification of informative genes is very important in the study of genomics. This task can be interpreted as searching for a subset of genes such that an optimal “ratio of quality to price” is achieved. The “quality” refers to the discrimination power of the genes and the “price” means the redundancy involved. This problem is NP-hard. In contrast to many other methods, we discretize the gene expression profiles in this paper and approximate the optimization process by combining individual ranking and sequential forward selection to greedily search for informative genes in the context of a mathematical optimization formulation. The bootstrapping technique is employed to optimize a key parameter involved, namely the redundancy bound, which reduces the greedy search cost substantially. The performance is evaluated and compared with previous results on publicly available microarray datasets.
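The combination of individual ranking and sequential forward selection under a redundancy bound might be sketched roughly as follows. This is only an illustration under stated assumptions, not the paper's actual procedure: mutual information is assumed as both the relevance and the redundancy measure, the names `mutual_information`, `greedy_select`, `redundancy_bound`, and `max_genes` are hypothetical, and the bootstrapping step that tunes the bound is omitted.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Mutual information (in bits) between two equal-length discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    # Sum over observed pairs of p(x, y) * log2(p(x, y) / (p(x) * p(y))).
    return sum(
        (c / n) * log2(c * n / (px[x] * py[y]))
        for (x, y), c in pxy.items()
    )


def greedy_select(genes, labels, redundancy_bound, max_genes):
    """Pick up to max_genes gene names from a dict of discretized profiles.

    genes maps a gene name to its discretized expression levels (one per sample).
    The relevance and redundancy measures here are assumptions for illustration.
    """
    # Step 1: individual ranking by relevance to the class labels.
    ranked = sorted(
        genes,
        key=lambda g: mutual_information(genes[g], labels),
        reverse=True,
    )
    selected = []
    # Step 2: forward selection in rank order, skipping any candidate whose
    # redundancy with an already selected gene exceeds the bound.
    for g in ranked:
        if len(selected) >= max_genes:
            break
        if all(
            mutual_information(genes[g], genes[s]) <= redundancy_bound
            for s in selected
        ):
            selected.append(g)
    return selected
```

In this sketch a tighter bound rejects more candidates and so yields a less redundant subset; choosing that bound well is the role the abstract assigns to bootstrapping.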
Machine learning in bioinformatics
Briefings in Bioinformatics, Vol. 7, No. 1, 86-112, doi:10.1093/bib/bbk007, 2005
Abstract
This article reviews machine learning methods for bioinformatics. It presents modelling methods, such as supervised classification, clustering and probabilistic graphical models for knowledge discovery, as well as deterministic and stochastic heuristics for optimization. Applications in genomics, proteomics, systems biology, evolution and text mining are also shown.

