
LOTUS: An algorithm for building accurate and comprehensible logistic regression trees

by K.-Y. Chan, W.-Y. Loh
Venue: J Comput Graph Stat

Results 1 - 8 of 8

LASSO-Patternsearch Algorithm with Application to Ophthalmology and Genomic Data

by Weiliang Shi, Grace Wahba, Stephen Wright, Kristine Lee, Ronald Klein, Barbara Klein, 2008
Cited by 10 (8 self)
The LASSO-Patternsearch algorithm is proposed to efficiently identify patterns of multiple dichotomous risk factors for outcomes of interest in demographic and genomic studies. The patterns considered are those that arise naturally from the log linear expansion of the multivariate Bernoulli density. The method is designed for the case where there is a possibly very large number of candidate patterns but it is believed that only a relatively small number are important. A LASSO is used to greatly reduce the number of candidate patterns, using a novel computational algorithm that can handle an extremely large number of unknowns simultaneously. The patterns surviving the LASSO are further pruned in the framework of (parametric) generalized linear models. A novel tuning procedure based on the GACV for Bernoulli outcomes, modified to act

LASSO-Patternsearch Algorithm with Application to Ophthalmology and Genomic Data

by Weiliang Shi, Grace Wahba, Stephen Wright, Kristine Lee, Ronald Klein, Barbara Klein, 2008
The LASSO-Patternsearch algorithm is proposed as a two-step method to identify clusters or patterns of multiple risk factors for outcomes of interest in demographic and genomic studies. The predictor variables are dichotomous or can be coded as dichotomous. Many diseases are suspected of having multiple interacting risk factors acting in concert, and it is of much interest to uncover higher order interactions or risk patterns when they exist. The patterns considered here are those that arise naturally from the log linear expansion of the multivariate Bernoulli density. The method is designed for the case where there is a possibly very large number of candidate patterns but it is believed that only a relatively small number are important. A LASSO is used to greatly reduce the number of candidate patterns, using a novel computational algorithm that can handle an extremely large number of unknowns simultaneously. Then the patterns surviving the LASSO are further pruned in the framework of (parametric) generalized linear models. A novel tuning procedure based on the GACV for Bernoulli

06-0095. LASSO-Patternsearch Algorithm By

by Weiliang Shi, 2008
The LASSO-Patternsearch Algorithm and its variant the Grouped LASSO-Patternsearch Algorithm are proposed to efficiently identify patterns of multiple dichotomous risk factors for outcomes of interest in demographic and genomic studies. The patterns considered are those that arise naturally from the log linear expansion of the multivariate Bernoulli density. Both methods are designed for the case where there is a possibly very large number of candidate patterns but it is believed that only a relatively small number are important. In the LASSO-Patternsearch Algorithm, a LASSO is used to greatly reduce the number of candidate patterns, using a novel computational algorithm that can handle an extremely large number of unknowns simultaneously. The patterns surviving the LASSO are further pruned in the framework of (parametric) generalized linear models. A novel tuning procedure based on the GACV for Bernoulli outcomes, modified to act as a model selector, is used at both steps. We applied the method to myopia data from the population-based Beaver Dam Eye Study, exposing physiologically interesting interacting risk factors. We then

On Oblique Random Forests

by Bjoern H. Menze, B. Michael Kelm, Daniel N. Splitthoff, Ullrich Koethe, Fred A. Hamprecht
In his original paper on random forests, Breiman proposed two different decision tree ensembles: one generated from “orthogonal” trees with thresholds on individual features in every split, and one from “oblique” trees separating the feature space by randomly oriented hyperplanes. In spite of a rising interest in the random forest framework, however, ensembles built from orthogonal trees (RF) have gained most, if not all, attention so far. In the present work we propose to employ “oblique” random forests (oRF) built from multivariate trees which explicitly learn optimal split directions at internal nodes using linear discriminative models, rather than using random coefficients as in the original oRF. This oRF outperforms RF, as well as other classifiers, on nearly all data sets but those with discrete factorial features. Learned node models perform distinctly better than random splits. An oRF feature importance score proves preferable to standard RF feature importance scores such as Gini or permutation importance. The topology of the oRF decision space appears to be smoother and better adapted to the data, resulting in improved generalization performance. Overall, the oRF proposed here may be preferred over standard RF on most learning tasks involving numerical and spectral data.

22 Evolutionary Algorithms in Decision Tree Induction

by Francesco Mola, Raffaele Miele
One of the biggest problems that many data analysis techniques have to deal with today is combinatorial optimization, which in the past led many methods to be set aside. The higher (though still insufficient) computing power now available makes it possible to apply such techniques within certain bounds. Since other research fields like Artificial

Stepwise Induction of Logistic Model Trees

by Michelangelo Ceci, Donato Malerba, Savino Saponara
In statistics, logistic regression is a regression model to predict a binomially distributed response variable. Recent research has investigated the opportunity of combining logistic regression with decision tree learners. Following this idea, we propose a novel Logistic Model Tree induction system, SILoRT, which induces trees with two types of nodes: regression nodes, which perform only univariate logistic regression, and splitting nodes, which partition the feature space. The multiple regression model associated with a leaf is then built stepwise by combining univariate logistic regressions along the path from the root to the leaf. Internal regression nodes contribute to the definition of multiple models and have a global effect, while univariate regressions at leaves have only local effects. Experimental results are reported.

LASSO-Patternsearch Algorithm with Application to Ophthalmology Data

by Weiliang Shi, Grace Wahba, Stephen Wright, Kristine Lee, Ronald Klein, Barbara Klein, 2008
The LASSO-Patternsearch is proposed as a two-stage procedure to identify clusters of multiple risk factors for outcomes of interest in large demographic studies, when the predictor variables are dichotomous or take on values in a small finite set. Many diseases are suspected of having multiple interacting risk factors acting in concert, and it is of much interest to uncover higher order interactions when they exist. The method is related to Zhang et al. (2004), except that variable flexibility is sacrificed to allow entertaining models with high as well as low order interactions among multiple predictors. A LASSO is used to select important patterns, being applied conservatively to have a high rate of retention of true patterns, while allowing some noise. Then the patterns selected by the LASSO are tested in the framework of (parametric) generalized linear models to reduce the noise. Notably, the patterns are those that arise naturally from the log linear expansion of the multivariate Bernoulli density. Separate tuning procedures are proposed for the LASSO step and then the parametric step and a novel

BMC Proceedings BioMed Central

by Radoslav Z Nickolov, Valentin B Milanov, 2007
Proceedings Logistic regression trees for initial selection of interesting loci in case-control studies