Results 1 - 10 of 640,138

A Comparative Study on Feature Selection in Text Categorization

by Yiming Yang, Jan O. Pedersen , 1997
"... This paper is a comparative study of feature selection methods in statistical learning of text categorization. The focus is on aggressive dimensionality reduction. Five methods were evaluated, including term selection based on document frequency (DF), information gain (IG), mutual information (MI), ..."
Cited by 1294 (15 self)

Just Relax: Convex Programming Methods for Identifying Sparse Signals in Noise

by Joel A. Tropp , 2006
"... This paper studies a difficult and fundamental problem that arises throughout electrical engineering, applied mathematics, and statistics. Suppose that one forms a short linear combination of elementary signals drawn from a large, fixed collection. Given an observation of the linear combination that ..."
Cited by 496 (2 self)

A comparative analysis of selection schemes used in genetic algorithms

by David E. Goldberg, Kalyanmoy Deb - Foundations of Genetic Algorithms , 1991
"... This paper considers a number of selection schemes commonly used in modern genetic algorithms. Specifically, proportionate reproduction, ranking selection, tournament selection, and Genitor (or "steady state") selection are compared on the basis of solutions to deterministic difference or d ..."
Cited by 512 (32 self)

Estimating Continuous Distributions in Bayesian Classifiers

by George John, Pat Langley - In Proceedings of the Eleventh Conference on Uncertainty in Artificial Intelligence , 1995
"... When modeling a probability distribution with a Bayesian network, we are faced with the problem of how to handle continuous variables. Most previous work has either solved the problem by discretizing, or assumed that the data are generated by a single Gaussian. In this paper we abandon the normality ..."
Cited by 489 (2 self)
the normality assumption and instead use statistical methods for nonparametric density estimation. For a naive Bayesian classifier, we present experimental results on a variety of natural and artificial domains, comparing two methods of density estimation: assuming normality and modeling each conditional

Model-Based Clustering, Discriminant Analysis, and Density Estimation

by Chris Fraley, Adrian E. Raftery - JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION , 2000
"... Cluster analysis is the automated search for groups of related observations in a data set. Most clustering done in practice is based largely on heuristic but intuitively reasonable procedures and most clustering methods available in commercial software are also of this type. However, there is little ..."
Cited by 557 (28 self)
for model-based clustering that provides a principled statistical approach to these issues. We also show that this can be useful for other problems in multivariate analysis, such as discriminant analysis and multivariate density estimation. We give examples from medical diagnosis, minefield detection, cluster

Estimating Wealth Effects without Expenditure Data— or Tears

by Deon Filmer, Lant Pritchett - Policy Research Working Paper 1980, The World , 1998
"... Abstract: We use the National Family Health Survey (NFHS) data collected in Indian states in 1992 and 1993 to estimate the relationship between household wealth and the probability a child (aged 6 to 14) is enrolled in school. A methodological difficulty to overcome is that the NFHS, modeled closely ..."
Cited by 832 (16 self)

Estimating the number of clusters in a dataset via the Gap statistic

by Robert Tibshirani, Guenther Walther, Trevor Hastie , 2000
"... We propose a method (the "Gap statistic") for estimating the number of clusters (groups) in a set of data. The technique uses the output of any clustering algorithm (e.g. k-means or hierarchical), comparing the change in within cluster dispersion to that expected under an appropriate reference ..."
Cited by 492 (1 self)

Local features and kernels for classification of texture and object categories: a comprehensive study

by J. Zhang, S. Lazebnik, C. Schmid - International Journal of Computer Vision , 2007
"... Recently, methods based on local image features have shown promise for texture and object recognition tasks. This paper presents a large-scale evaluation of an approach that represents images as distributions (signatures or histograms) of features extracted from a sparse set of keypoint locations an ..."
Cited by 644 (35 self)
and learns a Support Vector Machine classifier with kernels based on two effective measures for comparing distributions, the Earth Mover’s Distance and the χ² distance. We first evaluate the performance of our approach with different keypoint detectors and descriptors, as well as different kernels

How much should we trust differences-in-differences estimates?

by Marianne Bertrand, Esther Duflo, Sendhil Mullainathan - Quarterly Journal of Economics 119:249–75 , 2004
"... Most papers that employ Differences-in-Differences estimation (DD) use many years of data and focus on serially correlated outcomes but ignore that the resulting standard errors are inconsistent. To illustrate the severity of this issue, we randomly generate placebo laws in state-level data on fema ..."
Cited by 775 (1 self)

Preference Parameters And Behavioral Heterogeneity: An Experimental Approach In The Health And Retirement Study

by Robert B. Barsky, F. Thomas Juster, Miles S. Kimball, Matthew D. Shapiro , 1997
"... This paper reports measures of preference parameters relating to risk tolerance, time preference, and intertemporal substitution. These measures are based on survey responses to hypothetical situations constructed using an economic theorist's concept of the underlying parameters. The individual ..."
Cited by 524 (12 self)
insurance, and holding stocks rather than Treasury bills. These relationships are both statistically and quantitatively significant, although measured risk tolerance explains only a small fraction of the variation of the studied behaviors.