Results 11 – 20 of 226
Building Domain-Specific Search Engines with Machine Learning Techniques
, 1999
Abstract

Cited by 63 (6 self)
Domain-specific search engines are becoming increasingly popular because they offer increased accuracy and extra features not possible with the general, Web-wide search engines. For example, www.campsearch.com allows complex queries by age-group, size, location and cost over summer camps. Unfortunately, these domain-specific search engines are difficult and time-consuming to maintain. This paper proposes the use of machine learning techniques to greatly automate the creation and maintenance of domain-specific search engines. We describe new research in reinforcement learning, text classification and information extraction that automates efficient spidering, populating topic hierarchies, and identifying informative text segments. Using these techniques, we have built a demonstration system: a search engine for computer science research papers. It already contains over 33,000 papers and is publicly available at www.cora.jprc.com.
Bayesian Model Selection and Model Averaging
, 1999
Abstract

Cited by 57 (0 self)
This paper reviews the Bayesian approach to model selection and model averaging. In this review, I emphasize objective Bayesian methods based on noninformative priors. I will also discuss implementation details, approximations and relationships to other methods. KEY WORDS AND PHRASES: AIC, Bayes Factors, BIC, Consistency, Default Bayes Methods, Markov Chain Monte Carlo.
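The default Bayesian methods the review covers include the Schwarz (BIC) approximation to Bayes factors. As a minimal sketch (not from the paper), the snippet below compares a linear and a quadratic regression model on synthetic data; the model names, coefficients, and sample size are illustrative assumptions. The difference in BIC, Δ = BIC₂ − BIC₁, gives the approximate Bayes factor BF₂₁ ≈ exp(−Δ/2), so a strongly negative Δ favours the larger model.

```python
import numpy as np

def bic(y, X):
    """BIC for a Gaussian linear model with design matrix X:
    n*log(RSS/n) + k*log(n), up to an additive constant."""
    n, k = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = np.sum((y - X @ beta) ** 2)
    return n * np.log(rss / n) + k * np.log(n)

rng = np.random.default_rng(0)
n = 200
x = rng.normal(size=n)
# Data genuinely contain a quadratic term, so M2 should win.
y = 1.0 + 2.0 * x + 3.0 * x ** 2 + rng.normal(size=n)

X1 = np.column_stack([np.ones(n), x])              # M1: intercept + slope
X2 = np.column_stack([np.ones(n), x, x ** 2])      # M2: adds a quadratic term

delta = bic(y, X2) - bic(y, X1)
# Schwarz approximation: BF_21 ~ exp(-delta/2); delta << 0 favours M2
print(delta < 0)
```

Because BIC ignores the prior entirely, it is only a rough ("default") stand-in for a full Bayes factor, which is one of the trade-offs the review discusses.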
Trans-dimensional Markov chain Monte Carlo
 in Highly Structured Stochastic Systems
, 2003
Abstract

Cited by 56 (0 self)
In the context of sample-based computation of Bayesian posterior distributions in complex stochastic systems, this chapter discusses some of the uses for a Markov chain with a prescribed invariant distribution whose support is a union of Euclidean spaces of differing dimensions. This leads into a reformulation of the reversible jump MCMC framework for constructing such ‘trans-dimensional’ Markov chains. This framework is compared to alternative approaches for the same task, including methods that involve separate sampling within different fixed-dimension models. We consider some of the difficulties researchers have encountered with obtaining adequate performance with some of these methods, attributing some of these to misunderstandings, and offer tentative recommendations about algorithm choice for various classes of problem. The chapter concludes with a look towards desirable future developments.
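A minimal sketch of a trans-dimensional sampler, under assumptions not taken from the chapter: the sampler jumps between M0 (a unit-variance Gaussian with mean fixed at 0) and M1 (mean μ free, with a N(0, τ²) prior). Proposing μ from its prior makes the proposal density cancel the prior in the reversible-jump acceptance ratio (Jacobian = 1), leaving only the likelihood ratio; a fixed-dimension random walk then updates μ within M1.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy data with a clearly non-zero mean, so the chain should settle in M1.
n = 50
x = rng.normal(1.0, 1.0, size=n)
sx = x.sum()

def loglik(mu):
    # unit-variance Gaussian log-likelihood, additive constants dropped
    return mu * sx - 0.5 * n * mu ** 2

tau = 1.0                  # N(0, tau^2) prior on mu under M1
model, mu = 0, 0.0         # M0: mu fixed at 0;  M1: mu free
iters, visits_m1 = 5000, 0
for _ in range(iters):
    # Trans-dimensional move: birth (M0 -> M1) or death (M1 -> M0).
    if model == 0:
        cand = rng.normal(0.0, tau)          # propose mu from its prior
        if np.log(rng.uniform()) < loglik(cand) - loglik(0.0):
            model, mu = 1, cand
    else:
        if np.log(rng.uniform()) < loglik(0.0) - loglik(mu):
            model, mu = 0, 0.0
    # Fixed-dimension random-walk update of mu while in M1.
    if model == 1:
        cand = mu + 0.3 * rng.normal()
        logr = (loglik(cand) - 0.5 * cand ** 2 / tau ** 2) \
             - (loglik(mu) - 0.5 * mu ** 2 / tau ** 2)
        if np.log(rng.uniform()) < logr:
            mu = cand
    visits_m1 += model

post_m1 = visits_m1 / iters   # estimated posterior probability of M1
print(post_m1)
```

With the mean this far from zero the chain spends almost all its time in M1; in real problems with many models and higher-dimensional jumps, designing proposals with reasonable acceptance rates is exactly the difficulty the chapter examines.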
On Block Updating in Markov Random Field Models For . . .
 SCANDINAVIAN JOURNAL OF STATISTICS
, 2002
Abstract

Cited by 51 (7 self)
Gaussian Markov random field (GMRF) models are commonly used to model spatial correlation in disease mapping applications. For Bayesian inference by MCMC, so far mainly single-site updating algorithms have been considered. However, convergence and mixing properties of such algorithms can be extremely poor due to strong dependencies of parameters in the posterior distribution. In this paper, we propose various block sampling algorithms in order to improve the MCMC performance. The methodology is rather general, allows for non-standard full conditionals, and can be applied in a modular fashion in a large number of different scenarios. For illustration we consider three different applications: two formulations for spatial modelling of a single disease (with and without additional unstructured parameters respectively), and one formulation for the joint analysis of two diseases. The results indicate that the largest benefits are obtained if parameters and the corresponding hyperparameter are updated jointly in one large block. Implementation of such block algorithms is relatively easy using methods for fast sampling of Gaussian Markov random fields (Rue 2001). By comparison, Monte Carlo estimates based on single-site updating can be rather misleading, even for very long runs. Our results may have wider relevance for efficient MCMC simulation in hierarchical models with Markov random field components.
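The fast-sampling ingredient the abstract cites (Rue 2001) rests on a simple identity: if the precision matrix factors as Q = LLᵀ and z ~ N(0, I), then solving Lᵀx = z yields x ~ N(0, Q⁻¹). A minimal dense-matrix sketch (the real method exploits sparsity of Q; the chain-graph precision below is an illustrative assumption):

```python
import numpy as np

def sample_gmrf(Q, rng):
    """Draw x ~ N(0, Q^{-1}) via the Cholesky factor of the precision Q:
    with Q = L L^T and z ~ N(0, I), solving L^T x = z gives Cov(x) = Q^{-1}."""
    L = np.linalg.cholesky(Q)
    z = rng.normal(size=Q.shape[0])
    return np.linalg.solve(L.T, z)

# Toy precision: first-order random walk on a chain of 30 nodes,
# plus a small diagonal term to make Q positive definite.
n = 30
D = np.diff(np.eye(n), axis=0)          # (n-1) x n difference matrix
Q = D.T @ D + 0.1 * np.eye(n)

rng = np.random.default_rng(0)
xs = np.array([sample_gmrf(Q, rng) for _ in range(20000)])
print(xs.shape)
```

Sampling the whole field in one draw like this, jointly with its hyperparameter, is what makes the paper's large-block updates practical, in contrast to sweeping over sites one at a time.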
Hierarchical Spatio-Temporal Mapping of Disease Rates
 Journal of the American Statistical Association
, 1996
Abstract

Cited by 51 (7 self)
Maps of regional morbidity and mortality rates are useful tools in determining spatial patterns of disease. Combined with socio-demographic census information, they also permit assessment of environmental justice, i.e., whether certain subgroups suffer disproportionately from certain diseases or other adverse effects of harmful environmental exposures. Bayes and empirical Bayes methods have proven useful in smoothing crude maps of disease risk, eliminating the instability of estimates in low-population areas while maintaining geographic resolution. In this paper we extend existing hierarchical spatial models to account for temporal effects and spatio-temporal interactions. Fitting the resulting highly parametrized models requires careful implementation of Markov chain Monte Carlo (MCMC) methods, as well as novel techniques for model evaluation and selection. We illustrate our approach using a dataset of county-specific lung cancer rates in the state of Ohio during the period 1968–1988...
Using Unlabeled Data to Improve Text Classification
, 2001
Abstract

Cited by 49 (0 self)
One key difficulty with text classification learning algorithms is that they require many hand-labeled examples to learn accurately. This dissertation demonstrates that supervised learning algorithms that use a small number of labeled examples and many inexpensive unlabeled examples can create high-accuracy text classifiers. By assuming that documents are created by a parametric generative model, Expectation-Maximization (EM) finds local maximum a posteriori models and classifiers from all the data, labeled and unlabeled. These generative models do not capture all the intricacies of text; however, on some domains this technique substantially improves classification accuracy, especially when labeled data are sparse. Two problems arise from this basic approach. First, unlabeled data can hurt performance in domains where the generative modeling assumptions are too strongly violated. In this case the assumptions can be made more representative in two ways: by modeling sub-topic class structure, and by modeling super-topic hierarchical class relationships. By doing so, model probability and classification accuracy come into correspondence, allowing unlabeled data to improve classification performance. The second problem is that even with a representative model, the improvements given by unlabeled data do not sufficiently compensate for a paucity of labeled data. Here, limited labeled data provide EM initializations that lead to low-probability models. Performance can be significantly improved by using active learning to select high-quality initializations, and by using alternatives to EM that avoid low-probability local maxima.
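A minimal analogue of the EM scheme described above, with a one-dimensional Gaussian generative model standing in for the dissertation's naive Bayes text model (the class means, sample sizes, and labeled points are illustrative assumptions): a handful of labeled points initialize the parameters, and each EM iteration combines hard labeled counts with soft responsibilities over the unlabeled data.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two well-separated classes; only four points carry labels.
x0 = rng.normal(-2.0, 1.0, size=200)
x1 = rng.normal(+2.0, 1.0, size=200)
x = np.concatenate([x0, x1])                 # unlabeled pool
labeled_x = np.array([-2.5, -1.8, 2.1, 2.4])
labeled_y = np.array([0, 0, 1, 1])

# Initialise class means from the labeled data alone.
mu = np.array([labeled_x[labeled_y == c].mean() for c in (0, 1)])
pi = np.array([0.5, 0.5])

for _ in range(50):
    # E-step: responsibilities of each class for each unlabeled point
    # (unit-variance Gaussian likelihood, log-space for stability).
    logp = -0.5 * (x[:, None] - mu[None, :]) ** 2 + np.log(pi)
    logp -= logp.max(axis=1, keepdims=True)
    resp = np.exp(logp)
    resp /= resp.sum(axis=1, keepdims=True)
    # M-step: pool hard labeled counts with soft unlabeled counts.
    for c in (0, 1):
        hard = labeled_y == c
        w = resp[:, c].sum() + hard.sum()
        mu[c] = (resp[:, c] @ x + labeled_x[hard].sum()) / w
        pi[c] = w / (len(x) + len(labeled_x))

# Classify the unlabeled pool by nearest estimated class mean.
pred = (np.abs(x - mu[1]) < np.abs(x - mu[0])).astype(int)
truth = np.concatenate([np.zeros(200, int), np.ones(200, int)])
acc = float((pred == truth).mean())
print(round(acc, 2))
```

When the generative assumption fits, as here, the unlabeled data sharpen the parameter estimates well beyond what four labeled points allow; the dissertation's first problem arises exactly when that assumption is badly violated.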
Random-effects analysis
 In
, 2004
Abstract

Cited by 44 (4 self)
of the structural measures of flexibility and agility using a measurement theoretical framework
Using Speakers’ Referential Intentions to Model Early Cross-Situational Word Learning
 PSYCHOLOGICAL SCIENCE
, 2009
Abstract

Cited by 41 (4 self)
Word learning is a “chicken and egg” problem. If a child could understand speakers’ utterances, it would be easy to learn the meanings of individual words, and once a child knows what many words mean, it is easy to infer speakers’ intended meanings. To the beginning learner, however, both individual word meanings and speakers’ intentions are unknown. We describe a computational model of word learning that solves these two inference problems in parallel, rather than relying exclusively on either the inferred meanings of utterances or cross-situational word-meaning associations. We tested our model using annotated corpus data and found that it inferred pairings between words and object concepts with higher precision than comparison models. Moreover, as the result of making probabilistic inferences about speakers’ intentions, our model explains a variety of behavioral phenomena described in the word-learning literature. These phenomena include mutual exclusivity, one-trial learning, cross-situational learning, the role of words in object individuation, and the use of inferred intentions to disambiguate reference.
A Novel Evolutionary Data Mining Algorithm With Applications to Churn Prediction
, 2003
Abstract

Cited by 40 (4 self)
Classification is an important topic in data mining research. Given a set of data records, each of which belongs to one of a number of predefined classes, the classification problem is concerned with the discovery of classification rules that can allow records with unknown class membership to be correctly classified. Many algorithms have been developed to mine large data sets for classification models, and they have been shown to be very effective. However, many of them are not designed to determine the likelihood of each classification made, and so they are not readily applicable to such problems as churn prediction. For such an application, the goal is not only to predict whether or not a subscriber will switch from one carrier to another; it is also important that the likelihood of the subscriber's doing so be predicted. The reason is that a carrier can then choose to provide special personalized offers and services to those subscribers predicted with a higher likelihood to churn. Given its importance, we propose a new data mining algorithm, called data mining by evolutionary learning (DMEL), to handle classification problems in which the accuracy of each prediction made has to be estimated. In performing its tasks, DMEL searches through the possible rule space using an evolutionary approach that has the following characteristics: 1) the evolutionary process begins with the generation of an initial set of first-order rules (i.e., rules with one conjunct/condition) using a probabilistic induction technique, and based on these rules, rules of higher order (two or more conjuncts) are obtained iteratively; 2) when identifying interesting rules, an objective interestingness measure is used; 3) the fitness of a ch...
Spatially-adaptive penalties for spline fitting
 Australian and New Zealand Journal of Statistics
, 2000
Abstract

Cited by 34 (6 self)
We study spline fitting with a roughness penalty that adapts to spatial heterogeneity in the regression function. Our estimates are pth-degree piecewise polynomials with p − 1 continuous derivatives. A large and fixed number of knots is used, and smoothing is achieved by putting a quadratic penalty on the jumps of the pth derivative at the knots. To be spatially adaptive, the logarithm of the penalty is itself a linear spline, but with relatively few knots and with values at the knots chosen to minimize GCV. This locally-adaptive spline estimator is compared with other spline estimators in the literature, such as cubic smoothing splines and knot-selection techniques for least-squares regression. Our estimator can be interpreted as an empirical Bayes estimate for a prior allowing spatial heterogeneity. In cases of spatially heterogeneous regression functions,
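The non-adaptive starting point of this construction can be sketched in a few lines, under illustrative assumptions (cubic truncated power basis, 20 knots at quantiles, a single global penalty λ rather than the paper's spatially varying log-penalty spline): polynomial coefficients are left unpenalized, and a ridge penalty on the knot coefficients penalizes the jumps of the pth derivative at the knots.

```python
import numpy as np

def pspline_fit(x, y, num_knots=20, degree=3, lam=1.0):
    """Penalized spline via a truncated power basis. Polynomial terms are
    unpenalized; a quadratic (ridge) penalty on the knot coefficients
    penalizes the degree-th derivative jumps at the knots. A single global
    lam stands in for the paper's spatially-adaptive penalty."""
    knots = np.quantile(x, np.linspace(0, 1, num_knots + 2)[1:-1])
    B = np.column_stack([x ** d for d in range(degree + 1)] +
                        [np.clip(x - k, 0, None) ** degree for k in knots])
    pen = np.diag([0.0] * (degree + 1) + [1.0] * num_knots)
    coef = np.linalg.solve(B.T @ B + lam * pen, B.T @ y)
    return B @ coef

rng = np.random.default_rng(0)
x = np.sort(rng.uniform(0, 1, 300))
y = np.sin(2 * np.pi * x) + rng.normal(0, 0.2, size=300)

fit = pspline_fit(x, y, lam=0.1)
rmse = float(np.sqrt(np.mean((fit - np.sin(2 * np.pi * x)) ** 2)))
print(rmse < 0.1)
```

Replacing the scalar `lam` with exp of a linear spline in x, with its knot values chosen by GCV, gives the locally-adaptive estimator the abstract describes.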