Results 11–20 of 459
A dynamic Bayesian network click model for web search ranking
In WWW, 2009
Abstract

Cited by 112 (10 self)
As with any application of machine learning, web search ranking requires labeled data. The labels usually come in the form of relevance assessments made by editors. Click logs can also provide an important source of implicit feedback and can be used as a cheap proxy for editorial labels. The main difficulty, however, comes from the so-called position bias: URLs appearing in lower positions are less likely to be clicked even if they are relevant. In this paper, we propose a Dynamic Bayesian Network which aims at providing an unbiased estimation of relevance from the click logs. Experiments show that the proposed click model outperforms other existing click models in predicting both clickthrough rate and relevance.
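The position-bias problem the abstract describes can be illustrated with the simpler examination hypothesis, a special case that models like the DBN generalize: P(click) = relevance × P(position examined). The sketch below uses made-up relevance and position-bias numbers (not from the paper) to show that naive CTR penalizes lower-ranked URLs while dividing out the examination probability recovers relevance.

```python
import random

random.seed(0)

# Examination hypothesis: P(click | url u at position p) = relevance[u] * examination[p].
examination = [1.0, 0.6, 0.3]                # hypothetical position-bias curve
relevance = {"a": 0.8, "b": 0.8, "c": 0.8}   # three equally relevant URLs

clicks = {u: 0 for u in relevance}
shows = {u: 0 for u in relevance}
corrected = {u: 0.0 for u in relevance}      # expected number of examinations

for _ in range(100_000):
    order = ["a", "b", "c"]                  # fixed ranking: "c" always last
    for pos, u in enumerate(order):
        shows[u] += 1
        corrected[u] += examination[pos]
        if random.random() < relevance[u] * examination[pos]:
            clicks[u] += 1

for u in relevance:
    naive_ctr = clicks[u] / shows[u]         # biased: penalizes low positions
    unbiased = clicks[u] / corrected[u]      # divide out examination probability
    print(u, round(naive_ctr, 2), round(unbiased, 2))
```

All three URLs are equally relevant, yet the naive CTR of "c" is far below that of "a"; the corrected estimate is close to 0.8 for all three.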
Using Speakers’ Referential Intentions to Model Early Cross-Situational Word Learning
In PSYCHOLOGICAL SCIENCE, 2009
Abstract

Cited by 95 (22 self)
Word learning is a “chicken and egg” problem. If a child could understand speakers’ utterances, it would be easy to learn the meanings of individual words, and once a child knows what many words mean, it is easy to infer speakers’ intended meanings. To the beginning learner, however, both individual word meanings and speakers’ intentions are unknown. We describe a computational model of word learning that solves these two inference problems in parallel, rather than relying exclusively on either the inferred meanings of utterances or cross-situational word-meaning associations. We tested our model using annotated corpus data and found that it inferred pairings between words and object concepts with higher precision than comparison models. Moreover, as the result of making probabilistic inferences about speakers’ intentions, our model explains a variety of behavioral phenomena described in the word-learning literature. These phenomena include mutual exclusivity, one-trial learning, cross-situational learning, the role of words in object individuation, and the use of inferred intentions to disambiguate reference.
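The cross-situational association idea that this paper's intentional model is contrasted with can be sketched as a plain co-occurrence counter: across ambiguous situations, a word ends up co-occurring with its true referent more often than with anything else. The toy vocabulary and situations below are invented for illustration; this is a baseline of the kind the paper compares against, not the paper's model.

```python
from collections import defaultdict

# (objects present in the scene, words heard) for a few ambiguous situations
situations = [
    ({"dog", "ball"}, {"dog", "ball"}),
    ({"dog", "cup"}, {"dog", "cup"}),
    ({"ball", "cup"}, {"ball", "cup"}),
]

# Count word-object co-occurrences across all situations.
counts = defaultdict(lambda: defaultdict(int))
for objects, words in situations:
    for w in words:
        for o in objects:
            counts[w][o] += 1

# A word co-occurs with its referent in every situation where it is heard,
# and with each distractor object less often, so the argmax recovers it.
lexicon = {w: max(counts[w], key=counts[w].get) for w in counts}
print(lexicon)
```

No single situation is unambiguous, yet aggregating over three of them pairs each word with its object.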
Trans-dimensional Markov chain Monte Carlo
In Highly Structured Stochastic Systems, 2003
Abstract

Cited by 91 (0 self)
In the context of sample-based computation of Bayesian posterior distributions in complex stochastic systems, this chapter discusses some of the uses for a Markov chain with a prescribed invariant distribution whose support is a union of Euclidean spaces of differing dimensions. This leads into a reformulation of the reversible jump MCMC framework for constructing such ‘trans-dimensional’ Markov chains. This framework is compared to alternative approaches to the same task, including methods that involve separate sampling within different fixed-dimension models. We consider some of the difficulties researchers have encountered in obtaining adequate performance with some of these methods, attributing some of these to misunderstandings, and offer tentative recommendations about algorithm choice for various classes of problem. The chapter concludes with a look towards desirable future developments.
On Block Updating in Markov Random Field Models for ...
In SCANDINAVIAN JOURNAL OF STATISTICS, 2002
Abstract

Cited by 85 (8 self)
Gaussian Markov random field (GMRF) models are commonly used to model spatial correlation in disease mapping applications. For Bayesian inference by MCMC, so far mainly single-site updating algorithms have been considered. However, convergence and mixing properties of such algorithms can be extremely poor due to strong dependencies of parameters in the posterior distribution. In this paper, we propose various block sampling algorithms in order to improve the MCMC performance. The methodology is rather general, allows for non-standard full conditionals, and can be applied in a modular fashion in a large number of different scenarios. For illustration we consider three different applications: two formulations for spatial modelling of a single disease (with and without additional unstructured parameters respectively), and one formulation for the joint analysis of two diseases. The results indicate that the largest benefits are obtained if parameters and the corresponding hyperparameter are updated jointly in one large block. Implementation of such block algorithms is relatively easy using methods for fast sampling of Gaussian Markov random fields (Rue 2001). By comparison, Monte Carlo estimates based on single-site updates can be rather misleading, even for very long runs. Our results may have wider relevance for efficient MCMC simulation in hierarchical models with Markov random field components.
Building Domain-Specific Search Engines with Machine Learning Techniques
1999
Abstract

Cited by 77 (6 self)
Domain-specific search engines are becoming increasingly popular because they offer increased accuracy and extra features not possible with the general, Web-wide search engines. For example, www.campsearch.com allows complex queries by age group, size, location and cost over summer camps. Unfortunately, these domain-specific search engines are difficult and time-consuming to maintain. This paper proposes the use of machine learning techniques to greatly automate the creation and maintenance of domain-specific search engines. We describe new research in reinforcement learning, text classification and information extraction that automates efficient spidering, populating topic hierarchies, and identifying informative text segments. Using these techniques, we have built a demonstration system: a search engine for computer science research papers. It already contains over 33,000 papers and is publicly available at www.cora.jprc.com.
Hierarchical Spatio-Temporal Mapping of Disease Rates
In Journal of the American Statistical Association, 1996
Abstract

Cited by 75 (7 self)
Maps of regional morbidity and mortality rates are useful tools in determining spatial patterns of disease. Combined with sociodemographic census information, they also permit assessment of environmental justice, i.e., whether certain subgroups suffer disproportionately from certain diseases or other adverse effects of harmful environmental exposures. Bayes and empirical Bayes methods have proven useful in smoothing crude maps of disease risk, eliminating the instability of estimates in low-population areas while maintaining geographic resolution. In this paper we extend existing hierarchical spatial models to account for temporal effects and spatio-temporal interactions. Fitting the resulting highly parametrized models requires careful implementation of Markov chain Monte Carlo (MCMC) methods, as well as novel techniques for model evaluation and selection. We illustrate our approach using a dataset of county-specific lung cancer rates in the state of Ohio during the period 1968–1988...
Using Unlabeled Data to Improve Text Classification
2001
Abstract

Cited by 70 (0 self)
One key difficulty with text classification learning algorithms is that they require many hand-labeled examples to learn accurately. This dissertation demonstrates that supervised learning algorithms that use a small number of labeled examples and many inexpensive unlabeled examples can create high-accuracy text classifiers. By assuming that documents are created by a parametric generative model, Expectation-Maximization (EM) finds local maximum a posteriori models and classifiers from all the data, labeled and unlabeled. These generative models do not capture all the intricacies of text; however, on some domains this technique substantially improves classification accuracy, especially when labeled data are sparse. Two problems arise from this basic approach. First, unlabeled data can hurt performance in domains where the generative modeling assumptions are too strongly violated. In this case the assumptions can be made more representative in two ways: by modeling sub-topic class structure, and by modeling super-topic hierarchical class relationships. By doing so, model probability and classification accuracy come into correspondence, allowing unlabeled data to improve classification performance. The second problem is that even with a representative model, the improvements given by unlabeled data do not sufficiently compensate for a paucity of labeled data. Here, limited labeled data provide EM initializations that lead to low-probability models. Performance can be significantly improved by using active learning to select high-quality initializations, and by using alternatives to EM that avoid low-probability local maxima.
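The basic labeled-plus-unlabeled EM scheme the abstract describes can be sketched in a few dozen lines for a two-class Bernoulli naive Bayes: initialize from the labeled documents, then alternate soft-labeling the unlabeled documents (E-step) with re-estimating parameters from all the data (M-step). The synthetic data, dimensions, and smoothing choices below are mine, not the dissertation's.

```python
import numpy as np

rng = np.random.default_rng(0)
n_feat = 20
# True per-class feature probabilities for generating synthetic "documents"
true_theta = np.vstack([rng.uniform(0.1, 0.4, n_feat),
                        rng.uniform(0.6, 0.9, n_feat)])

def sample(n, c):
    return (rng.random((n, n_feat)) < true_theta[c]).astype(float)

X_lab = np.vstack([sample(5, 0), sample(5, 1)])      # few labeled docs
y_lab = np.array([0] * 5 + [1] * 5)
X_unl = np.vstack([sample(200, 0), sample(200, 1)])  # many unlabeled docs

def log_joint(X, theta, prior):
    # log P(x, c) under a Bernoulli naive Bayes model
    return X @ np.log(theta).T + (1 - X) @ np.log(1 - theta).T + np.log(prior)

# Initialize from labeled data only, with add-one smoothing
theta = np.vstack([(X_lab[y_lab == c].sum(0) + 1) / (np.sum(y_lab == c) + 2)
                   for c in (0, 1)])
prior = np.array([0.5, 0.5])

for _ in range(20):
    # E-step: soft class responsibilities for unlabeled docs
    ll = log_joint(X_unl, theta, prior)
    resp = np.exp(ll - ll.max(1, keepdims=True))
    resp /= resp.sum(1, keepdims=True)
    # Labeled docs keep their (hard) labels
    R = np.vstack([np.eye(2)[y_lab], resp])
    X = np.vstack([X_lab, X_unl])
    # M-step: re-estimate parameters from all the data
    counts = R.sum(0)
    theta = (R.T @ X + 1) / (counts[:, None] + 2)
    prior = counts / counts.sum()

acc = (log_joint(X_unl, theta, prior).argmax(1)
       == np.array([0] * 200 + [1] * 200)).mean()
print(round(acc, 3))
```

Because EM is initialized from the labeled documents, the class identities stay anchored and the unlabeled documents sharpen the parameter estimates rather than relabeling the classes.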
A comparison of Bayesian and likelihood-based methods for fitting multilevel models
Abstract

Cited by 69 (7 self)
We focus in this paper on the likelihood-based (and approximate-likelihood) methods most readily available (given current usage patterns of existing software) to statisticians and substantive researchers making frequent use of multilevel models: ML and REML in VC models, and MQL and PQL in RELR models. Other promising likelihood-based approaches, including (a) methods based on Gaussian quadrature (e.g., Pinheiro and Bates 1995); (b) the nonparametric maximum likelihood methods of Aitkin (1999a); (c) the Laplace-approximation approach of Raudenbush et al. (1999); (d) the work on hierarchical generalised linear models of Lee and Nelder (2000); and (e) interval estimation based on ranges of values of the parameters for which the log likelihood is within a certain distance of its maximum, for instance using profile likelihood (e.g., Longford 2000), are not addressed here. It is evident from the recent applied literature that, from the point of view of multilevel analyses currently being conducted to inform educational and health policy choices and other substantive decisions, the use of methods (a)–(e) is not (yet) as widespread as REML and quasi-likelihood approaches. Statisticians are well aware that the highly skewed repeated-sampling distributions of ML estimators of random-effects variances in multilevel models with small sample sizes are not likely to lead to good coverage properties for large-sample Gaussian approximate interval estimates of the form σ̂² ± 1.96 ŜE(σ̂²), but with few exceptions the profession has not (yet) responded to this by making software for improved likelihood interval estimates widely available to multilevel modellers. In Sections 3 and 4 we document the extent of the poor coverage behaviour of the Gaussian approach, and we offer several simple approximation ...
Simple and Effective Confidence Intervals for Proportions and Differences of Proportions Result from Adding Two Successes and Two Failures
Abstract

Cited by 66 (4 self)
The standard confidence intervals for proportions and their differences used in introductory statistics courses have poor performance, the actual coverage probability often being much lower than intended. However, simple adjustments of these intervals based on adding four pseudo observations, half of each type, perform surprisingly well even for small samples. To illustrate, for a broad variety of parameter settings with 10 observations in each sample, a nominal 95% interval for the difference of proportions has actual coverage probability below .93 in 88% of the cases with the standard interval but in only 1% with the adjusted interval; the mean distance between the nominal and actual coverage probabilities is .06 for the standard interval but .01 for the adjusted one. In teaching with these adjusted intervals, one can bypass awkward sample size guidelines and use the same formulas with small and large samples.
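The pseudo-observation recipe the abstract describes is simple enough to state in a few lines: for a single proportion, add two successes and two failures; for a difference of proportions, add one success and one failure to each sample (four pseudo-observations in total, half of each type). A sketch (function names are mine):

```python
import math

def adjusted_interval(x, n, z=1.96):
    """Single-proportion interval after adding two successes and two failures."""
    p = (x + 2) / (n + 4)
    half = z * math.sqrt(p * (1 - p) / (n + 4))
    return max(0.0, p - half), min(1.0, p + half)

def adjusted_diff_interval(x1, n1, x2, n2, z=1.96):
    """Interval for p1 - p2 after adding one success and one failure per sample."""
    p1 = (x1 + 1) / (n1 + 2)
    p2 = (x2 + 1) / (n2 + 2)
    half = z * math.sqrt(p1 * (1 - p1) / (n1 + 2) + p2 * (1 - p2) / (n2 + 2))
    return (p1 - p2) - half, (p1 - p2) + half

print(adjusted_interval(0, 10))           # standard Wald interval here degenerates to (0, 0)
print(adjusted_diff_interval(8, 10, 2, 10))
```

The x = 0 case shows the practical appeal: the standard interval collapses to a single point, while the adjusted interval remains informative with no special-case sample-size rules.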