Financial incentives and the “performance of crowds”
 Proc. HCOMP ’09
Cited by 178 (1 self)
The relationship between financial incentives and performance, long of interest to social scientists, has gained new relevance with the advent of web-based “crowdsourcing” models of production. Here we investigate the effect of compensation on performance in the context of two experiments, conducted on Amazon’s Mechanical Turk (AMT). We find that increased financial incentives increase the quantity, but not the quality, of work performed by participants, where the difference appears to be due to an “anchoring” effect: workers who were paid more also perceived the value of their work to be greater, and thus were no more motivated than workers paid less. In contrast with compensation levels, we find the details of the compensation scheme do matter: specifically, a “quota” system results in better work for less pay than an equivalent “piece rate” system. Although counterintuitive, these findings are consistent with previous laboratory studies, and may have real-world analogs as well.
Stochastic Variational Inference
 JOURNAL OF MACHINE LEARNING RESEARCH (2013, IN PRESS)
, 2013
Cited by 100 (23 self)
We develop stochastic variational inference, a scalable algorithm for approximating posterior distributions. We develop this technique for a large class of probabilistic models and we demonstrate it with two probabilistic topic models, latent Dirichlet allocation and the hierarchical Dirichlet process topic model. Using stochastic variational inference, we analyze several large collections of documents: 300K articles from Nature, 1.8M articles from The New York Times, and 3.8M articles from Wikipedia. Stochastic inference can easily handle data sets of this size and outperforms traditional variational inference, which can only handle a smaller subset. (We also show that the Bayesian nonparametric topic model outperforms its parametric counterpart.) Stochastic variational inference lets us apply complex Bayesian models to massive data sets.
Standardizing the world income inequality database. Social Science Quarterly, 90(2): 231–242. SWIID Version 3.1, December 2011. Retrieved April 21, 2012, from http://www.siuc.edu/~fsolt/swiid/swiid.html
 American Economic Review
, 2009
Cited by 80 (1 self)
Objective. Cross-national research on the causes and consequences of income inequality has been hindered by the limitations of existing inequality datasets: greater coverage across countries and over time is available from these sources only at the cost of significantly reduced comparability across observations. The goal of the Standardized World Income Inequality Database (SWIID) is to overcome these limitations. Methods. A custom missing-data algorithm was used to standardize the United Nations University’s World Income Inequality Database; data collected by the Luxembourg Income Study served as the standard. Results. The SWIID provides comparable Gini indices of gross and net income inequality for 153 countries for as many years as possible from 1960 to the present along with estimates of uncertainty in these statistics. Conclusions. By maximizing comparability for the largest possible sample of countries and years, the SWIID is better suited to broadly cross-national research on income inequality than previously available sources. ∗For helpful comments, I am grateful to Stephen Bloom, Mariola Espinosa, and the anonymous reviewers. The SWIID data, along with replication materials, are available at my website:
A Bayesian hierarchical topic model for political texts: Measuring expressed agendas in senate press releases
 In Proceedings of the First Workshop on Social Media Analytics, SOMA ’10
Cited by 53 (4 self)
Political scientists lack methods to efficiently measure the priorities political actors emphasize in statements. To address this limitation, I introduce a statistical model that attends to the structure of political rhetoric when measuring expressed priorities: statements are naturally organized by author. The expressed agenda model exploits this structure to simultaneously estimate the topics in the texts, as well as the attention political actors allocate to the estimated topics. I apply the method to a collection of over 64,000 press releases from senators from 2005–2007, which I demonstrate is an ideal medium to measure how senators explain their work in Washington to constituents. A set of examples validates the estimated priorities and demonstrates that the additional information included in the model provides better classification than expert human coders or statistical models for clustering that ignore the author of a document. The statistical model and its extensions will be made available in a forthcoming free software package for the R computing language and the press release data will be made available for download. ∗PhD Candidate, Harvard University Department of Government. I thank the Center for American Political Studies
Parsing costs as predictors of reading difficulty: An evaluation using the Potsdam Sentence Corpus
 Journal of Eye Movement Research
, 2008
Cited by 53 (13 self)
The surprisal of a word on a probabilistic grammar constitutes a promising complexity metric for human sentence comprehension difficulty. Using two different grammar types, surprisal is shown to have an effect on fixation durations and regression probabilities in a sample of German readers’ eye movements, the Potsdam Sentence Corpus. A linear mixed-effects model was used to quantify the effect of surprisal while taking into account unigram frequency and bigram frequency (transitional probability), word length, and empirically derived word predictability; the so-called “early” and “late” measures of processing difficulty both showed an effect of surprisal. Surprisal is also shown to have a small but statistically nonsignificant effect on empirically derived predictability itself. This work thus demonstrates the importance of including parsing costs as a predictor of comprehension difficulty in models of reading, and suggests that a simple identification of syntactic parsing costs with early measures and late measures with durations of post-syntactic events may be difficult to uphold.
Struggles with Survey Weighting and Regression Modeling
 Statistical Science
, 2007
Cited by 53 (3 self)
The general principles of Bayesian data analysis imply that models for survey responses should be constructed conditional on all variables that affect the probability of inclusion and nonresponse, which are also the variables used in survey weighting and clustering. However, such models can quickly become very complicated, with potentially thousands of poststratification cells. It is then a challenge to develop general families of multilevel probability models that yield reasonable Bayesian inferences. We discuss in the context of several ongoing public health and social surveys. This work is currently open-ended, and we conclude with thoughts on how research could proceed to solve these problems. Key words and phrases: Multilevel modeling, poststratification, sam…
Hierarchical Bayesian Domain Adaptation
Cited by 50 (2 self)
Multitask learning is the problem of maximizing the performance of a system across a number of related tasks. When applied to multiple domains for the same task, it is similar to domain adaptation, but symmetric, rather than limited to improving performance on a target domain. We present a more principled, better performing model for this problem, based on the use of a hierarchical Bayesian prior. Each domain has its own domain-specific parameter for each feature but, rather than a constant prior over these parameters, the model instead links them via a hierarchical Bayesian global prior. This prior encourages the features to have similar weights across domains, unless there is good evidence to the contrary. We show that the method of (Daumé III, 2007), which was presented as a simple “preprocessing step,” is actually equivalent, except our representation explicitly separates hyperparameters which were tied in his work. We demonstrate that allowing different values for these hyperparameters significantly improves performance over both a strong baseline and (Daumé III, 2007) within both a conditional random field sequence model for named entity recognition and a discriminatively trained dependency parser.
Modeling Human Performance in Statistical Word Segmentation
Cited by 47 (16 self)
What mechanisms support the ability of human infants, adults, and other primates to identify words from fluent speech using distributional regularities? In order to better characterize this ability, we collected data from adults in an artificial language segmentation task similar to Saffran, Newport, and Aslin (1996) in which the length of sentences was systematically varied between groups of participants. We then compared the fit of a variety of computational models, including simple statistical models of transitional probability and mutual information, a clustering model based on mutual information by Swingley (2005), PARSER (Perruchet & Vintner, 1998), and a Bayesian model. We found that while all models were able to successfully complete the task, fit to the human data varied considerably, with the Bayesian model achieving the highest correlation with our results.
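As a rough illustration of the simplest model family named in this abstract, the sketch below computes forward transitional probabilities, TP(x → y) = count(xy) / count(x), over a syllable stream. The function name and the toy syllables are hypothetical and are not taken from the paper, which may define or apply the statistic differently.

```python
from collections import Counter


def transitional_probabilities(syllables):
    """Forward transitional probability for each adjacent syllable pair.

    TP(x -> y) = count of the pair (x, y) / count of x as a pair's first element.
    """
    pair_counts = Counter(zip(syllables, syllables[1:]))
    # The final syllable never starts a pair, so it is excluded here.
    first_counts = Counter(syllables[:-1])
    return {(a, b): c / first_counts[a] for (a, b), c in pair_counts.items()}


# Toy stream built from three made-up "words": pabiku, tibudo, golatu.
stream = "pa bi ku ti bu do pa bi ku go la tu ti bu do".split()
tp = transitional_probabilities(stream)
print(tp[("pa", "bi")])  # within-word transition: 1.0
print(tp[("ku", "ti")])  # across a word boundary: 0.5
```

In this toy stream, within-word transitions have higher probability than transitions across word boundaries, which is the regularity that segmentation models of this kind exploit.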
Repeatability for Gaussian and non-Gaussian data: a practical guide for biologists. Biol Rev Camb Philos Soc 85:935–956
Cited by 44 (3 self)
Repeatability (more precisely the common measure of repeatability, the intra-class correlation coefficient, ICC) is an important index for quantifying the accuracy of measurements and the constancy of phenotypes. It is the proportion of phenotypic variation that can be attributed to between-subject (or between-group) variation. As a consequence, the non-repeatable fraction of phenotypic variation is the sum of measurement error and phenotypic flexibility. There are several ways to estimate repeatability for Gaussian data, but there are no formal agreements on how repeatability should be calculated for non-Gaussian data (e.g. binary, proportion and count data). In addition to point estimates, appropriate uncertainty estimates (standard errors and confidence intervals) and statistical significance for repeatability estimates are required regardless of the types of data. We review the methods for calculating repeatability and the associated statistics for Gaussian and non-Gaussian data. For Gaussian data, we present three common approaches for estimating repeatability: correlation-based, analysis of variance (ANOVA)-based and linear mixed-effects model (LMM)-based methods, while for non-Gaussian data, we focus on generalised linear mixed-effects models (GLMM) that allow the estimation of repeatability on the original and on the underlying latent scale. We also address a number of methods for calculating standard errors, confidence intervals and statistical significance; the most accurate and recommended methods are parametric bootstrapping, randomisation tests and Bayesian approaches. We advocate the use of LMM
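The ANOVA-based approach mentioned in this abstract can be sketched with the standard one-way estimator, R = s²_A / (s²_A + s²), where s² is the within-subject mean square and s²_A = (MS_among − MS_within) / n. This is a minimal sketch that assumes a balanced design (the same number n of measurements per subject); the function name is hypothetical, and the formula is the textbook estimator rather than code from the paper.

```python
from statistics import mean


def anova_repeatability(groups):
    """ANOVA-based repeatability (ICC) for a balanced one-way design.

    groups: one list of repeated measurements per subject.
    Assumes at least 2 subjects, each with the same n >= 2 measurements.
    """
    k = len(groups)
    n = len(groups[0]) if groups else 0
    if k < 2 or n < 2 or any(len(g) != n for g in groups):
        raise ValueError("need >= 2 subjects, each with the same n >= 2 values")

    grand_mean = mean(x for g in groups for x in g)
    group_means = [mean(g) for g in groups]

    # Sums of squares among subjects and within subjects.
    ss_among = n * sum((m - grand_mean) ** 2 for m in group_means)
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)

    ms_among = ss_among / (k - 1)
    ms_within = ss_within / (k * (n - 1))

    # Between-subject variance component; can be negative in small samples.
    var_among = (ms_among - ms_within) / n
    return var_among / (var_among + ms_within)


# Three subjects measured twice each; values are made up for illustration.
print(anova_repeatability([[1, 2], [3, 4], [5, 6]]))
```

The sketch returns only a point estimate and raises `ZeroDivisionError` if all measurements are identical; the uncertainty estimates the abstract calls for (for example, parametric bootstrapping) are not covered here.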
Yes, But What’s the Mechanism? (Don’t Expect an Easy Answer)
Cited by 44 (0 self)
Psychologists increasingly recommend experimental analysis of mediation. This is a step in the right direction because mediation analyses based on nonexperimental data are likely to be biased and because experiments, in principle, provide a sound basis for causal inference. But even experiments cannot overcome certain threats to inference that arise chiefly or exclusively in the context of mediation analysis—threats that have received little attention in psychology. The authors describe 3 of these threats and suggest ways to improve the exposition and design of mediation tests. Their conclusion is that inference about mediators is far more difficult than previous research suggests and is best tackled by an experimental research program that is specifically designed to address the challenges of mediation analysis.