Results 1-10 of 24
Approximate Bayes Factors and Accounting for Model Uncertainty in Generalized Linear Models
, 1993
Cited by 96 (28 self)
Ways of obtaining approximate Bayes factors for generalized linear models are described, based on the Laplace method for integrals. I propose a new approximation which uses only the output of standard computer programs such as GLIM; this appears to be quite accurate. A reference set of proper priors is suggested, both to represent the situation where there is not much prior information, and to assess the sensitivity of the results to the prior distribution. The methods can be used when the dispersion parameter is unknown, when there is overdispersion, to compare link functions, and to compare error distributions and variance functions. The methods can be used to implement the Bayesian approach to accounting for model uncertainty. I describe an application to inference about relative risks in the presence of control factors where model uncertainty is large and important. Software to implement the
Bayesian Inference for Semiparametric Binary Regression
 Journal of the American Statistical Association
, 1996
Cited by 24 (2 self)
We propose a regression model for binary response data which places no structural restrictions on the link function except monotonicity and known location and scale. Predictors enter linearly. We demonstrate Bayesian inference calculations in this model. By modifying the Dirichlet process, we obtain a natural prior measure over this semiparametric model, and we use Polya sequence theory to formulate this measure in terms of a finite number of unobserved variables. A Markov chain Monte Carlo algorithm is designed for posterior simulation, and the methodology is applied to data on radiotherapy treatments for cancer.
From Planning to Mature: on the Determinants of Open Source Take Off
, 2005
Cited by 7 (0 self)
In this paper we use data from SourceForge.net, the largest open source projects repository, to estimate the main determinants of progress towards stable and mature software code. We find that the less restrictive the licensing terms, the larger the likelihood of reaching an advanced development status, and that this effect is even stronger for newer projects. We also find that projects geared towards system administrators appear to be the more successful ones. The determinants of a project's development stage change with the age of the project in many dimensions, i.e. licensing terms, software audience and contents, thus supporting the common perception of open source as a very dynamic phenomenon. The data seem to suggest that open source is evolving towards more commercial applications.
Parametric Links for Binary Choice Models: A Fisherian-Bayesian Colloquy
 Journal of Econometrics
, 2009
Cited by 3 (0 self)
The familiar logit and probit models provide convenient settings for many binary response applications, but a larger class of link functions may be occasionally desirable. Two parametric families of link functions are investigated: the Gosset link based on the Student t latent variable model with the degrees of freedom parameter controlling the tail behavior, and the Pregibon link based on the (generalized) Tukey λ family with two shape parameters controlling skewness and tail behavior. Both Bayesian and maximum likelihood methods for estimation and inference are explored, compared and contrasted. In applications, like the propensity score matching problem discussed in Section 4, where it is critical to have accurate estimates of the conditional probabilities, we find that misspecification of the link function can create serious bias. Bayesian point estimation via MCMC performs quite competitively with MLE methods; however, nominal coverage of Bayes credible regions is somewhat more problematic.
Fractional regression models for second stage DEA efficiency analyses
, 2010
Cited by 3 (2 self)
Data envelopment analysis (DEA) is commonly used to measure the relative efficiency of decision-making units. Often, in a second stage, a regression model is estimated to relate DEA efficiency scores to exogenous factors. In this paper, we argue that the traditional linear or tobit approaches to second-stage DEA analysis do not constitute a reasonable data-generating process for DEA scores. Under the assumption that DEA scores can be treated as descriptive measures of the relative performance of units in the sample, we show that using fractional regression models is the most natural way of modeling bounded, proportional response variables such as DEA scores. We also propose generalizations of these models and, given that DEA scores frequently take the value of unity, examine the use of two-part models in this framework. Several tests suitable for assessing the specification of each alternative model are also discussed.
Markups, Entry Regulation and Trade: Does Country Size Matter?
, 2001
Cited by 3 (0 self)
Actual and potential competition is a powerful source of discipline on the pricing behavior of firms with market power. A simple model is developed that shows that the effects of import competition and domestic entry regulation on industry price-cost markups depend on country size. Barriers to domestic entry are predicted to have stronger anticompetitive effects in large countries, whereas the impact of barriers to foreign entry (i.e., imports) should be stronger in small countries. Following estimation of markups for manufacturing sectors in 41 developed and developing countries, these hypotheses are tested and cannot be rejected by the data. For example, although Italy and Indonesia impose the same number of regulations on entry of new firms, their impact on manufacturing markups is 20 percent higher in Italy due to its larger size.
Noncanonical links in generalized linear models - when is the effort justified?
 Journal of Statistical Planning and Inference
, 2000
Cited by 2 (2 self)
Generalized linear models (GLMs) allow for a wide range of statistical models for regression data. In particular, the logistic model is usually applied for binomial observations. Canonical links for GLMs, such as the logit link in the binomial case, are often used because in this case minimal sufficient statistics for the regression parameter exist which allow for simple interpretation of the results. However, in some applications, the overall fit as measured by the p-values of goodness-of-fit statistics (such as the residual deviance) can be improved significantly by the use of a noncanonical link. In this case, the interpretation of the influence of the covariables is more complicated compared to GLMs with canonical link functions. It will be illustrated through simulation that the p-value associated with the common goodness-of-link tests is not appropriate to quantify the changes to mean response estimates and other quantities of interest when switching to a noncanonical link. In particular, the rate of misspecifications becomes considerably large when the inverse information value associated with the underlying parametric link model increases. This shows that the classical tests are often too sensitive, in particular, when the number of observations is large. The consideration of a generalized p-value function is proposed instead, which allows the exact quantification of a suitable distance to the canonical model at a controlled error rate. Corresponding tests for validating or discriminating the canonical model can easily be performed by means of this function. Finally, it is indicated how this method can be applied to the problem of overdispersion.
Pearson's Goodness of Fit Statistic as a Score Test Statistic
Cited by 2 (0 self)
For any generalized linear model, the Pearson goodness of fit statistic is the score test statistic for testing the current model against the saturated model. The relationship between the Pearson statistic and the residual deviance is therefore the relationship between the score test and the likelihood ratio test statistics, and this clarifies the role of the Pearson statistic in generalized linear models. The result is extended to cases in which there are multiple response observations for the same combination of explanatory variables.
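As a rough illustration of the two statistics this abstract compares, the sketch below computes the Pearson statistic and the residual deviance for a Poisson model. The Poisson family, the intercept-only fit, the function names, and the counts are all assumptions made for the example; none of them come from the paper.

```python
import math

def pearson_statistic(y, mu):
    """Pearson X^2 for Poisson counts: sum of (y - mu)^2 / mu."""
    return sum((yi - mi) ** 2 / mi for yi, mi in zip(y, mu))

def poisson_deviance(y, mu):
    """Residual deviance for Poisson counts:
    2 * sum(y * log(y / mu) - (y - mu)), taking y * log(y / mu) = 0 when y == 0."""
    total = 0.0
    for yi, mi in zip(y, mu):
        log_term = yi * math.log(yi / mi) if yi > 0 else 0.0
        total += log_term - (yi - mi)
    return 2.0 * total

# Made-up counts; the fitted means are those of an intercept-only model,
# whose maximum likelihood estimate is the sample mean.
y = [12, 9, 15, 8, 11]
mu = [sum(y) / len(y)] * len(y)

x2 = pearson_statistic(y, mu)
dev = poisson_deviance(y, mu)
print(f"Pearson X^2 = {x2:.4f}, deviance = {dev:.4f}")
```

When the fitted means are close to the observed counts, the two values are usually close to each other, which is in line with the score test and likelihood ratio test relationship described above.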
Link function selection in stochastic multicriteria decision making models
 European Journal of Operational Research
, 2006
Cited by 2 (1 self)
A stochastic formulation of the Analytic Hierarchy Process (AHP) using an approach based on Bayesian categorical data models has been developed. However, in categorical data models it is known that the selection of the link function may have an impact on the model estimates. In particular, the selection of the probit link implies an assumption that model error terms are normally distributed and this normality assumption is regularly utilized in other related methods such as the multiplicative AHP. We examine model performance with respect to the choice of two model link functions. With regard to point estimates, it is found that the logit formulation is better able to replicate the estimates obtained by the eigenvector decomposition associated with the original formulation of the AHP. By contrast, the probit link produces priorities which are consistently more moderate than those of the AHP. The results suggest that the logit formulation will be preferred by decision makers who wish to replicate the AHP priorities as closely as possible. The results also suggest that the unexamined use of the normality assumption in other stochastic AHP methods may have an impact on priority estimates and thus is worthy of further attention.
Robust Parameter Design with Uncontrolled Noise Variables
Cited by 1 (0 self)
In this paper, we consider situations where the noise variables can vary but are observable. (In the computer/communication network applications, for example, it is possible to measure the various load conditions and so adjust for the changes in the noise variables.) We propose a general data analysis strategy which is based on modeling the responses directly. Our approach involves treating the noise variables as covariates and modeling both the location parameters and the (regression) coefficients as functions of the design factors. This allows us to determine the interactions between the design factors and the observed noise variables and to exploit them to reduce the effect of this source of variability. Variability due to unobserved noise variables can be identified by analyzing the squared residuals from the fitted model. The approach presented here is also applicable to experimental situations with covariates, where one has to remove the effects of these nuisance variables before identifying the important location and dispersion effects. The paper is organized as follows. In Section 2, we use a real application on thermal design of cabinets for telecommunications switching equipment to motivate the problem and the issues. Section 3 develops the underlying concepts and models for a single observed noise variable. The proposed data analysis strategy is outlined in Section 4 and is illustrated by applying it to the thermal design experiment. Section 5 deals with several generalizations, including the case of multiple noise variables. The direct modeling of responses as a function of design factors and noise variables has also been considered by Welch et al. (1990), Shoemaker et al. (1991), and Lucas (1990). Our approach simplifies to the formulations discussed by these ...