## The VGAM Package for Categorical Data Analysis

### BibTeX

@MISC{Yee_thevgam,
    author = {Thomas W. Yee},
    title = {The VGAM Package for Categorical Data Analysis},
    year = {}
}

### Abstract

Classical categorical regression models such as the multinomial logit and proportional odds models are shown to be readily handled by the vector generalized linear and additive model (VGLM/VGAM) framework. Additionally, there are natural extensions, such as reduced-rank VGLMs for dimension reduction, and allowing covariates that have values specific to each linear/additive predictor, e.g., for consumer choice modeling. This article describes some of the framework behind the VGAM R package, its usage and implementation details.
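As a compact reminder of the two classical models named above, their linear predictors can be sketched as follows. The notation is assumed here for illustration and may differ from the article's own: a response $Y$ with $M+1$ levels, the last level taken as baseline, a covariate vector $\mathbf{x}$ whose first element is the intercept, and $\mathbf{x}_{-1}$ for $\mathbf{x}$ without that intercept.

$$
\eta_j = \log\frac{P(Y = j \mid \mathbf{x})}{P(Y = M+1 \mid \mathbf{x})} = \boldsymbol{\beta}_j^{\top}\mathbf{x},
\qquad j = 1,\dots,M \qquad \text{(multinomial logit)}
$$

$$
\eta_j = \operatorname{logit} P(Y \le j \mid \mathbf{x}) = \beta_{(j)1} + \boldsymbol{\beta}^{*\top}\mathbf{x}_{-1},
\qquad j = 1,\dots,M \qquad \text{(proportional odds)}
$$

In both cases each observation carries $M$ linear predictors rather than one, which is what makes a vector-valued framework a natural fit. The proportional odds model is the special case in which the slope vector $\boldsymbol{\beta}^{*}$ is constrained to be the same for every $j$, leaving only the intercepts $\beta_{(j)1}$ free to vary.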

### Citations

1582 | Generalized linear models - McCullagh, Nelder - 1989 |

1410 | An Introduction to Categorical Data Analysis - Agresti - 2007 |

325 | R: A Language and Environment for Statistical Computing - R Development Core Team - 2009 |

298 | Multivariate Statistical Modeling based on Generalized Linear Models - Fahrmeir, Tutz - 1994 |

103 | Bias reduction of maximum likelihood estimates - Firth - 1993 |

Citation Context: ... covariate-specific $\eta_j$. The VGAM package potentially offers a wide selection of models and utilities. There is much future work to do. Some useful additions to the package include: 1. Bias-reduction (Firth 1993) is a method for removing the $O(n^{-1})$ bias from a maximum likelihood estimate. For a substantial class of models including GLMs it can be formulated in terms of a minor adjustment of the score vecto...

75 | Mathematical and Statistical Methods for Genetic Analysis - Lange - 1997 |

75 | Modern Applied Statistics with S - Venables, Ripley - 2002 |

34 | Regression Models for Categorical and Limited Dependent Variables. Sage: Thousand Oaks - Long |

29 | Partial proportional odds models for ordinal response variables - Peterson, Harrell - 1990 |

28 | Generalized Linear Models - Nelder, Wedderburn |

26 | Tibshirani R - Hastie - 1990 |

25 | Genetic Data Analysis II: Methods for Discrete Population Genetic Data. Sinauer Associates - Weir - 1996 |

23 | Categorical Data Analysis Using the SAS System - Stokes, Davis, et al. |

Citation Context: ...h bewilders the non-expert user; there is little coherent overriding structure. Its proc logistic handles the multinomial logit and proportional odds models, as well as exact logistic regression (see Stokes et al. 2000, which is for Version 8 of SAS). The fact that the proportional odds model may be fitted by proc logistic, proc genmod and proc probit arguably leads to possible confusion rather than the making of c...

21 | Toward A Common Framework for Statistical Analysis and Development - Imai, King, et al. - 2008 |

19 | Tibshirani RJ - TJ - 1990 |

18 | The Analysis of Ordered Categorical Data: An Overview and a Survey of Recent Developments. Sociedad de Estadistica e Investigacion Operativa - Liu, Agresti |

15 | The Strucplot Framework: Visualizing Multi-way Contingency Tables with vcd - Meyer, Zeileis, et al. - 2006 |

Citation Context: ...gresti (2005), and a manual for fitting common models found in Agresti (2002) to polytomous responses with various software is Thompson (2009). A package for visualizing categorical data in R is vcd (Meyer et al. 2006, 2009). 2. VGLM/VGAM overview. This section summarizes the VGLM/VGAM framework with a particular emphasis toward categorical models since the classes encapsulate many multivariate response models in,...

12 | gam: Generalized Additive Models. R package version 1.0 - Hastie - 2008 |

Citation Context: ... knot selection for vector spline follows the same idea as O-splines (see Wand and Ormerod 2008) in order to lower the computational cost. The usage of vgam() with smoothing is very similar to gam() (Hastie 2008), e.g., to fit a nonparametric proportional odds model (cf. p.179 of McCullagh and Nelder 1989) to the pneumoconiosis data one could try R> pneumo <- transform(pneumo, let = log(exposure.time)) R> fi...

11 | Multivariate statistical modeling based on generalized linear models - Fahrmeir, Tutz - 2001 |

11 | Bias reduction in exponential family nonlinear models - Kosmidis, Firth - 2009 |

Citation Context: ...he $O(n^{-1})$ bias from a maximum likelihood estimate. For a substantial class of models including GLMs it can be formulated in terms of a minor adjustment of the score vector within an IRLS algorithm (Kosmidis and Firth 2009). One by-product, for logistic regression, is that while the maximum likelihood estimate (MLE) can be infinite, the adjustment leads to estimates that are always finite. At present the R package brgl...

8 | A Simple Model of - Firth, Lomas, et al. - 2010 |

Citation Context: ... ≡ 1. Like brat(), one can choose a different reference group and reference value. Other R packages for the Bradley-Terry model include BradleyTerry2 by H. Turner and D. Firth (with and without ties; Firth 2005, 2008) and prefmod (Hatzinger 2009). 4.4. Genetic models. There are quite a number of population genetic models based on the multinomial distribution, e.g., Weir (1996), Lange (2002). Table 3 lists so...

7 | Vcd: visualizing categorical data. R package version 1.09 - Meyer, Zeileis, et al. - 2008 |

6 | Generalized Nonlinear Models in R: An Overview of the gnm Package. R package version - Turner, Firth - 2011 |

5 | brglm: Bias reduction in binary-response GLMs. R package version 0.5-6, URL http://www.ucl.ac.uk/~ucakiko/software.html - Kosmidis - 2007 |

Citation Context: ...One by-product, for logistic regression, is that while the maximum likelihood estimate (MLE) can be infinite, the adjustment leads to estimates that are always finite. At present the R package brglm (Kosmidis 2008) implements bias-reduction for a number of models. Bias-reduction might be implemented by adding an argument bred = FALSE, say, to some existing VGAM family functions. 2. Nested logit models were deve...

5 | A Course in Categorical Data Analysis - Leonard - 2000 |

4 | Iteratively reweighted least squares for maximum likelihood estimation, and some robust and resistant alternatives - Green - 1984 |

4 | Zelig: Everyone’s Statistical Software. R package version 3.5, URL http://CRAN.R-project.org/package=Zelig - Imai, King, et al. - 2011 |

Citation Context: ...rily complicated. And xij can apply in theory to any VGLM and not just to the multinomial logit model. Imai et al. (2008) present another perspective on the xij problem with illustrations from Zelig (Imai et al. 2009). Using the xij argument. VGAM handles variables whose values depend on $\eta_j$, (22), using the xij argument. It is assigned an S formula or a list of S formulas. Each formula, which must...

4 | Statistical analysis of categorical data - Lloyd - 1999 |

3 | Letter to the editor: Ordinal regression models for epidemiologic data - Peterson - 1990 |

3 | The VGAM Package for Categorical Data Analysis - Yee - 2010 |

3 | Reduced-rank vector generalized linear models. Statistical Modelling - Yee, Hastie - 2003 |

2 | BradleyTerry: Bradley–Terry models. R package, Version 0.8–5 - Firth - 2005 |

2 | VGAM: Vector Generalized Linear and Additive Models. R package version 0.7-7, URL http://CRAN.R-project.org/package=VGAM - Yee - 2008 |

1 | Convergence Problems in Logistic Regression - Allison - 2004 |

Citation Context: ...nts tend to ±∞. With such data, all (to my knowledge) R implementations give warnings that are vague, if any at all, and this is rather unacceptable (Allison 2004). The safeBinaryRegression package (Konis 2009) overloads glm() so that a check for the existence of the MLE is made before fitting a binary response GLM. In closing, the VGAM package is continually ...

1 | Nineteen Ways of Looking at - Altman, Jackman - 2010 |

1 | Regression and Ordered Categorical Variables - Anderson - 1984 |

1 | rms: Regression Modeling Strategies. R package version 2.1-0, URL http://CRAN.R-project.org/package=rms - Harrell - 2009 |

1 | Tibshirani R, Buja A - Hastie - 1994 |

1 | prefmod: Utilities to Fit Paired Comparison Models for Preferences. R package version 0.8-16, URL http://CRAN.R-project.org/package=prefmod - Hatzinger - 2009 |

Citation Context: ...se a different reference group and reference value. Other R packages for the Bradley-Terry model include BradleyTerry2 by H. Turner and D. Firth (with and without ties; Firth 2005, 2008) and prefmod (Hatzinger 2009). 4.4. Genetic models. There are quite a number of population genetic models based on the multinomial distribution, e.g., Weir (1996), Lange (2002). Table 3 lists some VGAM family functions for such....

1 | safeBinaryRegression: Safe Binary Regression. R package version 0.1-2, URL http://CRAN.R-project.org/package=safeBinaryRegression - Konis - 2009 |

Citation Context: ... tend to ±∞. With such data, all (to my knowledge) R implementations give warnings that are vague, if any at all, and this is rather unacceptable (Allison 2004). The safeBinaryRegression package (Konis 2009) overloads glm() so that a check for the existence of the MLE is made before fitting a binary response GLM. In closing, the VGAM package is continually being developed, therefore some future changes ...

1 | gnlm: Generalized Nonlinear Regression Models. R package version 1.0, URL http://popgen.unimaas.nl/~jlindsey/rcode.html - Lindsey - 2007 |

Citation Context: ...ntinuation ratio model upon preprocessing). Neither polr() nor lrm() appears able to fit the nonproportional odds model. There are non-CRAN packages too, such as the modeling function nordr() (in gnlm; Lindsey 2007), which can fit the proportional odds, continuation ratio and adjacent categories models; however it calls nlm() and ...

1 | MJ, Cheng A, Vander Hoorn S, Milne A, McCulloch A - MacMahon, Norton, et al. - 1995 |

1 | Analyzing Categorical Data - Simonoff - 2003 |

1 | R (and S-PLUS) Manual to Accompany Agresti’s Categorical Data Analysis (2002), 2nd edition - Thompson - 2009 |

1 | gnm: A Package for Generalized Nonlinear Models - Turner, Firth - 2007 |

Citation Context: ...unger than females on average. 6.3. Stereotype model. We reproduce some of the analyses of Anderson (1984) regarding the progress of 101 patients with back pain using the data frame backPain from gnm (Turner and Firth 2007, 2009). The three prognostic variables are length of previous attack (x1 = 1, 2), pain change (x2 = 1, 2, 3) and lordosis (x3 = 1, 2). Like him, we treat these as numerical and standardize and negate...

1 | On Semiparametric Regression with O’Sullivan Penalized Splines. The Australian and New Zealand - Wand, Ormerod - 2008 |

1 | CJ, Yee TW - Wild - 1996 |

1 | VGLMs and VGAMs: An Overview for - Yee - 2010 |

1 | Vector Generalized Linear and Additive Extreme Value Models - Yee, Stephenson - 2007 |

1 | Association Models and Canonical Correlation in the Analysis of Crossclassifications Having Ordered Categories - Goodman - 1981 |