

Supported Models

We list here all models implemented in Zelig, organized by the nature of the dependent variable(s) to be predicted, explained, or described.

  1. Continuous Unbounded dependent variables can take any real value in the range $(-\infty, \infty)$. While most of these models take a single continuous dependent variable, Bayesian factor analysis takes multiple continuous dependent variables.

    1. "ls": The linear least-squares model (see Section [*]) calculates the coefficients that minimize the sum of squared residuals. This is the usual method of computing linear regression coefficients, and returns unbiased estimates of $\beta$ and $\sigma^2$ (conditional on the specified model).

    2. "normal": The Normal model (see Section [*]) computes the maximum-likelihood estimator for a Normal stochastic component and linear systematic component. The coefficient estimates are identical to those from ls, but the maximum-likelihood estimator of $\sigma^2$ is consistent but biased in finite samples.
    3. "normal.bayes": The Bayesian Normal regression model (Section [*]) is similar to maximum likelihood Gaussian regression, but makes valid small sample inferences via draws from the exact posterior and also allows for priors.

    4. "netls": The network least squares regression (Section [*]) is similar to least squares regression for continuous-valued proximity matrix dependent variables. Proximity matrices are also known as sociomatrices, adjacency matrices, and matrix representations of directed graphs.

    5. "tobit": The tobit regression model (see Section [*]) models a Normally distributed dependent variable with left-censored observations.

    6. "tobit.bayes": The Bayesian tobit regression model (see Section [*]) models a Normally distributed dependent variable with left- and/or right-censored observations.

    7. "arima": Use auto-regressive, integrated, moving-average (ARIMA) models for time series data (see Section [*]).

    8. "factor.bayes": The Bayesian factor analysis model (see Section [*]) estimates multiple observed continuous dependent variables as a function of latent explanatory variables.
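
The contrast between the ls and normal entries above comes down to the divisor used for the residual variance. The following is a minimal sketch in plain Python, not Zelig code: the function names and the data values are made up for illustration, and it assumes a single explanatory variable plus an intercept. Both estimators share the same coefficients and the same residual sum of squares; the unbiased estimator divides by n - k, while the maximum-likelihood estimator divides by n.

```python
# Illustrative sketch only; names and data are hypothetical, not part of Zelig.

def fit_line(x, y):
    """Return (intercept, slope, rss) for simple least-squares regression."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    return intercept, slope, rss


def sigma2_estimates(x, y):
    """Return (unbiased, mle) estimates of the residual variance."""
    n = len(x)
    k = 2  # number of coefficients: intercept and slope
    _, _, rss = fit_line(x, y)
    return rss / (n - k), rss / n


x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
unbiased, mle = sigma2_estimates(x, y)
print("unbiased:", unbiased, "maximum likelihood:", mle)
```

The maximum-likelihood value is always the smaller of the two by the factor (n - k)/n, which approaches 1 as n grows; this is why the estimator is biased in small samples yet consistent.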

  2. Dichotomous dependent variables consist of two discrete values, usually $\{0, 1\}$.
    1. "logit": Logistic regression (see Section [*]) specifies $\Pr(Y = 1)$ to be a(n inverse) logistic transformation of a linear function of a set of explanatory variables.
    2. "relogit": The rare events logistic regression option (see Section [*]) estimates the same model as the logit, but corrects for bias due to rare events (when one of the outcomes is much more prevalent than the other). It also optionally uses prior correction to correct for choice-based (case-control) sampling designs.
    3. "logit.bayes": Bayesian logistic regression (see Section [*]) is similar to maximum likelihood logistic regression, but makes valid small sample inferences via draws from the exact posterior and also allows for priors.
    4. "probit": Probit regression (see Section [*]) specifies $\Pr(Y = 1)$ to be a(n inverse) Normal CDF transformation of a linear function of a set of explanatory variables.
    5. "probit.bayes": Bayesian probit regression (see Section [*]) is similar to maximum likelihood probit regression, but makes valid small sample inferences via draws from the exact posterior and also allows for priors.

    6. "netlogit": The network logistic regression (Section [*]) is similar to logistic regression for binary-valued proximity matrix dependent variables. Proximity matrices are also known as sociomatrices, adjacency matrices, and matrix representations of directed graphs.

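    7. "blogit": The bivariate logistic model (see Section [*]) models $\Pr(Y_{i1} = y_1, Y_{i2} = y_2)$ for $(y_1, y_2) \in \{(0,0), (0,1), (1,0), (1,1)\}$ according to a bivariate logistic density.
    8. "bprobit": The bivariate probit model (see Section [*]) models $\Pr(Y_{i1} = y_1, Y_{i2} = y_2)$ for $(y_1, y_2) \in \{(0,0), (0,1), (1,0), (1,1)\}$ according to a bivariate normal density.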
    9. "irt1d": The one-dimensional item response model (see Section [*]) takes multiple dichotomous dependent variables and models them as a function of one latent (unobserved) explanatory variable.
    10. "irtkd": The k-dimensional item response model (see Section [*]) takes multiple dichotomous dependent variables and models them as a function of $k$ latent (unobserved) explanatory variables.
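
The logit and probit entries above differ only in the function that maps a linear predictor to a probability. As a rough illustration in plain Python (not Zelig code; the function names, coefficients, and covariate values are hypothetical), the two transformations can be written as:

```python
import math

# Illustrative sketch only; names and numbers are hypothetical, not part of Zelig.

def linear_predictor(beta, x):
    """Linear function of the explanatory variables: sum of beta_j * x_j."""
    return sum(b * xi for b, xi in zip(beta, x))


def inv_logit(eta):
    """Logistic CDF: the probability of Y = 1 under a logit specification."""
    return 1.0 / (1.0 + math.exp(-eta))


def normal_cdf(eta):
    """Standard Normal CDF: the probability of Y = 1 under a probit specification."""
    return 0.5 * (1.0 + math.erf(eta / math.sqrt(2.0)))


beta = [-1.0, 0.5]   # hypothetical intercept and slope
x = [1.0, 3.0]       # leading 1.0 multiplies the intercept
eta = linear_predictor(beta, x)
print("logit:", inv_logit(eta), "probit:", normal_cdf(eta))
```

Both functions are increasing in the linear predictor and return values strictly between 0 and 1, so the fitted coefficients have the same sign interpretation in either model even though their scales differ.
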
  3. Ordinal models are used for ordered, discrete dependent variables. The values of the outcome variable (such as kill, punch, tap, bump) are ordered, but the distance between any two successive categories is not known exactly. Each dependent variable may be thought of as linear, with one continuous, unobserved dependent variable observed through a mechanism that only returns the ordinal choice.
    1. "ologit": The ordinal logistic model (see Section [*]) specifies the stochastic component of the unobserved variable to be a standard logistic distribution.
    2. "oprobit": The ordinal probit model (see Section [*]) specifies the stochastic component of the unobserved variable to be standard Normal.
    3. "oprobit.bayes": Bayesian ordinal probit model (see Section [*]) is similar to ordinal probit regression, but makes valid small sample inferences via draws from the exact posterior and also allows for priors.
    4. "factor.ord": Bayesian ordered factor analysis (see Section [*]) models observed, ordinal dependent variables as a function of latent explanatory variables.
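
The observation mechanism described for ordinal models can be sketched with a set of cut points that slice the unobserved continuous variable into categories. The code below is plain Python rather than Zelig code; the function names, cut points, and the convention that a value exactly on a cut point falls in the higher category are all assumptions made for illustration. The second function gives category probabilities under a standard logistic stochastic component, as in the ordinal logistic model.

```python
import bisect
import math

# Illustrative sketch only; names, cut points, and conventions are hypothetical.

def observed_category(y_star, cutpoints):
    """Map a latent value to a category index 0..len(cutpoints).

    cutpoints must be sorted in increasing order.
    """
    return bisect.bisect_right(cutpoints, y_star)


def ologit_probs(eta, cutpoints):
    """Category probabilities when the latent variable is eta plus
    standard logistic noise."""
    cdf = [1.0 / (1.0 + math.exp(-(tau - eta))) for tau in cutpoints]
    bounds = [0.0] + cdf + [1.0]
    return [bounds[j + 1] - bounds[j] for j in range(len(bounds) - 1)]


cutpoints = [-1.0, 0.0, 1.0]   # three cut points give four ordered categories
print(observed_category(0.5, cutpoints))
print(ologit_probs(0.0, cutpoints))
```

Replacing the logistic CDF with the standard Normal CDF in the second function would give the corresponding ordinal probit probabilities.
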
  4. Multinomial dependent variables are unordered, discrete categorical responses. For example, you could model an individual's choice among brands of orange juice or among candidates in an election.
    1. "mlogit": The multinomial logistic model (see Section [*]) specifies categorical responses distributed according to the multinomial stochastic component and logistic systematic component.
    2. "mlogit.bayes": Bayesian multinomial logistic regression (see Section [*]) is similar to maximum likelihood multinomial logistic regression, but makes valid small sample inferences via draws from the exact posterior and also allows for priors.
  5. Count dependent variables are non-negative integer values, such as the number of presidential vetoes or the number of photons that hit a detector.
    1. "poisson": The Poisson model (see Section [*]) specifies the expected number of events that occur in a given observation period to be an exponential function of the explanatory variables. The Poisson stochastic component has the property that $\mathrm{E}(Y \mid X) = \mathrm{V}(Y \mid X)$.
    2. "poisson.bayes": Bayesian Poisson regression (see Section [*]) is similar to maximum likelihood Poisson regression, but makes valid small sample inferences via draws from the exact posterior and also allows for priors.
    3. "negbin": The negative binomial model (see Section [*]) has the same systematic component as the Poisson, but allows event counts to be over-dispersed, such that $\mathrm{V}(Y \mid X) > \mathrm{E}(Y \mid X)$.
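
The mean-variance relationships that separate the Poisson and negative binomial models can be checked numerically. The sketch below is plain Python, not Zelig code. The function names are hypothetical, and the negative binomial is written in a mean/dispersion form (mean mu, dispersion theta, variance mu + mu^2/theta), which is one common parameterization and an assumption here rather than something stated in this section.

```python
import math

# Illustrative sketch only; names and the parameterization are assumptions.

def poisson_moments(lam, kmax=200):
    """Mean and variance of a Poisson(lam) count, by summing the pmf."""
    mean = 0.0
    second = 0.0
    for k in range(kmax + 1):
        p = math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))
        mean += k * p
        second += k * k * p
    return mean, second - mean ** 2


def negbin_moments(mu, theta, kmax=2000):
    """Mean and variance of a negative binomial count with mean mu and
    dispersion theta, by summing the pmf."""
    log_q = math.log(theta / (theta + mu))
    log_r = math.log(mu / (theta + mu))
    mean = 0.0
    second = 0.0
    for k in range(kmax + 1):
        log_p = (math.lgamma(k + theta) - math.lgamma(k + 1)
                 - math.lgamma(theta) + theta * log_q + k * log_r)
        p = math.exp(log_p)
        mean += k * p
        second += k * k * p
    return mean, second - mean ** 2


print(poisson_moments(3.0))       # mean and variance coincide
print(negbin_moments(3.0, 2.0))   # variance exceeds the mean
```

With the same mean, the negative binomial variance grows as theta shrinks, and the distribution approaches the Poisson as theta becomes large.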

  6. Continuous Bounded dependent variables are continuous only over a certain range, usually $(0, \infty)$. In addition, some models (exponential, lognormal, and Weibull) allow censoring for values greater than some censoring point, such that the dependent variable has some units fully observed and others that are only partially observed (censored).

    1. "gamma": The Gamma model (see Section [*]) is for positive-valued, continuous dependent variables that are fully observed (no censoring).

    2. "exp": The exponential model (see Section [*]) for right-censored dependent variables assumes that the hazard function is constant over time. For some variables, this may be an unrealistic assumption as subjects are more or less likely to fail the longer they have been exposed to the explanatory variables.

    3. "weibull": The Weibull model (see Section [*]) for right-censored dependent variables relaxes the assumption of constant hazard by including an additional scale parameter $\alpha$: if $\alpha > 1$, the risk of failure increases the longer the subject has survived; if $\alpha < 1$, the risk of failure decreases the longer the subject has survived. While zelig() estimates $\alpha$ by default, you may optionally fix $\alpha$ at any value greater than 0. Fixing $\alpha = 1$ results in an exponential model.

    4. "lognorm": The log-normal model (see Section [*]) for right-censored duration dependent variables specifies the hazard function non-monotonically, with increasing hazard over part of the observation period and decreasing hazard over another.
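
The relationship between the exponential and Weibull hazards can be made concrete with a small numerical sketch. This is plain Python, not Zelig code. It assumes the textbook Weibull hazard h(t) = (alpha / lam) * (t / lam)^(alpha - 1), where alpha is the additional Weibull parameter and lam is a hypothetical positive time-scale constant; the parameterization used internally by Zelig may differ.

```python
# Illustrative sketch only; the function name and parameterization are assumptions.

def weibull_hazard(t, alpha, lam=1.0):
    """Hazard at time t > 0 for a Weibull with parameter alpha and time scale lam."""
    if t <= 0 or alpha <= 0 or lam <= 0:
        raise ValueError("t, alpha, and lam must all be positive")
    return (alpha / lam) * (t / lam) ** (alpha - 1.0)


times = [0.5, 1.0, 2.0]
for alpha in (0.5, 1.0, 2.0):
    hazards = [weibull_hazard(t, alpha) for t in times]
    print("alpha =", alpha, "hazards:", hazards)
```

With alpha equal to 1 the hazard does not depend on t, which is the exponential case; values above 1 give a hazard that rises with survival time, and values below 1 give one that falls.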

  7. Mixed dependent variables include models that take more than one dependent variable, where the dependent variables come from two or more of the categories above. (They do not need to be of a homogeneous type.)
    1. The Bayesian mixed factor analysis model, in contrast to the Bayesian factor analysis model and the ordinal factor analysis model, can model both continuous and ordinal dependent variables as a function of latent explanatory variables.

  8. Ecological inference models estimate unobserved internal cell values given contingency tables with observed row and column marginals.
    1. "ei.hier": The hierarchical EI model (see Section [*]) produces estimates for a cross-section of $2 \times 2$ tables.
    2. "ei.dynamic": Quinn's dynamic Bayesian EI model (see Section [*]) estimates a dynamic Bayesian model for $2 \times 2$ tables with temporal dependence across tables.
    3. "ei.RxC": The $R \times C$ EI model (see Section [*]) estimates a hierarchical Multinomial-Dirichlet EI model for contingency tables with more than 2 rows or columns.



Gary King 2011-11-29