
Missing Data

To deal with missing values in some of your variables:

  1. You may generate multiply imputed datasets using Amelia (or other programs).
  2. You may omit missing values. Zelig models automatically apply list-wise deletion, so no action is required to run a model. To obtain the total number of observations or produce other summary statistics using the analytic dataset, you may manually omit incomplete observations. To do so, first create a data frame containing only the variables in your analysis. For example:
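    > new.data <- data.frame(dep.var = data$dep.var, var1 = data$var1, var2 = data$var2, var3 = data$var3)
    
    The data.frame() command combines the named variables, as columns, into a new data frame. (The related commands cbind() and rbind() ``column bind'' and ``row bind'' their arguments; given plain vectors they return a matrix, and they return a data frame only when at least one argument is itself a data frame.) To omit missing values from this new data frame: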
    > new.data <- na.omit(new.data)
    
    If you perform na.omit() on the full data frame, you risk deleting observations that are fully observed in your experimental variables but have missing values in other variables. A new data frame containing only your experimental variables retains at least as many observations after na.omit(), and usually more.
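
The difference between the two approaches can be seen in a small sketch. The data frame below is made up for illustration: the variable names follow the example above, and the column other is a hypothetical variable that is not part of the analysis. The sketch selects the analysis variables by name, which is one of several equivalent ways to build the smaller data frame.

```r
# Made-up data for illustration; 'other' is a hypothetical variable
# that is not used in the analysis but has several missing values.
data <- data.frame(
  dep.var = c(1.2, 3.4, NA, 2.2, 5.1, 4.0),
  var1    = c(10, 12, 11, NA, 15, 14),
  var2    = c(0, 1, 1, 0, 1, 0),
  var3    = c(2.5, 2.7, 3.1, 2.9, 3.3, 3.0),
  other   = c(NA, 7, NA, 8, NA, 9)
)

# Listwise deletion on the full data frame also drops rows
# that are missing only in 'other'.
full.omit <- na.omit(data)

# Listwise deletion restricted to the analysis variables.
analysis.vars <- c("dep.var", "var1", "var2", "var3")
new.data <- na.omit(data[, analysis.vars])

nrow(full.omit)    # observations complete in every column
nrow(new.data)     # observations complete in the analysis variables
summary(new.data)  # summary statistics for the analytic dataset
```

Here nrow(new.data) gives the number of observations in the analytic dataset, and summary() can be applied to it directly.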



Gary King 2011-11-29