Example 2: Creating dummy variables by hand

You may also use a loop to create a matrix of dummy variables to append to a data frame. For example, to generate fixed effects for each state, suppose that you have a data frame mydata that contains y, x1, x2, x3, and state, where state is a character variable with 50 unique values. There are three ways to create dummy variables: 1) with a built-in R command; 2) with two nested for loops; or 3) with a single loop.

  1. R will create dummy variables on the fly from a single variable with distinct values.
    > z.out <- zelig(y ~ x1 + x2 + x3 + as.factor(state), 
                     data = mydata, model = "ls")
    
    This method returns J - 1 indicators for J states (here, 49 indicators for the 50 states); the omitted state serves as the baseline category.
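
    To see which indicators a formula of this kind generates, you can inspect the design matrix directly. The following is a minimal sketch using base R's model.matrix() rather than zelig(), on a small hypothetical data frame with three states instead of 50; the values and the reduced set of covariates are assumptions made only for illustration.

    ```r
    # Hypothetical toy data standing in for mydata (3 states, not 50)
    mydata <- data.frame(
      y     = c(1.2, 0.7, 3.1, 2.4, 1.9, 2.8),
      x1    = c(0.5, 1.1, 0.3, 2.0, 1.4, 0.9),
      state = c("NY", "CA", "NY", "TX", "CA", "TX"),
      stringsAsFactors = FALSE
    )

    # Build the design matrix that the formula implies
    X <- model.matrix(y ~ x1 + as.factor(state), data = mydata)

    # Column names: the intercept, x1, and one indicator for each
    # state except the baseline level
    colnames(X)

    # Number of state indicators: one fewer than the number of states
    n.indicators <- ncol(X) - 2
    n.states     <- length(unique(mydata$state))
    ```

    Because the formula includes an intercept, R drops the indicator for the first factor level to avoid perfect collinearity.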

  2. Alternatively, you can use a loop to create dummy variables by hand. There are two ways to do this, but both start with the same initial commands. Using vector commands, first create an index for the states, and initialize a matrix to hold the dummy variables:
      
    idx <- sort(unique(mydata$state))
    dummy <- matrix(NA, nrow = nrow(mydata), ncol = length(idx))
    
    Now choose between the two methods.
    1. The first method is computationally inefficient, but more intuitive for users not accustomed to vector operations. The first loop uses i as an index to loop through all the rows, and the second loop uses j to loop through all 50 values in the vector idx, which correspond to columns 1 through 50 in the matrix dummy.
      for (i in 1:nrow(mydata)) {        # loop over rows
        for (j in 1:length(idx)) {       # loop over states
          # mydata$state is a vector, so index it by row only
          if (mydata$state[i] == idx[j]) {
            dummy[i,j] <- 1
          } else {
            dummy[i,j] <- 0
          }
        }
      }
      
      Then add the new matrix of dummy variables to your data frame:
      colnames(dummy) <- idx
      mydata <- cbind(mydata, dummy)
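
      The double-loop procedure can be tried end to end on a small example. This is a sketch on a hypothetical four-row data frame with three states (the data values are invented for illustration); it labels the columns with colnames(), since dummy is a matrix.

      ```r
      # Hypothetical toy data standing in for mydata
      mydata <- data.frame(y = c(1.2, 0.7, 3.1, 2.4),
                           state = c("NY", "CA", "NY", "TX"),
                           stringsAsFactors = FALSE)

      # Index of states and an empty matrix, one column per state
      idx <- sort(unique(mydata$state))
      dummy <- matrix(NA, nrow = nrow(mydata), ncol = length(idx))

      # Fill the matrix one cell at a time
      for (i in 1:nrow(mydata)) {
        for (j in 1:length(idx)) {
          if (mydata$state[i] == idx[j]) {
            dummy[i, j] <- 1
          } else {
            dummy[i, j] <- 0
          }
        }
      }

      # Label the columns with the state names, then append
      colnames(dummy) <- idx
      mydata <- cbind(mydata, dummy)
      ```

      Each row of dummy contains exactly one 1, so rowSums(dummy) offers a quick sanity check.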
      

    2. As you become more comfortable with vector operations, you can replace the double loop procedure above with one loop:
      for (j in 1:length(idx)) { 
        dummy[,j] <- as.integer(mydata$state == idx[j])
      }
      
      The single-loop procedure compares each element of idx against the vector mydata$state. Each comparison yields a vector of n TRUE/FALSE values (one per row of mydata), which you can convert to 1's and 0's using as.integer(). Assign the resulting vector to the appropriate column of dummy. Combine the dummy matrix with the data frame as above to complete the procedure.
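
      As a rough check, the single-loop result can be compared with the indicators that R builds itself. The sketch below uses a hypothetical toy data frame and base R's model.matrix() with the intercept removed (the "- 1" in the formula), so that all states get a column; this cross-check is an illustration added here, not part of the procedure above.

      ```r
      # Hypothetical toy data standing in for mydata
      mydata <- data.frame(state = c("NY", "CA", "NY", "TX"),
                           stringsAsFactors = FALSE)

      idx <- sort(unique(mydata$state))
      dummy <- matrix(NA, nrow = nrow(mydata), ncol = length(idx))

      # One vectorized comparison per state fills a whole column
      for (j in 1:length(idx)) {
        dummy[, j] <- as.integer(mydata$state == idx[j])
      }
      colnames(dummy) <- idx

      # Cross-check: full set of indicators built by R, with the
      # factor levels in the same order as idx
      check <- model.matrix(~ factor(state, levels = idx) - 1, data = mydata)
      same <- all(dummy == check)
      ```

      If same is TRUE, the hand-built matrix agrees cell by cell with R's own indicators.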



Gary King 2011-11-29