Selecting the by option in zelig() partitions the data frame and then automatically loops the specified model through each partition. Suppose that mydata is a data frame with variables y, x1, x2, x3, and state, with state a factor variable with 50 unique values. Let's say that you would like to run a weighted regression where each observation is weighted by the inverse of the standard error on x1, estimated for that observation's state. In other words, we need to first estimate the model for each of the 50 states, calculate 1 / SE(x1_i) for each state i, and then assign these weights to each observation in mydata.

    z.out <- zelig(y ~ x1 + x2 + x3, by = "state", data = mydata, model = "ls")

Now z.out is a list of 50 regression outputs.

    se <- array()                             # Initialize the empty data structure.
    for (i in 1:50) {                         # vcov() creates the variance matrix.
      se[i] <- sqrt(vcov(z.out[[i]])[2,2])    # Since we have an intercept, the 2nd
    }                                         # diagonal value corresponds to x1.

    wts <- 1 / se

This vector wts has 50 values that correspond to the 50 sets of state-level regression output in z.out.
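The per-state standard errors can also be collected without an explicit loop. The following is a minimal, self-contained sketch rather than Zelig's own mechanism: it assumes base R's split() and lm() as a stand-in for zelig(..., by = "state"), and it uses simulated data with 5 states instead of 50. The simulated values and the name fits are hypothetical and chosen only for illustration.

```r
# Simulated stand-in for mydata (hypothetical values; 5 states, not 50).
set.seed(1)
n <- 200
mydata <- data.frame(
  x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n),
  state = factor(sample(1:5, n, replace = TRUE))
)
mydata$y <- 1 + 2 * mydata$x1 - mydata$x2 + 0.5 * mydata$x3 + rnorm(n)

# Fit one least-squares model per state. Base R lm() on each partition
# stands in here for zelig(..., by = "state", model = "ls").
fits <- lapply(split(mydata, mydata$state),
               function(d) lm(y ~ x1 + x2 + x3, data = d))

# Standard error of the x1 coefficient in each fit, selected by name
# rather than by its position in the variance matrix.
se <- sapply(fits, function(f) sqrt(vcov(f)["x1", "x1"]))

# One weight per state, named by state level.
wts <- 1 / se
```

Selecting the coefficient by name gives the same value as the [2,2] element when the model has an intercept, but it does not depend on the order of the terms.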

    mydata$w <- NA                            # Initialize the empty variable.
    for (i in 1:50) {
      mydata$w[mydata$state == i] <- wts[i]
    }

We use mydata$state as the index (inside the square brackets) to assign values to mydata$w. Thus, whenever state equals 5 for an observation, the loop assigns the fifth value in the vector wts to the variable w in mydata. If we had 500 observations in mydata, we could use this method to match each of the 500 observations to the appropriate value in wts.
If the states are character strings instead of integers, we can use a slightly more complex version:

    mydata$w <- NA
    idx <- sort(unique(mydata$state))
    for (i in seq_along(idx)) {
      mydata$w[mydata$state == idx[i]] <- wts[i]
    }
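
The same matching can be written as a single lookup in a named vector. This is a minimal sketch on hypothetical toy data (three made-up state labels and made-up weights), assuming, as in the loop version, that wts is ordered to match the sorted unique state labels.

```r
# Hypothetical toy data: state labels stored as character strings.
mydata <- data.frame(state = c("NY", "CA", "NY", "TX", "CA"),
                     stringsAsFactors = FALSE)

# Hypothetical weights, one per state, in sorted order of the labels.
idx <- sort(unique(mydata$state))
wts <- c(0.5, 2.0, 4.0)
names(wts) <- idx

# Index the named vector by each observation's state label.
# as.character() guards against state being stored as a factor,
# which would otherwise index by integer code instead of by label.
mydata$w <- unname(wts[as.character(mydata$state)])
```

Every observation receives the weight whose name matches its state, so no loop or pre-initialized column is needed.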

With the weights assigned, we can run the weighted regression:

    z.wtd <- zelig(y ~ x1 + x2 + x3, weights = w, data = mydata, model = "ls")