
Coding and Recoding Variables

R uses vectors of logical statements to indicate how a variable should be coded or recoded. For example, to create a new variable var3 equal to 1 if var1 < var2 and 0 otherwise:

> var3 <- var1 < var2               # Creates a vector of n T/F observations.
> var3 <- as.integer(var3)          # Replaces the T/F values in `var3' with 
                                    #  1's for TRUE and 0's for FALSE.  
> var3 <- as.integer(var1 < var2)   # Combine the two steps above into one.
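
As a worked illustration of the steps above, here is a minimal sketch using made-up values for var1 and var2 (the data and the intermediate name is_less are assumptions for illustration, not part of the original example):

```r
# Hypothetical example data, not from the original text.
var1 <- c(2, 7, 4, 9)
var2 <- c(5, 3, 4, 10)

# The element-wise comparison yields a logical vector.
is_less <- var1 < var2
print(is_less)    # TRUE FALSE FALSE TRUE

# Convert TRUE/FALSE to 1/0 to obtain the dummy variable.
var3 <- as.integer(is_less)
print(var3)       # 1 0 0 1
```

Note that the third pair (4 and 4) is coded 0 because the comparison is strict.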

In addition to generating a vector of dummy variables, you can also refer to specific values using logical operators defined in Section [*]. For example:

> v1 <- var1 == 5                     # Creates a vector of T/F statements.
> var1[v1] <- 4                       # For every TRUE in `v1', replaces the 
                                      #  value in `var1' with a 4.  
> var1[var1 == 5] <- 4                # The same, in one step.

The index (inside the square brackets) can be created with reference to other variables. For example,

> var1[var2 == var3] <- 1

replaces the i-th value in var1 with a 1 when the i-th value in var2 equals the i-th value in var3. If you use = in place of ==, however, R does not test for equality. Inside the square brackets, var2 = var3 is treated as a named argument whose name is ignored, so the statement is equivalent to var1[var3] <- 1: the values in var3 are used as the index, and the wrong elements of var1 may be replaced, often without any error message.
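
To make the element-by-element behavior of var1[var2 == var3] <- 1 concrete, here is a small sketch with invented values (all three vectors are hypothetical and chosen only for illustration):

```r
# Hypothetical example data, not from the original text.
var1 <- c(10, 20, 30, 40)
var2 <- c(1, 2, 3, 4)
var3 <- c(1, 5, 3, 0)

# var2 == var3 is TRUE at positions 1 and 3 only.
print(var2 == var3)   # TRUE FALSE TRUE FALSE

# Only those positions of var1 are replaced.
var1[var2 == var3] <- 1
print(var1)           # 1 20 1 40
```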

Finally, you may also replace any (character, numerical, or logical) values with special values (most commonly, NA).

> var1[var1 == "don't know"] <- NA   # Replaces all "don't know"'s with NA's.
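
A brief sketch of this recoding on a character vector follows. The survey responses are invented for illustration, and the check with is.na() is an addition to confirm the result, not a step from the original text:

```r
# Hypothetical survey responses, not from the original text.
var1 <- c("agree", "don't know", "disagree", "don't know")

# Replace every "don't know" with NA.
var1[var1 == "don't know"] <- NA
print(var1)

# is.na() flags the recoded entries; summing counts them.
print(sum(is.na(var1)))   # 2
```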

After recoding var1, replace the old data$var1 with the recoded var1: data$var1 <- var1. You may also combine the recoding and replacement into one step. For example:

> data$var1[data$var1 <= 0] <- -1
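
The one-step recoding can be tried on a small data frame. The sketch below assumes a hypothetical data frame named data with invented values. Note that the less-than-or-equal operator in R is written <=.

```r
# Hypothetical data frame, not from the original text.
data <- data.frame(var1 = c(3, 0, -2, 5),
                   var2 = c(1, 1, 2, 2))

# Recode all non-positive values of var1 to -1, directly in the data frame.
data$var1[data$var1 <= 0] <- -1
print(data$var1)   # 3 -1 -1 5
```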

Alternatively, rather than recoding just specific values in variables, you may calculate new variables from existing variables. For example,

> var3 <- var1 + 2 * var2   
> var3 <- log(var1)

After generating the new variables, use the assignment operator <- to insert them into the data frame (for example, data$var3 <- var3).
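
As a rough sketch of computing new variables and storing them in a data frame, consider the following. The data frame data, its values, and the column name log_var1 are assumptions made for illustration:

```r
# Hypothetical data frame, not from the original text.
data <- data.frame(var1 = c(1, 10, 100),
                   var2 = c(2, 4, 6))

# Compute a new variable from existing columns, then insert it.
var3 <- data$var1 + 2 * data$var2
data$var3 <- var3

# A transformation can also be assigned to a new column directly.
data$log_var1 <- log(data$var1)

print(data)
```

Assigning to a column name that does not yet exist adds a new column, while assigning to an existing name overwrites it.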

In addition to generating vectors of dummy variables, you may transform a vector into a matrix of dummy indicator variables. For example, see Section [*] to transform a vector of k unique values (with n observations in the complete vector) into an n × k matrix.
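
One possible way to build such a matrix with base R is sketched below, with one row per observation and one column per unique value. This is not necessarily the method described in the referenced section. The vector x and the names levels_x and dummies are hypothetical.

```r
# Hypothetical categorical vector: 5 observations, 3 unique values.
x <- c("a", "b", "a", "c", "b")

# One column per unique value; each entry is 1 where x matches that value.
levels_x <- sort(unique(x))
dummies <- sapply(levels_x, function(v) as.integer(x == v))

print(dummies)
print(dim(dummies))   # 5 3
```

Each row contains exactly one 1, marking the category to which that observation belongs.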



Gary King 2011-11-29